Open Jiseon623 opened 4 years ago
Hi,
The main error seems to be module: command not found
Are you running this from a computer or a cluster?
Can you please attach the .nextflow.log file, which might have more information.
Thanks
Thank you for your quick response
I'm using a computer. I pasted the contents of the log file and attached the file.
Thanks
<.nextflow.log>
main.nf
[lonely_koch] - revision: 0a592c2713
Apr-14 12:12:07.501 [main] DEBUG nextflow.config.ConfigBuilder - Found
config local: /home/jiseon623/cegwas2-nf/nextflow.config
Apr-14 12:12:07.502 [main] DEBUG nextflow.config.ConfigBuilder - Parsing
config file: /home/jiseon623/cegwas2-nf/nextflow.config
Apr-14 12:12:07.524 [main] DEBUG nextflow.config.ConfigBuilder - Applying
config profile: standard
Apr-14 12:12:08.018 [main] DEBUG nextflow.Session - Session uuid:
f1d88ff2-55e6-41fc-a119-ef561e6da3da
Apr-14 12:12:08.018 [main] DEBUG nextflow.Session - Run name: lonely_koch
Apr-14 12:12:08.018 [main] DEBUG nextflow.Session - Executor pool size: 40
Apr-14 12:12:08.031 [main] DEBUG nextflow.cli.CmdRun -
Version: 19.07.0 build 5106
Created: 27-07-2019 13:22 UTC (22:22 KDT)
System: Linux 4.15.0-91-generic
Runtime: Groovy 2.5.6 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_241-b07
Encoding: UTF-8 (UTF-8)
Process: 14928@nematode [127.0.1.1]
CPUs: 40 - Mem: 188.8 GB (781.7 MB) - Swap: 466.7 GB (464.8 GB)
Apr-14 12:12:08.151 [main] DEBUG nextflow.Session - Work-dir:
/home/jiseon623/cegwas2-nf/work [ext2/ext3]
Apr-14 12:12:08.321 [main] DEBUG nextflow.Session - Session start invoked
Apr-14 12:12:08.787 [main] DEBUG nextflow.script.ScriptRunner - > Launching
execution
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - Phenotype Directory = null
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - VCF = bin/WI.20180527.impute.vcf.gz
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - CeNDR Release = 20180527
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - P3D = true
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - Significance Threshold = BF
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Max AF for Burden Mapping = 0.05
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Min Strains with Variant for Burden = 2
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Significance Threshold = BF
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Gene File = bin/gene_ref_flat.Rda
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Result Directory = Analysis_Results-20200414
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Eigen Memory allocation = 100 GB
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.991 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:08.992 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:08.997 [main] DEBUG nextflow.executor.Executor - [warm up] executor > local
Apr-14 12:12:09.004 [main] DEBUG n.processor.LocalPollingMonitor - Creating local task monitor for executor 'local' > cpus=40; memory=188.8 GB; capacity=40; pollInterval=100ms; dumpInterval=5m
Apr-14 12:12:09.043 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > fix_strain_names_bulk -- maxForks: 40
Apr-14 12:12:09.082 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.082 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.083 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > vcf_to_geno_matrix -- maxForks: 40
Apr-14 12:12:09.094 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.094 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.100 [main] DEBUG nextflow.processor.TaskProcessor - Creating combiner operator for each param(s) at index(es): [1]
Apr-14 12:12:09.108 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > chrom_eigen_variants -- maxForks: 40
Apr-14 12:12:09.123 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.123 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.125 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > collect_eigen_variants -- maxForks: 40
Apr-14 12:12:09.147 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.147 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.148 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > rrblup_maps -- maxForks: 40
Apr-14 12:12:09.157 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.157 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.161 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > summarize_maps -- maxForks: 40
Apr-14 12:12:09.198 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.198 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.199 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > prep_ld_files -- maxForks: 40
Apr-14 12:12:09.205 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.205 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.207 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > rrblup_fine_maps -- maxForks: 40
Apr-14 12:12:09.211 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.211 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.212 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > concatenate_LD_per_trait -- maxForks: 40
Apr-14 12:12:09.218 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.218 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.219 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > plot_genes -- maxForks: 40
Apr-14 12:12:09.225 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.225 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.226 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > burden_mapping -- maxForks: 40
Apr-14 12:12:09.229 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.229 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.230 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > plot_burden -- maxForks: 40
Apr-14 12:12:09.232 [main] DEBUG nextflow.script.ScriptRunner - > Await termination
Apr-14 12:12:09.232 [main] DEBUG nextflow.Session - Session await
Apr-14 12:12:09.274 [Task submitter] DEBUG nextflow.executor.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Apr-14 12:12:09.279 [Task submitter] INFO nextflow.Session - [09/194f45] Submitted process > fix_strain_names_bulk (BULK TRAIT)
Apr-14 12:12:09.314 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 1; name: fix_strain_names_bulk (BULK TRAIT); status: COMPLETED; exit: 127; error: -; workDir: /hom$
Apr-14 12:12:09.321 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegw$
Apr-14 12:12:09.323 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump error of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegwa$
Apr-14 12:12:09.339 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'fix_strain_names_bulk (BULK TRAIT)'
Caused by:
Process fix_strain_names_bulk (BULK TRAIT) terminated with an error exit status (127)
Command executed:
Rscript --vanilla `which Fix_Isotype_names_bulk.R` data.tsv fix
Command exit status: 127
Command output: (empty)
Command wrapper: .command.run: line 202: module: command not found
Work dir: /home/jiseon623/cegwas2-nf/work/09/194f45d74a0f318a891e5615bd3045
Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
Apr-14 12:12:09.351 [main] DEBUG nextflow.Session - Session await > all
process finished
Apr-14 12:12:09.355 [Task monitor] DEBUG nextflow.Session - Session aborted
-- Cause: Process fix_strain_names_bulk (BULK TRAIT)
terminated with an
error exit status (127)
Apr-14 12:12:09.373 [Task monitor] DEBUG nextflow.processor.TaskRun -
Unable to dump error of process 'fix_strain_names_bulk (BULK TRAIT)' --
Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegwa$
Apr-14 12:12:09.373 [Task monitor] DEBUG nextflow.processor.TaskRun -
Unable to dump output of process 'fix_strain_names_bulk (BULK TRAIT)' --
Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegw$
Apr-14 12:12:09.374 [main] DEBUG nextflow.Session - Session await > all
barriers passed
Apr-14 12:12:09.375 [main] DEBUG nextflow.processor.TaskRun - Unable to
dump error of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause:
java.nio.file.NoSuchFileException: /home/jiseon623/cegwas2-nf/wo$
Apr-14 12:12:09.375 [main] DEBUG nextflow.processor.TaskRun - Unable to
dump output of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause:
java.nio.file.NoSuchFileException: /home/jiseon623/cegwas2-nf/w$
Apr-14 12:12:09.383 [main] ERROR nextflow.script.WorkflowMetadata - Failed
to invoke workflow.onComplete
event handler
java.io.FileNotFoundException: Analysis_Results-20200414/log.txt (No such
file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.
{"fix_strain_names_bulk":{"cpu":null,"mem":null,"vmem":null,"time":{"mean":6,"min":6,"q1":6,"q2":6,"q3":6,"max":6,"minLabel":"fix_strain_names_bulk (BULK TRAIT)","maxLabel":"fix_strain_names_bulk (BULK T$
Apr-14 12:12:10.214 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Apr-14 12:12:10.228 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye
I think this might be a platform issue, because we developed this pipeline on a Linux cluster. It looks like you are running it on a standalone Linux computer, which we have not tested the pipeline on, and I am not sure how much a personal Linux machine differs from the cluster. One likely difference: the module command that the wrapper script could not find is normally provided by an environment-modules system on clusters and is often absent on personal machines.
Here are some things I can recommend:
1) Have you verified that Nextflow works on your system? Nextflow has a post-install test that you can run to verify that everything is working. See here
2) If Nextflow is working on your machine, go to the directory where the pipeline failed: /home/jiseon623/cegwas2-nf/work/09/194f45d74a0f318a891e5615bd3045
and try to run the command outside of Nextflow, by running the following in that directory:
Rscript --vanilla path/to/this/file/Fix_Isotype_names_bulk.R data.tsv fix
3) Finally, I think a former lab member (tagging him here: @faithman) set up a Docker container that might work more robustly on your personal machine. Here is the link
Let me know what happens when you try these things, because it will help us make the pipeline better.
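Before re-running the pipeline, it can help to confirm which external commands the task scripts expect are actually visible to the shell. The following is a minimal sketch, not part of cegwas2-nf: the helper names have_cmd and check_tools are hypothetical, and the tool list is an assumption based only on the commands that appear in this thread.

```shell
# Report whether each named tool can be found by the current shell.
have_cmd() {
    # `type` also recognises shell functions, which is how `module`
    # is usually provided on clusters.
    type "$1" >/dev/null 2>&1
}

check_tools() {
    # Prints one line per tool; returns 1 if any tool is missing.
    missing=0
    for tool in "$@"; do
        if have_cmd "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tool names taken from the error messages and commands in this thread.
check_tools module Rscript bc rvtest || echo "at least one tool is missing"
```

A tool reported as missing here would be expected to produce an exit status of 127 ("command not found") inside a task, which matches the failures reported in this issue.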
Thank you for your advice
1. I ran tutorial.nf and it worked well.
2. I ran the command you wrote in the directory. The output is pasted below.
jiseon623@nematode:~/cegwas2-nf/work/e2/f60714f6f3409547771fd56372d8f6$ Rscript --vanilla /home/jiseon623/cegwas2-nf/bin/Fix_Isotype_names_bulk.R ../../../data.tsv fix
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.3.0 ✔ purrr 0.3.3
✔ tibble 3.0.0 ✔ dplyr 0.8.5
✔ tidyr 1.0.2 ✔ stringr 1.4.0
✔ readr 1.3.1 ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
Attaching package: ‘data.table’
The following objects are masked from ‘package:dplyr’:
between, first, last
The following object is masked from ‘package:purrr’:
transpose
downloaded 311.0 MB
Warning message: Grouping rowwise data frame strips rowwise nature
3. I'm not root, so I can't use Docker. Instead, I ran main.nf without the nextflow.config file, and then with a modified config file.
3-1 without nextflow.config
N E X T F L O W ~ version 19.07.0
Launching main.nf [nasty_yalow] - revision: 0a592c2713
Phenotype Directory = null
VCF = bin/WI.20180527.impute.vcf.gz
CeNDR Release = 20180527
P3D = true
Significance Threshold = BF
Max AF for Burden Mapping = 0.05
Min Strains with Variant for Burden = 2
Significance Threshold = BF
Gene File = bin/gene_ref_flat.Rda
Result Directory = Analysis_Results-20200420
Eigen Memory allocation = 100 GB
executor > local (16)
[16/224a95] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[fd/2dbaf6] process > vcf_to_geno_matrix (1) [100%] 1 of 1 ✔
[4c/02c8c7] process > chrom_eigen_variants (IV) [100%] 6 of 6 ✔
[fc/2f34be] process > collect_eigen_variants [100%] 1 of 1 ✔
[68/4a79d2] process > rrblup_maps (m2) [100%] 2 of 2 ✔
[64/782952] process > summarize_maps [ 0%] 0 of 1
[- ] process > prep_ld_files -
[- ] process > rrblup_fine_maps -
[- ] process > concatenate_LD_per_trait -
[- ] process > plot_genes -
[9d/aaf905] process > burden_mapping (m2) [100%] 2 of 2 ✔
[e7/c51ddb] process > plot_burden (m2) [100%] 2 of 2 ✔
Error executing process > 'summarize_maps'
Caused by:
Process summarize_maps terminated with an error exit status (1)
Command executed:
Rscript --vanilla `which Summarize_Mappings.R`
cat *processed_mapping.tsv |\
awk '$0 !~ "\tNA\t" {print}' |\
awk '!seen[$2,$5,$12,$13,$14]++' |\
awk 'NR>1{print $5, $2, $12, $13, $14}' OFS="\t" > QTL_peaks.tsv
sig_maps=`wc -l QTL_peaks.tsv | cut -f1 -d' '`
if [ $sig_maps = 0 ]; then
max_log10=`cat *processed_mapping.tsv | awk 'BEGIN {max = 0} {if ($4>max && $4!= "log10p") max=$4} END {print max}'`
echo "NO TRAITS HAD SIGNIFICANT MAPPINGS - MAXIMUM -log10p IS $max_log10 - CONSIDER SETTING BF THRESHOLD BELOW THIS VALUE"
exit
fi
Command exit status: 1
Command output: [1] "m2_processed_mapping.tsv" "mt_processed_mapping.tsv"
Command error:
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.3.0 ✔ purrr 0.3.3
✔ tibble 3.0.0 ✔ dplyr 0.8.5
✔ tidyr 1.0.2 ✔ stringr 1.4.0
✔ readr 1.3.1 ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
Error: Assigned data c(...) must be compatible with row subscript 1.
✖ 1 row must be assigned.
✖ Assigned data has 18 rows.
ℹ Only vectors of size 1 are recycled.
Backtrace:
█
[<-(...)
[<-.tbl_df(...)
Work dir: /home/jiseon623/test/work/64/7829520b2799e5432b5b4eee571481
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
Pipeline execution summary
---------------------------
Completed at: Mon Apr 20 18:02:54 KST 2020
Duration : 7m 40s
Success : false
workDir : /home/jiseon623/test/work
exit status : 1
Error report: Error executing process > 'summarize_maps'
executor > local (16)
[16/224a95] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[fd/2dbaf6] process > vcf_to_geno_matrix (1) [100%] 1 of 1 ✔
[4c/02c8c7] process > chrom_eigen_variants (IV) [100%] 6 of 6 ✔
[fc/2f34be] process > collect_eigen_variants [100%] 1 of 1 ✔
[68/4a79d2] process > rrblup_maps (m2) [100%] 2 of 2 ✔
[64/782952] process > summarize_maps [100%] 1 of 1, failed: 1 ✘
[- ] process > prep_ld_files -
[- ] process > rrblup_fine_maps -
[- ] process > concatenate_LD_per_trait -
[- ] process > plot_genes -
[9d/aaf905] process > burden_mapping (m2) [100%] 2 of 2 ✔
[e7/c51ddb] process > plot_burden (m2) [100%] 2 of 2 ✔
WARN: Access to undefined parameter email -- Initialise it to a default value eg. params.email = some_value
3-2 with modified nextflow.config
file contents:
process.container = 'docker://faithman/cegwas2:latest'
singularity.enabled = true
singularity.cacheDir = "$PWD"
Phenotype Directory = null
VCF = bin/WI.20180527.impute.vcf.gz
CeNDR Release = 20180527
P3D = true
Significance Threshold = BF
Max AF for Burden Mapping = 0.05
Min Strains with Variant for Burden = 2
Significance Threshold = BF
Gene File = bin/gene_ref_flat.Rda
Result Directory = Analysis_Results-20200420
Eigen Memory allocation = 100 GB
executor > local (4)
[94/d8122b] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[de/72a90e] process > vcf_to_geno_matrix (1) [ 0%] 0 of 1
[- ] process > chrom_eigen_variants -
[- ] process > collect_eigen_variants -
[- ] process > rrblup_maps -
[- ] process > summarize_maps -
[- ] process > prep_ld_files -
[- ] process > rrblup_fine_maps -
[- ] process > concatenate_LD_per_trait -
[- ] process > plot_genes -
[80/846eaa] process > burden_mapping (mt) [ 0%] 0 of 2
[- ] process > plot_burden -
Error executing process > 'burden_mapping (m2)'
Caused by:
Process burden_mapping (m2) terminated with an error exit status (127)
Command executed:
Rscript --vanilla `which makeped.R` pr_m2.tsv
n_strains=`wc -l pr_m2.tsv | cut -f1 -d" "`
min_af=`bc -l <<< "2/($n_strains-1)"`
rvtest \
--pheno m2.ped \
--out m2 \
--inVcf WI.20180527.impute.vcf.gz \
--freqUpper 0.05 \
--freqLower $min_af \
--geneFile refFlat.ws245.txt \
--vt price \
--kernel skat
Command exit status: 127
Command output: (empty)
Command error:
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.2.1 ✔ purrr 0.3.2
✔ tibble 2.1.3 ✔ dplyr 0.8.3
✔ tidyr 0.8.3 ✔ stringr 1.4.0
✔ readr 1.3.1 ✔ forcats 0.4.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
Parsed with column specification:
cols(
strain = col_character(),
m2 = col_double()
)
.command.sh: line 5: bc: command not found
Work dir: /home/jiseon623/test2/work/18/c2b6c23d8b8bad167383f02099c71e
Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
Pipeline execution summary
---------------------------
Completed at: Mon Apr 20 15:29:32 KST 2020
Duration : 8.4s
Success : false
workDir : /home/jiseon623/test2/work
exit status : 127
Error report: Error executing process > 'burden_mapping (m2)'
executor > local (4)
[94/d8122b] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[de/72a90e] process > vcf_to_geno_matrix (1) [100%] 1 of 1, failed: 1
[- ] process > chrom_eigen_variants -
[- ] process > collect_eigen_variants -
[- ] process > rrblup_maps -
[- ] process > summarize_maps -
[- ] process > prep_ld_files -
[- ] process > rrblup_fine_maps -
[- ] process > concatenate_LD_per_trait -
[- ] process > plot_genes -
[80/846eaa] process > burden_mapping (mt) [100%] 2 of 2, failed: 2
[- ] process > plot_burden -
WARN: Access to undefined parameter email -- Initialise it to a default value eg. params.email = some_value
WARN: Killing pending tasks (2)
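The bc: command not found error above comes from the line that computes min_af. As a possible workaround when bc is not available, the same arithmetic, 2/(n_strains-1), can be done with awk. This is only a sketch: the function name min_allele_freq is hypothetical, it is not how the pipeline itself is written, and whether rvtest accepts the six-decimal output format is an assumption that has not been checked here.

```shell
# Compute 2/(n-1) with awk instead of bc.
min_allele_freq() {
    # $1: line count of the phenotype file, as produced by `wc -l` in the task script
    awk -v n="$1" 'BEGIN {
        if (n > 1) printf "%.6f\n", 2 / (n - 1)
        else exit 1   # avoid dividing by zero for empty or one-line files
    }'
}

# Example with a made-up count of 201 lines.
min_af=$(min_allele_freq 201)
echo "$min_af"   # prints 0.010000
```

Installing bc on the host, or using a container image that ships it, would avoid changing the task script at all.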
Hi again, and sorry for the delay.
It looks like something has changed since your original post, as you have made it further through the pipeline, which is indicated by:
[16/224a95] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[fd/2dbaf6] process > vcf_to_geno_matrix (1) [100%] 1 of 1 ✔
[4c/02c8c7] process > chrom_eigen_variants (IV) [100%] 6 of 6 ✔
[fc/2f34be] process > collect_eigen_variants [100%] 1 of 1 ✔
[68/4a79d2] process > rrblup_maps (m2) [100%] 2 of 2 ✔
[64/782952] process > summarize_maps [100%] 1 of 1, failed: 1 ✘
[- ] process > prep_ld_files -
[- ] process > rrblup_fine_maps -
[- ] process > concatenate_LD_per_trait -
[- ] process > plot_genes -
[9d/aaf905] process > burden_mapping (m2) [100%] 2 of 2 ✔
[e7/c51ddb] process > plot_burden (m2) [100%] 2 of 2 ✔
I am curious to know what changed.
Regarding the current error, I am wondering whether a significant QTL was identified by the mapping pipeline. I can't recall whether I included a "terminate pipeline if no significant QTL were identified" step in the script. If I did not, that might explain the current issue.
A couple of things to look for:
Check the plots that are output in the Mappings/Plots folder. Are any markers above the horizontal significance threshold line? Are there any pxgplot.pdf files, which would suggest a significant QTL? If the answer to these questions is no, then you can try lowering the significance threshold using the --sthresh=EIGEN flag when executing the Nextflow pipeline. This threshold is usually between 4 and 5 on the y-axis of the Manhattan plot, so if some markers are above that, you should identify marginally significant QTL using the --sthresh=EIGEN flag.
If the answer is yes, a significant QTL was identified, but the pipeline still failed, then I would suggest entering the /home/jiseon623/test/work/64/7829520b2799e5432b5b4eee571481 directory where the pipeline failed and executing the commands as you did last time, to see whether they work on your current machine.
Please let me know how this troubleshooting goes, as we are gearing up to revamp this entire workflow.
Thanks for your patience and your help!
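The pxgplot.pdf check described above can be done from the command line. Here is a minimal sketch, assuming the plots live under a Mappings/Plots folder inside the result directory printed by the pipeline. The exact path below is an assumption to adjust for your run, and count_pxgplots is a hypothetical helper name.

```shell
# Count pxgplot.pdf files under a directory; their presence suggests
# that at least one significant QTL was detected.
count_pxgplots() {
    # $1: directory to search (recursively); prints 0 if it does not exist
    if [ ! -d "$1" ]; then
        echo 0
        return 0
    fi
    find "$1" -type f -name '*pxgplot.pdf' | wc -l | tr -d ' '
}

# Assumed location; replace with the result directory of your own run.
plots_dir="Analysis_Results-20200420/Mappings/Plots"
n=$(count_pxgplots "$plots_dir")
if [ "$n" -eq 0 ]; then
    echo "no pxgplot.pdf files found in $plots_dir"
else
    echo "$n pxgplot.pdf file(s) found in $plots_dir"
fi
```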
Hi
Thank you for your help and suggestions
I thought nextflow.config file was causing the failure of the first process, so I tried running main.nf in a directory that does not contain nextflow.config file. I'm sorry that the text I wrote was too long to recognize that I ran it without the config file.
In Mapping/Plots, I have files named "(trait)_manplot.pdf", but no pxgplot.pdf file. In the manplot files, several dots indicated by red are above the threshold.
When I ran "Rscript --vanilla /home/jiseon623/test/bin/Summarize_Mappings.R" in "~/test/work/64/7829520b2799e5432b5b4eee571481", the result was:
jiseon623@nematode:~/test/work/64/7829520b2799e5432b5b4eee571481$ Rscript --vanilla /home/jiseon623/test/bin/Summarize_Mappings.R
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.3.0 ✔ purrr 0.3.3
✔ tibble 3.0.0 ✔ dplyr 0.8.5
✔ tidyr 1.0.2 ✔ stringr 1.4.0
✔ readr 1.3.1 ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
[1] "m2_processed_mapping.tsv" "mt_processed_mapping.tsv"
Error: Assigned data c(...) must be compatible with row subscript 1.
✖ 1 row must be assigned.
✖ Assigned data has 18 rows.
ℹ Only vectors of size 1 are recycled.
Backtrace:
█
[<-(...)
[<-.tbl_df(...)
Thanks again for your help and suggestions
Interesting...
At this point it seems like the issue is at a post-mapping processing step.
It will be difficult for me to offer more suggestions without doing some troubleshooting of my own.
Usually, no pxgplot.pdf output means that no significant QTL were detected. It would make sense for the pipeline to crash at the summarize mapping step if there are no QTL above the BF threshold. But you mentioned that there are red dots in the Manhattan plot, so I am confused about that. If QTL above the threshold are detected, you should also see blue regions surrounding each QTL, which correspond to genomic regions that are processed further. If you do not see these in your Manhattan plot, try running the pipeline again with --sthresh=EIGEN, which will lower the threshold for post-mapping QTL processing. Note that you can also add the -resume flag so you don't have to rerun the steps that already completed.
If this is not the issue, please let me know what files are in the summarize mapping directory, and whether you would be willing to share your data so I can see what the issue is.
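For reference, the rerun suggested here combines the original invocation from this thread with --sthresh=EIGEN and -resume. The sketch below only prints the command so it can be reviewed before running. The function name rerun_cmd is hypothetical, and the trait file and VCF values are copied from the original post as example values, so substitute your own.

```shell
# Print (do not run) the rerun command with a different threshold and -resume.
rerun_cmd() {
    # $1: trait file, $2: significance threshold setting (e.g. BF or EIGEN)
    printf 'NXF_VER=19.07.0 nextflow main.nf --traitfile=%s --vcf=bin/WI.20180527.impute.vcf.gz --p3d=TRUE --sthresh=%s -resume\n' "$1" "$2"
}

rerun_cmd test_traits/PC1.tsv EIGEN
```

Run the printed command from the directory that holds the earlier work folder, since -resume relies on the cached results of the previous run.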
Hello
When I ran the pipeline with --sthresh=BF, I couldn't see any blue regions in the manplot files. So I lowered the threshold to 3, and then I could see pxgplot files and blue regions surrounding the red dots in the manplot files. However, the summarize_maps process failed again with the same error as before. The directory where the pipeline failed contained only the trait_processed_mapping.tsv files, and I ran the commands below in that directory. (I'm not sure whether "summarize mapping directory" means this directory.)
Rscript --vanilla /home/jiseon623/cegwas2-nf/bin/Summarize_Mappings.R
cat *processed_mapping.tsv | awk '$0 !~ "\tNA\t" {print}' | awk '!seen[$2,$5,$12,$13,$14]++' | awk 'NR>1{print $5, $2, $12, $13, $14}' OFS="\t" > QTL_peaks.tsv
sig_maps=`wc -l QTL_peaks.tsv | cut -f1 -d' '`
if [ "$sig_maps" = 0 ]; then
max_log10=`cat *processed_mapping.tsv | awk 'BEGIN {max = 0} {if ($4>max && $4!= "log10p") max=$4} END {print max}'`
echo "NO TRAITS HAD SIGNIFICANT MAPPINGS - MAXIMUM -log10p IS $max_log10
Then, a QTL_peaks.tsv file containing information(start, peak, and end position) of QTL peaks was created in the directory.
Sorry, I didn't understand exactly what "your data" means.
Thanks
Hello, I tried using cegwas2-nf and it doesn't even go through the first process. I tried to find the cause of this issue, but couldn't find it.
I attach the output when running the command
NXF_VER=19.07.0 nextflow main.nf --traitfile=test_traits/PC1.tsv --vcf=bin/WI.20180527.impute.vcf.gz --p3d=TRUE --sthresh=BF:
Thank you in advance for your help
N E X T F L O W ~ version 19.07.0
Launching main.nf [confident_northcutt] - revision: 0a592c2713
C. elegans GWAS pipeline
Phenotype Directory = null
VCF = bin/WI.20180527.impute.vcf.gz
CeNDR Release = 20180527
P3D = true
Significance Threshold = BF
Max AF for Burden Mapping = 0.05
Min Strains with Variant for Burden = 2
Significance Threshold = BF
Gene File = bin/gene_ref_flat.Rda
Result Directory = Analysis_Results-20200413
Eigen Memory allocation = 100 GB
executor > local (1)
[e4/6cd3a3] process > fix_strain_names_bulk (BULK TRAIT) [ 0%] 0 of 1
[- ] process > vcf_to_geno_matrix -
[- ] process > chrom_eigen_variants -
[- ] process > collect_eigen_variants -
[- ] process > rrblup_maps -
[- ] process > summarize_maps -
[- ] process > prep_ld_files -
[- ] process > rrblup_fine_maps -
[- ] process > concatenate_LD_per_trait -
[- ] process > plot_genes -
[- ] process > burden_mapping -
[- ] process > plot_burden -
Error executing process > 'fix_strain_names_bulk (BULK TRAIT)'
Caused by:
Process fix_strain_names_bulk (BULK TRAIT) terminated with an error exit status (127)
Command executed:
Rscript --vanilla `which Fix_Isotype_names_bulk.R` PC1.tsv fix
Command exit status: 127
Command output: (empty)
Command wrapper: .command.run: line 202: module: command not found
Work dir: /home/jiseon623/cegwas2-nf/work/e4/6cd3a3c66367217e09020d751dcfdb
Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run
Failed to invoke workflow.onComplete event handler -- Check script 'main.nf' at line: 773 or see '.nextflow.log' file for more details