AndersenLab / cegwas2-nf

GWA mapping with C. elegans
MIT License

error exit status (127) #24

Open Jiseon623 opened 4 years ago

Jiseon623 commented 4 years ago

Hello, I tried using cegwas2-nf, but it fails at the very first process. I tried to find the cause of this issue but couldn't.

I have attached the output from running the command NXF_VER=19.07.0 nextflow main.nf --traitfile=test_traits/PC1.tsv --vcf=bin/WI.20180527.impute.vcf.gz --p3d=TRUE --sthresh=BF:

Thank you in advance for your help.

N E X T F L O W ~ version 19.07.0
Launching main.nf [confident_northcutt] - revision: 0a592c2713

C. elegans GWAS pipeline

Phenotype Directory = null
VCF = bin/WI.20180527.impute.vcf.gz
CeNDR Release = 20180527
P3D = true
Significance Threshold = BF
Max AF for Burden Mapping = 0.05
Min Strains with Variant for Burden = 2
Significance Threshold = BF
Gene File = bin/gene_ref_flat.Rda
Result Directory = Analysis_Results-20200413
Eigen Memory allocation = 100 GB

executor > local (1)
[e4/6cd3a3] process > fix_strain_names_bulk (BULK TRAIT) [  0%] 0 of 1
[-        ] process > vcf_to_geno_matrix                 -
[-        ] process > chrom_eigen_variants               -
[-        ] process > collect_eigen_variants             -
[-        ] process > rrblup_maps                        -
[-        ] process > summarize_maps                     -
[-        ] process > prep_ld_files                      -
[-        ] process > rrblup_fine_maps                   -
[-        ] process > concatenate_LD_per_trait           -
[-        ] process > plot_genes                         -
[-        ] process > burden_mapping                     -
[-        ] process > plot_burden                        -
Error executing process > 'fix_strain_names_bulk (BULK TRAIT)'

Caused by: Process fix_strain_names_bulk (BULK TRAIT) terminated with an error exit status (127)

Command executed:

Rscript --vanilla `which Fix_Isotype_names_bulk.R` PC1.tsv fix

Command exit status: 127

Command output: (empty)

Command wrapper: .command.run: line 202: module: command not found

Work dir: /home/jiseon623/cegwas2-nf/work/e4/6cd3a3c66367217e09020d751dcfdb

executor > local (1)
[e4/6cd3a3] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1, failed: 1 ✘
[-        ] process > vcf_to_geno_matrix                 -
[-        ] process > chrom_eigen_variants               -
[-        ] process > collect_eigen_variants             -
[-        ] process > rrblup_maps                        -
[-        ] process > summarize_maps                     -
[-        ] process > prep_ld_files                      -
[-        ] process > rrblup_fine_maps                   -
[-        ] process > concatenate_LD_per_trait           -
[-        ] process > plot_genes                         -
[-        ] process > burden_mapping                     -
[-        ] process > plot_burden                        -
Error executing process > 'fix_strain_names_bulk (BULK TRAIT)'

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

Failed to invoke workflow.onComplete event handler

-- Check script 'main.nf' at line: 773 or see '.nextflow.log' file for more details

Thatguy027 commented 4 years ago

Hi,

The main error seems to be module: command not found
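For context, exit status 127 is what bash reports when it cannot find a command, which matches the `module: command not found` line in the wrapper output. The `module` command normally comes from an Environment Modules or Lmod installation, which clusters usually provide and standalone machines often lack. A minimal sketch for checking both points on the affected machine (the probe command name is a made-up placeholder, not part of the pipeline):

```shell
#!/usr/bin/env bash
# Run a command that does not exist and capture its exit status.
# 'no_such_command_for_demo' is a hypothetical name used only for this probe.
probe_status=0
bash -c 'no_such_command_for_demo' 2>/dev/null || probe_status=$?
echo "exit status for a missing command: $probe_status"

# Report whether 'module' (Environment Modules / Lmod) is visible to this shell.
if type module >/dev/null 2>&1; then
    module_available=yes
else
    module_available=no
fi
echo "module available: $module_available"
```

If the second line reports `no`, any `module load` call in the task wrapper would be expected to fail in the same way.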

Are you running this on a personal computer or on a cluster?

Could you please attach the .nextflow.log file? It might have more information.

Thanks

Jiseon623 commented 4 years ago

Thank you for your quick response

I'm using a single computer, not a cluster. I have pasted the contents of the log file below and attached the file.

Thanks

<.nextflow.log>

Apr-14 12:12:07.307 [main] DEBUG nextflow.cli.Launcher - $> nextflow run main.nf --traitfile=data.tsv --vcf=bin/WI.20180527.impute.vcf.gz --p3d=TRUE --sthresh=BF
Apr-14 12:12:07.461 [main] INFO nextflow.cli.CmdRun - N E X T F L O W ~ version 19.07.0
Apr-14 12:12:07.479 [main] INFO nextflow.cli.CmdRun - Launching main.nf [lonely_koch] - revision: 0a592c2713
Apr-14 12:12:07.501 [main] DEBUG nextflow.config.ConfigBuilder - Found config local: /home/jiseon623/cegwas2-nf/nextflow.config
Apr-14 12:12:07.502 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /home/jiseon623/cegwas2-nf/nextflow.config
Apr-14 12:12:07.524 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: standard
Apr-14 12:12:08.018 [main] DEBUG nextflow.Session - Session uuid: f1d88ff2-55e6-41fc-a119-ef561e6da3da
Apr-14 12:12:08.018 [main] DEBUG nextflow.Session - Run name: lonely_koch
Apr-14 12:12:08.018 [main] DEBUG nextflow.Session - Executor pool size: 40
Apr-14 12:12:08.031 [main] DEBUG nextflow.cli.CmdRun -
  Version: 19.07.0 build 5106
  Created: 27-07-2019 13:22 UTC (22:22 KDT)
  System: Linux 4.15.0-91-generic
  Runtime: Groovy 2.5.6 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_241-b07
  Encoding: UTF-8 (UTF-8)
  Process: 14928@nematode [127.0.1.1]
  CPUs: 40 - Mem: 188.8 GB (781.7 MB) - Swap: 466.7 GB (464.8 GB)
Apr-14 12:12:08.151 [main] DEBUG nextflow.Session - Work-dir: /home/jiseon623/cegwas2-nf/work [ext2/ext3]
Apr-14 12:12:08.321 [main] DEBUG nextflow.Session - Session start invoked
Apr-14 12:12:08.787 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow - C. elegans GWAS pipeline
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.867 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - Phenotype Directory = null
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - VCF = bin/WI.20180527.impute.vcf.gz
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - CeNDR Release = 20180527
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - P3D = true
Apr-14 12:12:08.868 [main] INFO nextflow.Nextflow - Significance Threshold = BF
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Max AF for Burden Mapping = 0.05
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Min Strains with Variant for Burden = 2
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Significance Threshold = BF
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Gene File = bin/gene_ref_flat.Rda
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Result Directory = Analysis_Results-20200414
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow - Eigen Memory allocation = 100 GB
Apr-14 12:12:08.869 [main] INFO nextflow.Nextflow -
Apr-14 12:12:08.991 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:08.992 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:08.997 [main] DEBUG nextflow.executor.Executor - [warm up] executor > local
Apr-14 12:12:09.004 [main] DEBUG n.processor.LocalPollingMonitor - Creating local task monitor for executor 'local' > cpus=40; memory=188.8 GB; capacity=40; pollInterval=100ms; dumpInterval=5m
Apr-14 12:12:09.043 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > fix_strain_names_bulk -- maxForks: 40
Apr-14 12:12:09.082 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.082 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.083 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > vcf_to_geno_matrix -- maxForks: 40
Apr-14 12:12:09.094 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.094 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.100 [main] DEBUG nextflow.processor.TaskProcessor - Creating combiner operator for each param(s) at index(es): [1]
Apr-14 12:12:09.108 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > chrom_eigen_variants -- maxForks: 40
Apr-14 12:12:09.123 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.123 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.125 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > collect_eigen_variants -- maxForks: 40
Apr-14 12:12:09.147 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.147 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.148 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > rrblup_maps -- maxForks: 40
Apr-14 12:12:09.157 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.157 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.161 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > summarize_maps -- maxForks: 40
Apr-14 12:12:09.198 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.198 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.199 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > prep_ld_files -- maxForks: 40
Apr-14 12:12:09.205 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.205 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.207 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > rrblup_fine_maps -- maxForks: 40
Apr-14 12:12:09.211 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.211 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.212 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > concatenate_LD_per_trait -- maxForks: 40
Apr-14 12:12:09.218 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.218 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.219 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > plot_genes -- maxForks: 40
Apr-14 12:12:09.225 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-14 12:12:09.225 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.226 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > burden_mapping -- maxForks: 40
Apr-14 12:12:09.229 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: local
Apr-14 12:12:09.229 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-14 12:12:09.230 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > plot_burden -- maxForks: 40
Apr-14 12:12:09.232 [main] DEBUG nextflow.script.ScriptRunner - > Await termination
Apr-14 12:12:09.232 [main] DEBUG nextflow.Session - Session await
Apr-14 12:12:09.274 [Task submitter] DEBUG nextflow.executor.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Apr-14 12:12:09.279 [Task submitter] INFO nextflow.Session - [09/194f45] Submitted process > fix_strain_names_bulk (BULK TRAIT)
Apr-14 12:12:09.314 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 1; name: fix_strain_names_bulk (BULK TRAIT); status: COMPLETED; exit: 127; error: -; workDir: /hom$
Apr-14 12:12:09.321 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegw$
Apr-14 12:12:09.323 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump error of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegwa$
Apr-14 12:12:09.339 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'fix_strain_names_bulk (BULK TRAIT)'

Caused by: Process fix_strain_names_bulk (BULK TRAIT) terminated with an error exit status (127)

Command executed:

Rscript --vanilla `which Fix_Isotype_names_bulk.R` data.tsv fix

Command exit status: 127

Command output: (empty)

Command wrapper: .command.run: line 202: module: command not found

Work dir: /home/jiseon623/cegwas2-nf/work/09/194f45d74a0f318a891e5615bd3045

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
Apr-14 12:12:09.351 [main] DEBUG nextflow.Session - Session await > all process finished
Apr-14 12:12:09.355 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Process fix_strain_names_bulk (BULK TRAIT) terminated with an error exit status (127)
Apr-14 12:12:09.373 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump error of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegwa$
Apr-14 12:12:09.373 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegw$
Apr-14 12:12:09.374 [main] DEBUG nextflow.Session - Session await > all barriers passed
Apr-14 12:12:09.375 [main] DEBUG nextflow.processor.TaskRun - Unable to dump error of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegwas2-nf/wo$
Apr-14 12:12:09.375 [main] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'fix_strain_names_bulk (BULK TRAIT)' -- Cause: java.nio.file.NoSuchFileException: /home/jiseon623/cegwas2-nf/w$
Apr-14 12:12:09.383 [main] ERROR nextflow.script.WorkflowMetadata - Failed to invoke workflow.onComplete event handler
java.io.FileNotFoundException: Analysis_Results-20200414/log.txt (No such file or directory)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
	at java.io.FileWriter.<init>(FileWriter.java:90)
	at org.codehaus.groovy.runtime.ResourceGroovyMethods.newWriter(ResourceGroovyMethods.java:1900)
	at org.codehaus.groovy.runtime.dgm$1039.doMethodInvoke(Unknown Source)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1217)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
	at groovy.runtime.metaclass.NextflowDelegatingMetaClass.invokeMethod(NextflowDelegatingMetaClass.java:60)
	at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:44)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:119)
	at Script_1c3df8cd$_runScript_closure31.doCall(Script_1c3df8cd:773)
	at Script_1c3df8cd$_runScript_closure31.doCall(Script_1c3df8cd)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:263)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
	at groovy.lang.Closure.call(Closure.java:405)
	at groovy.lang.Closure.call(Closure.java:399)
	at nextflow.script.WorkflowMetadata$_invokeOnComplete_closure4.doCall(WorkflowMetadata.groovy:369)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:263)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
	at groovy.lang.Closure.call(Closure.java:405)
	at groovy.lang.Closure.call(Closure.java:421)
	at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2296)
	at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2281)
	at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2322)
	at nextflow.script.WorkflowMetadata.invokeOnComplete(WorkflowMetadata.groovy:367)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1217)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
	at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:1011)
	at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:994)
	at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:97)
	at nextflow.script.WorkflowMetadata$_closure3.doCall(WorkflowMetadata.groovy:233)
	at nextflow.script.WorkflowMetadata$_closure3.call(WorkflowMetadata.groovy)
	at nextflow.Session.shutdown0(Session.groovy:737)
	at nextflow.Session.destroy(Session.groovy:689)
	at nextflow.script.ScriptRunner.terminate(ScriptRunner.groovy:297)
	at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:134)
	at nextflow.cli.CmdRun.run(CmdRun.groovy:246)
	at nextflow.cli.Launcher.run(Launcher.groovy:451)
	at nextflow.cli.Launcher.main(Launcher.groovy:633)
Apr-14 12:12:09.392 [main] DEBUG nextflow.trace.StatsObserver - Workflow completed > WorkflowStats[succeedCount=0; failedCount=1; ignoredCount=0; cachedCount=0; succeedDuration=0ms; failedDuration=6ms; cac$
Apr-14 12:12:09.392 [main] DEBUG nextflow.trace.ReportObserver - Flow completing -- rendering html report
Apr-14 12:12:09.430 [main] DEBUG nextflow.trace.ReportObserver - Execution report summary data:

{"fix_strain_names_bulk":{"cpu":null,"mem":null,"vmem":null,"time":{"mean":6,"min":6,"q1":6,"q2":6,"q3":6,"max":6,"minLabel":"fix_strain_names_bulk (BULK TRAIT)","maxLabel":"fix_strain_names_bulk (BULK T$
Apr-14 12:12:10.214 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Apr-14 12:12:10.228 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye


Thatguy027 commented 4 years ago

I think this might be a platform issue, because we developed this pipeline on a Linux cluster. It looks like you are running on a standalone Linux computer, which we have not tested the pipeline on, and I am not sure how much that environment differs from the cluster.

Here are some things I can recommend:

1) Have you verified that Nextflow works on your system? Nextflow has a post-install test that you can run to verify that everything is working. See here

2) If Nextflow is working on your machine, go to the directory where the pipeline failed, /home/jiseon623/cegwas2-nf/work/09/194f45d74a0f318a891e5615bd3045, and try to run the command outside of Nextflow by running the following in that directory:

Rscript --vanilla path/to/this/file/Fix_Isotype_names_bulk.R data.tsv fix

3) Finally, I think a former lab member of mine, @faithman (tagging him here), set up a Docker container that might work more robustly on your personal machine. Here is the link

Let me know what happens when you try these things because it will help us make the pipeline better.
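As a rough aid for step 2, the files Nextflow leaves in a task's work directory can be inspected before re-running anything. Nextflow writes `.command.sh`, `.command.err`, and `.exitcode` there; the helper name `inspect_task` below is made up for this illustration and is not part of the pipeline.

```shell
#!/usr/bin/env bash
# inspect_task: hypothetical helper that prints what Nextflow recorded
# for one task, given that task's work directory.
inspect_task() {
    local workdir=$1
    if [ ! -d "$workdir" ]; then
        echo "no such work dir: $workdir" >&2
        return 1
    fi
    local f
    for f in .exitcode .command.sh .command.err; do
        if [ -f "$workdir/$f" ]; then
            echo "== $f =="
            cat "$workdir/$f"
            echo
        else
            echo "== $f (missing) =="
        fi
    done
}

# Example with the work dir from the log above; it only prints a
# message if that directory does not exist on the current machine.
inspect_task /home/jiseon623/cegwas2-nf/work/09/194f45d74a0f318a891e5615bd3045 || true
```

Comparing `.command.sh` with what runs successfully by hand can help separate a problem in the script itself from a problem in the wrapper environment.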

Jiseon623 commented 4 years ago

Thank you for your advice.

1. I ran tutorial.nf and it worked well.

2. I ran the command you suggested in the work directory. The output is pasted below.

jiseon623@nematode:~/cegwas2-nf/work/e2/f60714f6f3409547771fd56372d8f6$ Rscript --vanilla /home/jiseon623/cegwas2-nf/bin/Fix_Isotype_names_bulk.R ../../../data.tsv fix
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.3.0     ✔ purrr   0.3.3
✔ tibble  3.0.0     ✔ dplyr   0.8.5
✔ tidyr   1.0.2     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Attaching package: ‘data.table’

The following objects are masked from ‘package:dplyr’:

    between, first, last

The following object is masked from ‘package:purrr’:

    transpose

Parsed with column specification:
cols(
  strain = col_character(),
  m2 = col_double(),
  mt = col_double()
)
Downloading Gene Database to ~/.cegwas/cegwas.db
trying URL 'https://storage.googleapis.com/elegansvariation.org/db/_latest.db'
Content type 'application/octet-stream' length 326094848 bytes (311.0 MB)

downloaded 311.0 MB

Warning message:
Grouping rowwise data frame strips rowwise nature

3. I'm not root, so I can't use Docker. Instead, I ran main.nf in two ways: without the nextflow.config file, and with a modified nextflow.config file.

3-1 without nextflow.config

N E X T F L O W ~ version 19.07.0
Launching main.nf [nasty_yalow] - revision: 0a592c2713


C. elegans GWAS pipeline

Phenotype Directory = null
VCF = bin/WI.20180527.impute.vcf.gz
CeNDR Release = 20180527
P3D = true
Significance Threshold = BF
Max AF for Burden Mapping = 0.05
Min Strains with Variant for Burden = 2
Significance Threshold = BF
Gene File = bin/gene_ref_flat.Rda
Result Directory = Analysis_Results-20200420
Eigen Memory allocation = 100 GB

executor > local (16)
[16/224a95] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[fd/2dbaf6] process > vcf_to_geno_matrix (1)             [100%] 1 of 1 ✔
[4c/02c8c7] process > chrom_eigen_variants (IV)          [100%] 6 of 6 ✔
[fc/2f34be] process > collect_eigen_variants             [100%] 1 of 1 ✔
[68/4a79d2] process > rrblup_maps (m2)                   [100%] 2 of 2 ✔
[64/782952] process > summarize_maps                     [  0%] 0 of 1
[-        ] process > prep_ld_files                      -
[-        ] process > rrblup_fine_maps                   -
[-        ] process > concatenate_LD_per_trait           -
[-        ] process > plot_genes                         -
[9d/aaf905] process > burden_mapping (m2)                [100%] 2 of 2 ✔
[e7/c51ddb] process > plot_burden (m2)                   [100%] 2 of 2 ✔
Error executing process > 'summarize_maps'

Caused by: Process summarize_maps terminated with an error exit status (1)

Command executed:

Rscript --vanilla `which Summarize_Mappings.R`

cat *processed_mapping.tsv |\
awk '$0 !~ "\tNA\t" {print}' |\
awk '!seen[$2,$5,$12,$13,$14]++' |\
awk 'NR>1{print $5, $2, $12, $13, $14}' OFS="\t" > QTL_peaks.tsv

sig_maps=`wc -l QTL_peaks.tsv | cut -f1 -d' '`

if [ $sig_maps = 0 ]; then
max_log10=`cat *processed_mapping.tsv | awk 'BEGIN {max = 0} {if ($4>max && $4!= "log10p") max=$4} END {print max}'`
echo "NO TRAITS HAD SIGNIFICANT MAPPINGS - MAXIMUM -log10p IS $max_log10 - CONSIDER SETTING BF THRESHOLD BELOW THIS VALUE"
exit
fi

Command exit status: 1

Command output:
[1] "m2_processed_mapping.tsv" "mt_processed_mapping.tsv"

Command error:
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.3.0     ✔ purrr   0.3.3
✔ tibble  3.0.0     ✔ dplyr   0.8.5
✔ tidyr   1.0.2     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Error: Assigned data c(...) must be compatible with row subscript 1.
✖ 1 row must be assigned.
✖ Assigned data has 18 rows.
ℹ Only vectors of size 1 are recycled.
Backtrace:
    █

  1. ├─base::[<-(...)
  2. └─tibble:::[<-.tbl_df(...)
  3. └─tibble:::tbl_subassign(x, i, j, value, i_arg, j_arg, substitute(value))
  4. └─tibble:::vectbl_recycle_rhs(...)
  5. └─base::tryCatch(...)
  6. └─base:::tryCatchList(expr, classes, parentenv, handlers)
  7. └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
  8. └─value[3L]
Execution halted

Work dir: /home/jiseon623/test/work/64/7829520b2799e5432b5b4eee571481

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

Pipeline execution summary
---------------------------
Completed at: Mon Apr 20 18:02:54 KST 2020
Duration    : 7m 40s
Success     : false
workDir     : /home/jiseon623/test/work
exit status : 1
Error report: Error executing process > 'summarize_maps'

executor > local (16)
[16/224a95] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[fd/2dbaf6] process > vcf_to_geno_matrix (1)             [100%] 1 of 1 ✔
[4c/02c8c7] process > chrom_eigen_variants (IV)          [100%] 6 of 6 ✔
[fc/2f34be] process > collect_eigen_variants             [100%] 1 of 1 ✔
[68/4a79d2] process > rrblup_maps (m2)                   [100%] 2 of 2 ✔
[64/782952] process > summarize_maps                     [100%] 1 of 1, failed: 1 ✘
[-        ] process > prep_ld_files                      -
[-        ] process > rrblup_fine_maps                   -
[-        ] process > concatenate_LD_per_trait           -
[-        ] process > plot_genes                         -
[9d/aaf905] process > burden_mapping (m2)                [100%] 2 of 2 ✔
[e7/c51ddb] process > plot_burden (m2)                   [100%] 2 of 2 ✔
WARN: Access to undefined parameter email -- Initialise it to a default value eg. params.email = some_value

3-2 with modified nextflow.config

file contents:
process.container = 'docker://faithman/cegwas2:latest'
singularity.enabled = true
singularity.cacheDir = "$PWD"

output:

C. elegans GWAS pipeline

Phenotype Directory = null
VCF = bin/WI.20180527.impute.vcf.gz
CeNDR Release = 20180527
P3D = true
Significance Threshold = BF
Max AF for Burden Mapping = 0.05
Min Strains with Variant for Burden = 2
Significance Threshold = BF
Gene File = bin/gene_ref_flat.Rda
Result Directory = Analysis_Results-20200420
Eigen Memory allocation = 100 GB

executor > local (4)
[94/d8122b] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[de/72a90e] process > vcf_to_geno_matrix (1)             [  0%] 0 of 1
[-        ] process > chrom_eigen_variants               -
[-        ] process > collect_eigen_variants             -
[-        ] process > rrblup_maps                        -
[-        ] process > summarize_maps                     -
[-        ] process > prep_ld_files                      -
[-        ] process > rrblup_fine_maps                   -
[-        ] process > concatenate_LD_per_trait           -
[-        ] process > plot_genes                         -
[80/846eaa] process > burden_mapping (mt)                [  0%] 0 of 2
[-        ] process > plot_burden                        -
Error executing process > 'burden_mapping (m2)'

Caused by: Process burden_mapping (m2) terminated with an error exit status (127)

Command executed:

Rscript --vanilla `which makeped.R` pr_m2.tsv

n_strains=`wc -l pr_m2.tsv | cut -f1 -d" "`
min_af=`bc -l <<< "2/($n_strains-1)"`

rvtest \
--pheno m2.ped \
--out m2 \
--inVcf WI.20180527.impute.vcf.gz \
--freqUpper 0.05 \
--freqLower $min_af \
--geneFile refFlat.ws245.txt \
--vt price \
--kernel skat

Command exit status: 127

Command output: (empty)

Command error:
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.2.1     ✔ purrr   0.3.2
✔ tibble  2.1.3     ✔ dplyr   0.8.3
✔ tidyr   0.8.3     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.4.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Parsed with column specification:
cols(
  strain = col_character(),
  m2 = col_double()
)
.command.sh: line 5: bc: command not found

Work dir: /home/jiseon623/test2/work/18/c2b6c23d8b8bad167383f02099c71e

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

Pipeline execution summary
---------------------------
Completed at: Mon Apr 20 15:29:32 KST 2020
Duration    : 8.4s
Success     : false
workDir     : /home/jiseon623/test2/work
exit status : 127
Error report: Error executing process > 'burden_mapping (m2)'

Caused by: Process burden_mapping (m2) terminated with an error exit status (127)

Command executed:

Rscript --vanilla which makeped.R pr_m2.tsv

n_strains=wc -l pr_m2.tsv | cut -f1 -d" " min_af=bc -l <<< "2/($n_strains-1)"

rvtest \ --pheno m2.ped \ executor > local (4) [94/d8122b] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔ [de/72a90e] process > vcf_to_geno_matrix (1) [100%] 1 of 1, failed: 1 [- ] process > chrom_eigen_variants - [- ] process > collect_eigen_variants - [- ] process > rrblup_maps - [- ] process > summarize_maps - [- ] process > prep_ld_files - [- ] process > rrblup_fine_maps - [- ] process > concatenate_LD_per_trait - [- ] process > plot_genes - [80/846eaa] process > burden_mapping (mt) [100%] 2 of 2, failed: 2 [- ] process > plot_burden - WARN: Access to undefined parameter email -- Initialise it to a default value eg. params.email = some_value WARN: Killing pending tasks (2) Error executing process > 'burden_mapping (m2)'

Caused by: Process burden_mapping (m2) terminated with an error exit status (127)

Command executed:

Rscript --vanilla `which makeped.R` pr_m2.tsv

n_strains=`wc -l pr_m2.tsv | cut -f1 -d" "`
min_af=`bc -l <<< "2/($n_strains-1)"`

rvtest \
--pheno m2.ped \
--out m2 \
--inVcf WI.20180527.impute.vcf.gz \
--freqUpper 0.05 \
--freqLower $min_af \
--geneFile refFlat.ws245.txt \
--vt price \
--kernel skat

Command exit status: 127

Command output: (empty)

Command error:
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.2.1 ✔ purrr 0.3.2
✔ tibble 2.1.3 ✔ dplyr 0.8.3
✔ tidyr 0.8.3 ✔ stringr 1.4.0
✔ readr 1.3.1 ✔ forcats 0.4.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
Parsed with column specification:
cols(
  strain = col_character(),
  m2 = col_double()
)
.command.sh: line 5: bc: command not found

Work dir: /home/jiseon623/test2/work/18/c2b6c23d8b8bad167383f02099c71e

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
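
Exit status 127 is the shell's code for "command not found", and the log above names bc as the missing program. A minimal sketch for checking that the external tools invoked by this step are on the PATH before re-running; the helper name check_tools is hypothetical, and the tool list is only what appears in the command shown above, not a complete list of the pipeline's dependencies:

```shell
#!/usr/bin/env bash
# check_tools is a hypothetical helper, not part of cegwas2-nf.
# It prints one "missing: <name>" line per tool that is not on the PATH
# and returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Tools taken from the burden_mapping command in the log (assumed incomplete).
check_tools Rscript bc rvtest || echo "install the tools listed above, then re-run with -resume"
```

Once nothing is reported as missing, the -resume option mentioned in the tip above avoids repeating the steps that already completed.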

Thatguy027 commented 4 years ago

Hi again and sorry for the delays.

It looks like something has changed since your original post, as you have made it further through the pipeline, which is indicated by:

[16/224a95] process > fix_strain_names_bulk (BULK TRAIT) [100%] 1 of 1 ✔
[fd/2dbaf6] process > vcf_to_geno_matrix (1)             [100%] 1 of 1 ✔
[4c/02c8c7] process > chrom_eigen_variants (IV)          [100%] 6 of 6 ✔
[fc/2f34be] process > collect_eigen_variants             [100%] 1 of 1 ✔
[68/4a79d2] process > rrblup_maps (m2)                   [100%] 2 of 2 ✔
[64/782952] process > summarize_maps                     [100%] 1 of 1, failed: 1 ✘
[-        ] process > prep_ld_files                      -
[-        ] process > rrblup_fine_maps                   -
[-        ] process > concatenate_LD_per_trait           -
[-        ] process > plot_genes                         -
[9d/aaf905] process > burden_mapping (m2)                [100%] 2 of 2 ✔
[e7/c51ddb] process > plot_burden (m2)                   [100%] 2 of 2 ✔

I am curious to know what changed.

Regarding the current error, I am wondering if a significant QTL was identified by the mapping pipeline. I can't recall if I included a "Terminate pipeline if no significant QTL were identified" in the script. This might explain the current issue.
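
One way to check this directly from the mapping output is sketched below. It assumes that column 4 of each tab-separated *processed_mapping.tsv file holds the -log10(p) value under the header log10p (which is how the pipeline's own awk step treats it), and that the threshold is read off the manhattan plot by hand. The function names max_log10p and count_above are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helpers; column 4 is ASSUMED to be log10p in *processed_mapping.tsv.

# Print the largest log10p value across the given files (NA if none is numeric).
max_log10p() {
  awk -F'\t' '
    FNR > 1 && $4 != "NA" && ($4 + 0 > max || !seen) { max = $4 + 0; seen = 1 }
    END { if (seen) print max; else print "NA" }
  ' "$@"
}

# Count the markers whose log10p exceeds the threshold given as first argument.
count_above() {
  thr=$1
  shift
  awk -F'\t' -v thr="$thr" '
    FNR > 1 && $4 != "NA" && $4 + 0 > thr + 0 { n++ }
    END { print n + 0 }
  ' "$@"
}

# Example usage inside the failed task directory (threshold value is a placeholder):
#   max_log10p *processed_mapping.tsv
#   count_above 5 *processed_mapping.tsv
```

If count_above reports 0 for the threshold in use, the failure is more likely to be the "no significant QTL" case than a problem with the machine setup.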

A couple of things to look for:

Check out the plots that are output in the Mappings/Plots folder. Are any markers above the horizontal significance threshold line? Are there any pxgplot.pdf files, which would suggest a significant QTL?

If the answer to these questions is no, then you can try lowering the significance threshold using the --sthresh=EIGEN flag when executing the nextflow pipeline. This threshold is usually between 4 and 5 on the y axis of the manhattan plot, so if some markers are above that, you should identify marginally significant QTL using the --sthresh=EIGEN flag.

If the answer is yes, a significant QTL was identified but the pipeline still failed, then I would suggest entering the /home/jiseon623/test/work/64/7829520b2799e5432b5b4eee571481 directory where the pipeline failed and executing the commands as you did last time to see if they work on your current machine.
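
As a rough sketch of that inspection step: Nextflow leaves hidden files such as .exitcode, .command.err and .command.sh in each task's work directory, so a small helper can print them before anything is re-run by hand. The function name inspect_task is hypothetical:

```shell
#!/usr/bin/env bash
# inspect_task is a hypothetical helper, not part of cegwas2-nf.
# It summarises the files Nextflow writes into a task work directory.
inspect_task() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "no such work dir: $dir" >&2
    return 1
  fi
  echo "exit code: $(cat "$dir/.exitcode" 2>/dev/null || echo unknown)"
  echo "--- last lines of .command.err ---"
  tail -n 20 "$dir/.command.err" 2>/dev/null
  echo "--- script that was run (.command.sh) ---"
  cat "$dir/.command.sh" 2>/dev/null
  return 0
}

# Example usage with the directory of the failed summarize_maps task:
#   inspect_task /home/jiseon623/test/work/64/7829520b2799e5432b5b4eee571481
```

The commands printed from .command.sh can then be run one at a time inside that directory to see which one fails.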

Please let me know how this troubleshooting goes, as we are gearing up to revamp this entire workflow.

Thanks for your patience and your help!

Jiseon623 commented 4 years ago

Hi

Thank you for your help and suggestions

I thought the nextflow.config file was causing the failure of the first process, so I tried running main.nf in a directory that does not contain a nextflow.config file. I'm sorry that my post was so long that it was hard to tell I had run it without the config file.

In Mapping/Plots, I have files named "(trait)_manplot.pdf", but no pxgplot.pdf file. In the manplot files, several points marked in red are above the threshold.

When I ran "Rscript --vanilla /home/jiseon623/test/bin/Summarize_Mappings.R" in "~/test/work/64/7829520b2799e5432b5b4eee571481", the result was:

jiseon623@nematode:~/test/work/64/7829520b2799e5432b5b4eee571481$ Rscript --vanilla /home/jiseon623/test/bin/Summarize_Mappings.R
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.3.0 ✔ purrr 0.3.3
✔ tibble 3.0.0 ✔ dplyr 0.8.5
✔ tidyr 1.0.2 ✔ stringr 1.4.0
✔ readr 1.3.1 ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
[1] "m2_processed_mapping.tsv" "mt_processed_mapping.tsv"
Error: Assigned data c(...) must be compatible with row subscript 1.
✖ 1 row must be assigned.
✖ Assigned data has 18 rows.
ℹ Only vectors of size 1 are recycled.
Backtrace:
    █
 1. ├─base::[<-(...)
 2. └─tibble:::[<-.tbl_df(...)
 3.   └─tibble:::tbl_subassign(x, i, j, value, i_arg, j_arg, substitute(value))
 4.     └─tibble:::vectbl_recycle_rhs(...)
 5.       └─base::tryCatch(...)
 6.         └─base:::tryCatchList(expr, classes, parentenv, handlers)
 7.           └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
 8.             └─value[3L]
Execution halted

Thanks again for your help and suggestions

Thatguy027 commented 4 years ago

Interesting...

At this point it seems like the issue is at a post-mapping processing step.

It will be difficult for me to offer more suggestions without doing some troubleshooting of my own.

Usually, no pxgplot.pdf output means there were no significant QTL detected. It would make sense for the pipeline to crash at the summarize mapping step if there are no QTL above the BF threshold. But you mentioned there are red dots in the manhattan plot, so I am confused about that. If QTL above the threshold are detected, you should also see blue regions surrounding each QTL that correspond to genomic regions that are processed further. If you do not see these in your manhattan plot, try running the pipeline again with --sthresh=EIGEN, which will lower the threshold for post-mapping QTL processing. Note that you can also add the -resume flag so you don't have to re-run the steps that already completed.
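
For reference, a sketch of what that re-run could look like. It reuses the flags from the original command in this issue and only swaps the threshold and adds -resume. The function name rerun_cmd is hypothetical and the trait file path is a placeholder, so substitute the files from the failing run. The helper only prints the command so it can be reviewed before launching:

```shell
#!/usr/bin/env bash
# rerun_cmd is a hypothetical helper: it prints (does not execute) the
# nextflow command for a given trait file, VCF and --sthresh value.
rerun_cmd() {
  traitfile=$1
  vcf=$2
  sthresh=$3
  printf 'NXF_VER=19.07.0 nextflow main.nf --traitfile=%s --vcf=%s --p3d=TRUE --sthresh=%s -resume\n' \
    "$traitfile" "$vcf" "$sthresh"
}

# path/to/traits.tsv is a placeholder, not a file from this thread.
rerun_cmd path/to/traits.tsv bin/WI.20180527.impute.vcf.gz EIGEN
```

Run the printed line from the same directory as the earlier attempt, since -resume relies on the existing work directory to skip completed tasks.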

If this is not the issue, please let me know what files are in the summarize mapping directory and if you would be willing to share your data so I can see what the issue is.

Jiseon623 commented 4 years ago

Hello

When I ran the pipeline with --sthresh=BF, I couldn't see any blue regions in the manplot files. So I lowered the threshold to 3, and then I could see pxgplot files and blue regions surrounding the red dots in the manplot files. However, the summarize_maps process failed again with the same error as before. There were only trait_processed_mapping.tsv files in the directory where the pipeline failed, and I ran the commands below in that directory. (I'm not sure whether "summarize mapping directory" means this directory.)

Rscript --vanilla /home/jiseon623/cegwas2-nf/bin/Summarize_Mappings.R

cat *processed_mapping.tsv | awk '$0 !~ "\tNA\t" {print}' | awk '!seen[$2,$5,$12,$13,$14]++' | awk 'NR>1{print $5, $2, $12, $13, $14}' OFS="\t" > QTL_peaks.tsv

sig_maps=`wc -l QTL_peaks.tsv | cut -f1 -d' '`

if [ "$sig_maps" = 0 ]; then
max_log10=`cat *processed_mapping.tsv | awk 'BEGIN {max = 0} {if ($4>max && $4!= "log10p") max=$4} END {print max}'`
echo "NO TRAITS HAD SIGNIFICANT MAPPINGS - MAXIMUM -log10p IS $max_log10

Then, a QTL_peaks.tsv file containing information (start, peak, and end positions) on the QTL peaks was created in the directory.

Sorry, I didn't understand exactly what "your data" means.

Thanks