bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

varscan pointing to directory that does not exist #2712

Closed lento2002 closed 5 years ago

lento2002 commented 5 years ago

bcbio version 1.1.4a. See the job output below. It failed about 18 hours into an exome-seq run on a 36-core, 200 GB machine. It seems to be trying to reference a directory that is not there?

Greatly appreciate any insight you may have. Thanks! R/Mike

#########bcbio-nextgen.log (tail)

[2019-03-05T03:34Z] blurb-normal: Assigned coverage as 'regional' with 16.8% genome coverage and 0.0% offtarget coverage
[2019-03-05T03:44Z] blurb-tumor: Assigned coverage as 'regional' with 19.5% genome coverage and 0.0% offtarget coverage
[2019-03-05T03:54Z] multiprocessing: combine_sample_regions
[2019-03-05T03:54Z] Identified 300 parallel analysis blocks
Block sizes:
  min: 294
  5%: 1247.25
  25%: 60770.0
  median: 16086778.5
  75%: 16087919.75
  95%: 32175363.8
  99%: 48261141.97
  max: 64358814
Between block sizes:
  min: 252
  5%: 279.35
  25%: 375.75
  median: 607.0
  75%: 990.25
  95%: 3018.700000000004
  99%: 18868.930000000008
  max: 86713

[2019-03-05T03:54Z] multiprocessing: calculate_sv_bins
[2019-03-05T03:54Z] multiprocessing: calculate_sv_coverage
[2019-03-05T03:54Z] multiprocessing: normalize_sv_coverage
[2019-03-05T03:54Z] Timing: hla typing
[2019-03-05T03:54Z] multiprocessing: call_hla
[2019-03-05T03:58Z] Timing: alignment post-processing
[2019-03-05T03:58Z] multiprocessing: piped_bamprep
[2019-03-05T03:58Z] Timing: variant calling
[2019-03-05T03:58Z] multiprocessing: variantcall_sample
[2019-03-05T08:30Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/mnt/fsx/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/mnt/fsx/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command '/mnt/fsx/share/bcbio/anaconda/envs/python2/bin/python /home/bcbioadmin/blurb170_exomeseq/config/bcbiotx/tmptgLkkI/blurb1-chrUn_KI270752v1_0_10101-work/runWorkflow.py -m local -j 1 --quiet ' returned non-zero exit status 1
[root@ip-10-0-0-44 log]#

#######Calling the command manually from command line (No such file or directory?)

[root@ip-10-0-0-44 log]# mnt/fsx/share/bcbio/anaconda/envs/python2/bin/python /home/bcbioadmin/blurb170_exomeseq/config/bcbiotx/tmptgLkkI/blurb1-chrUn_KI270752v1_0_10101-work/runWorkflow.py -m local -j 1 --quiet
bash: mnt/fsx/share/bcbio/anaconda/envs/python2/bin/python: No such file or directory

[root@ip-10-0-0-44 log]# bcbio_nextgen.py -v
1.1.4a
[root@ip-10-0-0-44 log]# blurb

#########Slurm job output file (tail)

[2019-03-05T08:51Z] 08:51:49.852 INFO ProgressMeter - Traversal complete. Processed 13598 total variants in 0.0 minutes.
[2019-03-05T08:51Z] 08:51:49.861 INFO FilterMutectCalls - Shutting down engine
[2019-03-05T08:51Z] [March 5, 2019 8:51:49 AM UTC] org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls done. Elapsed time: 0.03 minutes.
[2019-03-05T08:51Z] Runtime.totalMemory()=691404800
[2019-03-05T08:51Z] Tool returned:
[2019-03-05T08:51Z] SUCCESS
[2019-03-05T08:51Z] Filtering MuTect2 calls with allele fraction threshold of 0.1
[2019-03-05T08:51Z] bgzip blurb1-chr9_124877849_138394717.vcf
[2019-03-05T08:51Z] tabix index blurb1-chr9_124877849_138394717.vcf.gz
[2019-03-05T08:51Z] Genotyping with varscan: ('chr2', 96531193, 112620159) blurb-normal-sort.bam
[root@ip-10-0-0-44 config]#

lento2002 commented 5 years ago

The job is also still technically 'running'. Is this expected behavior?

[root@ip-10-0-0-44 config]# squeue
JOBID  PARTITION  NAME      USER      ST  TIME        NODES  NODELIST(REASON)
148    all        test.lon  root      R   3-00:26:23  1      ip-10-0-3-66
150    all        mf-170es  bcbioadm  R   20:48:42    1      ip-10-0-3-69
[root@ip-10-0-0-44 config]#

chapmanb commented 5 years ago

Thanks for the report and apologies about the issues. The error message looks like it's failing during strelka2 calling, since runWorkflow.py is part of that process. bcbio runs these processes in temporary directories, which get removed on failure or re-run, so the work directory named in the error no longer exists afterwards.

If you just need to get the analysis done, my suggestion is to remove strelka2 from the failing sample and re-run. If you want to debug, you can isolate the sample and try running strelka2 manually without the --quiet flag and it will provide much more verbose output about why it's failing.
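As a rough sketch of that debugging step, the snippet below takes a failed command line copied from the log, drops the `--quiet` flag, and checks that every path in it still exists before re-running it. The helper names (`build_debug_command`, `missing_paths`, `rerun_verbose`) are hypothetical and not part of bcbio or strelka2; the only flags assumed are the ones shown in the logged command. Since the temporary work directory is removed after a failure, the path check will normally report it as missing until the workflow has been regenerated, and a path typed without its leading slash will be reported the same way.

```python
import os
import shlex
import subprocess


def build_debug_command(failed_cmd):
    """Split a logged command line into arguments and drop --quiet."""
    return [arg for arg in shlex.split(failed_cmd) if arg != "--quiet"]


def missing_paths(args):
    """Return the path-like arguments (those containing a slash) that do not exist."""
    return [arg for arg in args if "/" in arg and not os.path.exists(arg)]


def rerun_verbose(failed_cmd):
    """Re-run the logged command without --quiet and return its exit code.

    Raises FileNotFoundError first if the interpreter or script path is gone,
    which is the expected state once the temporary directory has been cleaned up.
    """
    args = build_debug_command(failed_cmd)
    missing = missing_paths(args)
    if missing:
        raise FileNotFoundError("Not found: " + ", ".join(missing))
    # Output is not captured, so the tool's messages go straight to the terminal.
    return subprocess.run(args, check=False).returncode
```

Calling `rerun_verbose()` with the command string from the `CalledProcessError` line either runs the workflow with its normal output or names the path that is missing, which separates a cleaned-up work directory from a genuine strelka2 failure.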

Thanks for the help debugging this.