bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License
994 stars 354 forks source link

Variant calling failing at combining the individual chromosomal level alterations into a single vcf file #2800

Closed audyavar closed 5 years ago

audyavar commented 5 years ago

Hello -

I am trying to run BCBIO 1.1.6 version on SLURM/Lustre file system. Many of my samples completed runs successfully with 3 different variant callers - Mutect2, Strelka2, Varscan and then creating an ensemble VCF with snpeff annotation. One of my samples consistently fails (with any bcbio 1.1.3-1.1.6 version) on the step for combining variants into a single vcf file from individual chromosomes. See dump below. Any advice?

Debug-log file: 2019-04-25T17:01Z] Configuring 1 jobs to run, using 1 cores each with 1.00g of memory reserved for each job [2019-04-25T17:01Z] Combine variant files [2019-04-25T17:01Z] INFO 2019-04-25 17:01:48 MergeVcfs [2019-04-25T17:01Z] ** NOTE: Picard's command line syntax is changing. [2019-04-25T17:01Z] ** [2019-04-25T17:01Z] ** For more information, please see: [2019-04-25T17:01Z] ** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition) [2019-04-25T17:01Z] ** [2019-04-25T17:01Z] ** The command line looks like this in the new syntax: [2019-04-25T17:01Z] ** [2019-04-25T17:01Z] ** MergeVcfs -D /mnt/fsx/share/bcbio/genomes/Hsapiens/hg38/seq/hg38.dict -O /mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/bcbiotx/tmpg_4wwbuk/pact1-chr2_32179835_48267342-raw.vcf.gz -I /mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/strelka2/chr2/pact1-chr2_32179835_48267342-work/results/variants/somatic.snvs-fixed.vcf.gz -I /mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/strelka2/chr2/pact1-chr2_32179835_48267342-work/results/variants/somatic.indels-fixed.vcf.gz [2019-04-25T17:01Z] ** [2019-04-25T17:01Z] 17:01:48.377 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/fsx/share/bcbio/anaconda/share/picard-2.19.0-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so [2019-04-25T17:01Z] [Thu Apr 25 17:01:48 UTC 2019] MergeVcfs INPUT=[/mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/strelka2/chr2/pact1-chr2_32179835_48267342-work/results/variants/somatic.snvs-fixed.vcf.gz, /mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/strelka2/chr2/pact1-chr2_32179835_48267342-work/results/variants/somatic.indels-fixed.vcf.gz] OUTPUT=/mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/bcbiotx/tmpg_4wwbuk/pact1-chr2_32179835_48267342-raw.vcf.gz SEQUENCE_DICTIONARY=/mnt/fsx/share/bcbio/genomes/Hsapiens/hg38/seq/hg38.dict VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=true CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false [2019-04-25T17:01Z] [Thu Apr 25 17:01:48 UTC 2019] Executing as ?@ip-10-0-3-79.us-west-2.compute.internal on Linux 3.10.0-957.5.1.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_192-b01; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.19.0-SNAPSHOT [2019-04-25T17:01Z] INFO 2019-04-25 17:01:48 MergeVcfs Processed 10,000 records. Elapsed time: 00:00:00s. Time for last 10,000: 0s. Last read position: chr2:37,130,371 [2019-04-25T17:01Z] INFO 2019-04-25 17:01:49 MergeVcfs Processed 20,000 records. Elapsed time: 00:00:00s. Time for last 10,000: 0s. Last read position: chr2:42,245,306 [2019-04-25T17:01Z] INFO 2019-04-25 17:01:49 MergeVcfs Processed 30,000 records. Elapsed time: 00:00:00s. Time for last 10,000: 0s. Last read position: chr2:45,763,137 [2019-04-25T17:01Z] [Thu Apr 25 17:01:49 UTC 2019] picard.vcf.MergeVcfs done. Elapsed time: 0.02 minutes. [2019-04-25T17:01Z] Runtime.totalMemory()=760217600 [2019-04-25T17:01Z] Filtering Strelka2 calls with allele fraction threshold of 0.1 [2019-04-25T17:01Z] bgzip pact1-chr2_32179835_48267342.vcf [2019-04-25T17:01Z] tabix index pact1-chr2_32179835_48267342.vcf.gz [2019-04-25T17:01Z] Genotyping with varscan: ('chr1', 0, 16090171) pact-normal-sort.bam

Dump: File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 284, in variantcall_sample return genotype.variantcall_sample(*args) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 377, in variantcall_sample out_file = caller_fn(align_bams, items, ref_file, assoc_files, region, out_file) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/strelka2.py", line 30, in run call_file = _run_somatic(paired, ref_file, assoc_files, region, call_file, strelka_work_dir) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/strelka2.py", line 313, in _run_somatic _run_workflow(paired.tumor_data, workflow_file, tx_work_dir) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/strelka2.py", line 326, in _run_workflow do.run(cmd, "Run Strelka2: %s" % dd.get_sample_name(data)) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run _do_run(cmd, checks, log_stdout, env=env) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run raise subprocess.CalledProcessError(exitcode, error_msg) subprocess.CalledProcessError: Command '/mnt/fsx/share/bcbio/anaconda/envs/python2/bin/python /mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/bcbiotx/tmpde_776le/pact1-chrUn_KI270752v1_0_11158-work/runWorkflow.py -m local -j 1 --quiet ' returned non-zero exit status 1. """

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "/mnt/fsx/share/bcbio/anaconda/bin/bcbio_nextgen.py", line 238, in main(kwargs) File "/mnt/fsx/share/bcbio/anaconda/bin/bcbio_nextgen.py", line 46, in main run_main(kwargs) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 45, in run_main fc_dir, run_info_yaml) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 89, in _run_toplevel for xs in pipeline(config, run_info_yaml, parallel, dirs, samples): File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 152, in variant2pipeline samples = genotype.parallel_variantcall_region(samples, run_parallel) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 208, in parallel_variantcall_region "vrn_file", ["region", "sam_ref", "config"])) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/split.py", line 35, in grouped_parallel_split_combine final_output = parallel_fn(parallel_name, split_args) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel return run_multicore(fn, items, config, parallel=parallel) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items): File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 934, in call self.retrieve() File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve self._output.extend(job.get(timeout=self.timeout)) File "/mnt/fsx/share/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 670, in get raise self._value subprocess.CalledProcessError: Command '/mnt/fsx/share/bcbio/anaconda/envs/python2/bin/python /mnt/fsx/scratch_home/bcbioadmin/PACT131/exomeseq/config/bcbiotx/tmpde_776le/pact1-chrUn_KI270752v1_0_11158-work/runWorkflow.py -m local -j 1 --quiet ' returned non-zero exit status 1.

naumenko-sa commented 5 years ago

Hi Akshata!

It looks like a data issue rather than bcbio issue, since just one sample failed out of many. Have you tried to investigate the quality of the problematic sample? What makes it different from the other samples? Coverage, number of variants?

What is the last command in bcbio-nextgen-commands.log?

S

chapmanb commented 5 years ago

Akshata; I'm in agreement with Sergey and wanted to add that it looks like it's failing when running strelka2 in the runWorkflow.py step. Unfortunately strelka2 is a bit hard to debug only from this output since we turn on the --quiet flag to avoid the very verbose standard output. This is likely a data quality issue and there is something strelka2 doesn't like about the failing sample. If you can identify the one that's failing by running with a single core (bcbio_nextgen.py -n 1) and then run outside of bcbio without the --quiet flag, it could provide additional details about what is causing the problem. Hope this helps.

roryk commented 5 years ago

Thanks, looks like this was covered with some advice in #2801.