bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License
990 stars 354 forks source link

Mutect2 error: java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN #2829

Closed amizeranschi closed 5 years ago

amizeranschi commented 5 years ago

Hello,

I'm running into an error with mutect2 (java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN) when using a pair of FASTQ files with yeast sequence reads. Other callers (vardict, freebayes, strelka2) seem to work fine with these files. Haplotypecaller also works fine in germline VC mode.

I have GATK 4.1.1.0. I'm guessing these GATK issues is related: https://github.com/broadinstitute/gatk/issues/5821 and https://github.com/broadinstitute/gatk/issues/5880, but I'm not sure how I can get mutect2 to work with these files. Any idea what could be causing this?

Note: I'm using single-end reads (this came up in a separate issue: https://github.com/broadinstitute/gatk/issues/5553#issuecomment-485581531) and I've set up two batches, one treating a FASTQ file as normal and the other as tumor, and the second one with the samples reversed.

This is the stack trace:

21:53:45.495 INFO  FilterMutectCalls - Done initializing engine
21:53:45.887 INFO  ProgressMeter - Starting traversal
21:53:45.889 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
21:53:45.890 INFO  FilterMutectCalls - Starting pass 0 through the variants
21:53:46.089 INFO  FilterMutectCalls - Finished pass 0 through the variants
21:53:46.169 INFO  FilterMutectCalls - Shutting down engine
[May 20, 2019 9:53:46 PM EEST] org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls done. Elapsed time: 0.07 minutes.
Runtime.totalMemory()=922484736
java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN
    at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:1023)
    at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:995)
    at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:999)
    at org.broadinstitute.hellbender.utils.MathUtils.normalizeLog10(MathUtils.java:1098)
    at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.clusterProbabilities(SomaticClusteringModel.java:201)
    at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.learnAndClearAccumulatedData(SomaticClusteringModel.java:120)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.learnParameters(Mutect2FilteringEngine.java:152)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.afterNthPass(FilterMutectCalls.java:151)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:35)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:984)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/share/gatk4-4.1.1.0-0/gatk-package-4.1.1.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xms909m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=/export/home/ncit/external/a.mizeranschi/yeast/CNV/BRF-BRS-somatic-CNV/work/bcbiotx/tmp50fhhyub -jar /export/home/ncit/external/a.mizeranschi/bcbio_nextgen/anaconda/share/gatk4-4.1.1.0-0/gatk-package-4.1.1.0-local.jar FilterMutectCalls --reference /export/home/ncit/external/a.mizeranschi/bcbio_nextgen/genomes/Scerevisiae/sacCer3_BRF/seq/sacCer3_BRF.fa --variant /export/home/ncit/external/a.mizeranschi/yeast/CNV/BRF-BRS-somatic-CNV/work/bcbiotx/tmp50fhhyub/BRS-vs-BRF-som-NODE_1409_length_236_cov_0.961326_0_236-raw.vcf.gz --output /export/home/ncit/external/a.mizeranschi/yeast/CNV/BRF-BRS-somatic-CNV/work/bcbiotx/tmp50fhhyub/BRS-vs-BRF-som-NODE_1409_length_236_cov_0.961326_0_236-raw-filt.vcf.gz
' returned non-zero exit status 3.

And this is the YAML template that I use:

# Template for whole genome Illumina variant calling with GATK pipeline
---
details:
  - analysis: variant2
    genome_build: sacCer3_BRF
    resources:
      default:
        memory: 3G
        cores: 32
        jvm_opts: ["-Xms1000m", "-Xmx3500m"]
    algorithm:
      aligner: bwa
      effects: false
      mark_duplicates: true
      recalibrate: false
      realign: false
      variantcaller:
        somatic: mutect2
      ensemble:
        numpass: 2
      ploidy: 2
      tools_off: [gemini, contamination, peddy]
      tools_on: svplots
      min_allele_fraction: 1
      svcaller: [cnvkit, seq2c]
      variant_regions: /export/home/ncit/external/a.mizeranschi/yeast/GTF/variant_regions_BRF.bed
      sv_regions: /export/home/ncit/external/a.mizeranschi/yeast/GTF/sv_regions_genes_contigs_BRF.bed
naumenko-sa commented 5 years ago

Hello!

Sorry, maybe a little on the off-topic side, however, I'm curious:

  1. Why are you using somatic calling on yeast? Isn't HC enough?
  2. Is it diploid or haploid strain?
  3. What is the estimate of the polymorphism level in the yeast population?

Sergey

amizeranschi commented 5 years ago

Hi Sergey,

I'm not sure about the polymorphism level in the yeast population. The strain we're using is diploid and we found evidence for high allelism in the two samples that we're trying to analyze. I already ran HC on them and, in addition, I wanted to try out Mutect2 to see if there are any low-frequency variants present in one sample and not the other.

Alex

naumenko-sa commented 5 years ago

Hi Alex!

Thanks for reporting the issue and expanding the usage of bcbio into yeast genomics!

Unfortunately, it is not easy to debug such a unique case: single end reads on a custom yeast reference coupled with somatic(cancer) variant calling with mutect2. (Wow!)

Hopefully, somebody will reproduce it in a human context and find the source of the issue.

Some thoughts for now - have you tried those methods to increase sensititvity?

1) to use ensemble variant calling 1 of 3 or even 1 of 7 tools, if you want to achieve really high sensitivity without opening the can of worms of somatic calling (use multiple callers: gatk, samtools, freebayes, up to 7 and set up the threshold to call a variant to 1).

2) Also, you may try to call variants in many samples together as a 'family'. There improved coverage and occurrence of variants in 2 samples may increase their chance to be picked up.

3) The third option is to consider relaxed filters in your vcf (expand your analysis beyond only PASS variants). The filters were tunned for human genome for the best sensitivity/specificity balance, and might not reflect the reality of variant calling it Yeast.

However, I hope you remember that gain in sensitivity does not come without loss in specificity (i.e. you will be getting more false positive trash calls).

Buy the way, have you done any measurements of variant calling precision and sensitivity in yeast with bcbio? (See the examples here, https://github.com/bcbio/bcbio_validations).

If you have done it, we might publish the results among the other validations, underlining this unique use case of bcbio.

Sergey

chapmanb commented 5 years ago

Thanks much for this report and apologies about the issue. This looks related to #2832 and we were able to reproduce when the MuTect2 associated .stats file has 0 callable reads. The latest development has a workaround which avoid this by flooring this stat at 1. It also leaves the intermediate files in the work directory (instead of transactional directory) to make it easier to debug if it still fails. If you still have issues after updating please share the input *.vcf.gz and *.vcf.gz.stats files to FilterMutectCalls and we can investigate more. Thanks again and hope this gets your analysis done.

amizeranschi commented 5 years ago

Thanks a lot for the update. The MuTect2 issue went away after upgrading to the latest development version.

mbootwalla commented 5 years ago

Hi Brad. I updated to the latest version of bcbio and am still getting this error. I am doing tumor only calling using mutect2 on paired-end data. Version of bcbio is 1.1.6a and version of GATK is v4.1.2.0 Here's the variants from the problematic vcf -

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TUMOR_SAMPLE

13 19020627 . A G . . DP=5;ECNT=1;FS=0.000;MBQ=0,31;MFRL=0,135;MMQ=60,58;MPOS=16;MQ=62.35;MQ0=0;POPAF=7.30;TLOD=14.62 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,5:0.857:5:0,3:0,2:0,0,3,2 13 19020776 . T G . . DP=3;ECNT=3;FS=0.000;MBQ=0,31;MFRL=0,230;MMQ=60,58;MPOS=8;MQ=58.56;MQ0=0;POPAF=7.30;TLOD=10.03 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,3:0.800:3:0,1:0,2:0,0,2,1 13 19020857 . G A . . ClippingRankSum=-0.431;DP=4;ECNT=3;FS=0.000;MBQ=31,30;MFRL=192,284;MMQ=58,64;MPOS=21;MQ=61.62;MQ0=0;MQRankSum=0.431;POPAF=7.30;ReadPosRankSum=-0.674;TLOD=5.48 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:2,2:0.500:4:0,1:2,1:1,1,1,1 13 19020890 . G A . . DP=1;ECNT=3;FS=0.000;MBQ=0,30;MFRL=0,154;MMQ=60,45;MPOS=15;MQ=45.00;MQ0=0;POPAF=7.30;TLOD=3.18 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,1:0.667:1:0,0:0,1:0,0,0,1 13 19021019 . AG CA . . DP=1;ECNT=1;FS=0.000;MBQ=0,31;MFRL=0,284;MMQ=60,70;MPOS=48;MQ=70.00;MQ0=0;POPAF=7.30;TLOD=4.20 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,1:0.667:1:0,0:0,1:0,0,0,1 13 19021223 . T C . . DP=1;ECNT=1;FS=0.000;MBQ=0,31;MFRL=0,171;MMQ=60,70;MPOS=32;MQ=70.00;MQ0=0;POPAF=7.30;TLOD=3.28 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,1:0.667:1:0,1:0,0:0,0,1,0 13 19021651 . A G . . DP=1;ECNT=1;FS=0.000;MBQ=0,31;MFRL=0,208;MMQ=60,59;MPOS=14;MQ=59.00;MQ0=0;POPAF=7.30;TLOD=3.28 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,1:0.667:1:0,0:0,1:0,0,0,1 13 19022290 . C G . . DP=5;ECNT=1;FS=0.000;MBQ=0,30;MFRL=0,152;MMQ=60,70;MPOS=21;MQ=66.93;MQ0=0;POPAF=7.30;TLOD=16.29 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,5:0.857:5:0,3:0,2:0,0,5,0

Here's the output from the stats file for the same -

statistic value callable 1.0

Let me know if you need any further information to help debug this.

Thanks,

Moiz

roryk commented 5 years ago

Hi Moiz,

Could you post up the command that is failing along with the broken VCF file so we can have a mini test set while we try to fix this?

mbootwalla commented 5 years ago

Hi Rory,

Here's the command that I see in the bcbio logs -

java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xms681m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm -jar bcbio_1.1.5/anaconda/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar FilterMutectCalls --reference bcbio_1.1.5/genomes/Hsapiens/hs37d5/seq/hs37d5.fa --variant Tumor-raw.vcf.gz --output Tumor-raw-filt.vcf.gz

Let me know if you need anything else.

Thanks,

Moiz

roryk commented 5 years ago

Thanks, could you post up the VCF file that you sampled from above, but attach the actual gzipped file? That way we can use that to test against.

mbootwalla commented 5 years ago

Hi Rory,

That's all the variants in the vcf. I can attach the actual vcf if you need it.

roryk commented 5 years ago

Yup! The actual VCF would be helpful, just so we'll have the header and it wil be formatted right and all of that.

mbootwalla commented 5 years ago

Hi Rory,

I have attached the bgzipped vcf file. Let me know if you need anything else.

Thanks,

Moiz Tumor-raw.vcf.gz.gz

roryk commented 5 years ago

Sorry, could you post the error that you are seeing as well? The bcbio-nextgen-debug.log file would be helpful to look at.

mbootwalla commented 5 years ago

Hi Rory,

The sample identifiers in the debug log file have PHI in them that I cannot share. It'll take me a lot of time to de-identify and validate it. I am attaching the stack trace from where the error occurred. Let me know if this is enough.

Traceback (most recent call last): File "bcbio_nextgen/1.1.5/bin/bcbio_nextgen.py", line 238, in main(kwargs) File "bcbio_nextgen/1.1.5/bin/bcbio_nextgen.py", line 46, in main run_main(kwargs) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 53, in run_main fc_dir, run_info_yaml) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 89, in _run_toplevel for xs in pipeline(config, run_info_yaml, parallel, dirs, samples): File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 152, in variant2pipeline samples = genotype.parallel_variantcall_region(samples, run_parallel) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 208, in parallel_variantcall_region "vrn_file", ["region", "sam_ref", "config"])) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/distributed/split.py", line 35, in grouped_parallel_split_combine final_output = parallel_fn(parallel_name, split_args) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipython.py", line 137, in run for data in view.map_sync(fn, items, track=False): File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 344, in map_sync return self.map(f,*sequences,kwargs) File "<bcbio_1.1.5/anaconda/lib/python3.6/site-packages/decorator.py:decorator-gen-140>", line 2, in map File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 52, in sync_results ret = f(self, *args, *kwargs) File "<bcbio_1.1.5/anaconda/lib/python3.6/site-packages/decorator.py:decorator-gen-139>", line 2, in map File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 37, in save_ids ret = f(self, args, kwargs) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 1114, in map return pf.map(sequences) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/remotefunction.py", line 299, in map return self(sequences, ipp_mapping=True) File "<bcbio_1.1.5/anaconda/lib/python3.6/site-packages/decorator.py:decorator-gen-122>", line 2, in call File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/remotefunction.py", line 80, in sync_view_results return f(self, *args, **kwargs) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/remotefunction.py", line 285, in call__ return r.get() File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/asyncresult.py", line 169, in get raise self.exception() File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/asyncresult.py", line 228, in _resolve_result results = error.collect_exceptions(results, self._fname) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/error.py", line 233, in collect_exceptions raise e File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/error.py", line 231, in collect_exceptions raise CompositeError(msg, elist) ipyparallel.error.CompositeError: one or more exceptions from call to method: variantcall_sample [20:apply]: CalledProcessError: Command 'set -o pipefail; unset JAVA_HOME && export PATH=bcbio_1.1.5/anaconda/bin:"$PATH" && gatk --java-options '-Xms681m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm' Mutect2 --annotation ClippingRankSumTest --annotation DepthPerSampleHC --reference bcbio_1.1.5/genomes/Hsapiens/hs37d5/seq/hs37d5.fa --annotation MappingQualityRankSumTest --annotation MappingQualityZero --annotation QualByDepth --annotation ReadPosRankSumTest --annotation RMSMappingQuality --annotation FisherStrand --annotation MappingQuality --annotation DepthPerAlleleBySample --annotation Coverage --read-validation-stringency LENIENT -I bcbio_20180410/work_novoalign/align/Tumor_Sample/Tumor-sort.bam --tumor-sample TUMOR_SAMPLE -L bcbio_20180410/work_novoalign/mutect2/13/Tumor-regions.bed --interval-set-rule INTERSECTION -O bcbio_20180410/work_novoalign/mutect2/13/Tumor-raw.vcf.gz && sed -i 's/callable\t0.0/callable\t1.0/' bcbio_20180410/work_novoalign/mutect2/13/Tumor-raw.vcf.gz. stats && unset JAVA_HOME && export PATH=bcbio_1.1.5/anaconda/bin:"$PATH" && gatk --java-options '-Xms681m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm' FilterMutectCalls --reference bcbio_1.1.5/genomes/Hsapiens/hs37d5/seq/hs37d5.fa --variant bcbio_20180410/work_novoalign/mutect2/13/Tumor-raw.vcf.gz --output bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm/Tumor-raw-filt.vcf.gz

roryk commented 5 years ago

Thanks, are there errors further up in the debug log? This is the generic “this command failed” error, the root cause should be further up.

On Tue, Jun 11, 2019 at 5:15 PM Moiz Bootwalla notifications@github.com wrote:

Hi Rory,

The sample identifiers in the debug log file have PHI in them that I cannot share. It'll take me a lot of time to de-identify and validate it. I am attaching the stack trace from where the error occurred. Let me know if this is enough.

Traceback (most recent call last): File "bcbio_nextgen/1.1.5/bin/bcbio_nextgen.py", line 238, in main(kwargs) File "bcbio_nextgen/1.1.5/bin/bcbio_nextgen.py", line 46, in main run_main(kwargs) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 53, in run_main fc_dir, run_info_yaml) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 89, in _run_toplevel for xs in pipeline(config, run_info_yaml, parallel, dirs, samples): File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 152, in variant2pipeline samples = genotype.parallel_variantcall_region(samples, run_parallel) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 208, in parallel_variantcall_region "vrn_file", ["region", "sam_ref", "config"])) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/distributed/split.py", line 35, in grouped_parallel_split_combine final_output = parallel_fn(parallel_name, split_args) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipython.py", line 137, in run for data in view.map_sync(fn, items, track=False): File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 344, in map_sync return self.map(f,*sequences,kwargs) File "<bcbio_1.1.5/anaconda/lib/python3.6/site-packages/decorator.py:decorator-gen-140>", line 2, in map File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 52, in sync_results ret = f(self, *args, *kwargs) File "<bcbio_1.1.5/anaconda/lib/python3.6/site-packages/decorator.py:decorator-gen-139>", line 2, in map File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 37, in save_ids ret = f(self, args, kwargs) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/view.py", line 1114, in map return pf.map(sequences) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/remotefunction.py", line 299, in map return self(sequences, __ipp_mapping=True) File "<bcbio_1.1.5/anaconda/lib/python3.6/site-packages/decorator.py:decorator-gen-122>", line 2, in call File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/remotefunction.py", line 80, in sync_view_results return f(self, *args, *kwargs) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/remotefunction.py", line 285, in call* return r.get() File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/asyncresult.py", line 169, in get raise self.exception() File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/client/asyncresult.py", line 228, in _resolve_result results = error.collect_exceptions(results, self._fname) File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/error.py", line 233, in collect_exceptions raise e File "bcbio_1.1.5/anaconda/lib/python3.6/site-packages/ipyparallel/error.py", line 231, in collect_exceptions raise CompositeError(msg, elist) ipyparallel.error.CompositeError: one or more exceptions from call to method: variantcall_sample [20:apply]: CalledProcessError: Command 'set -o pipefail; unset JAVA_HOME && export PATH=bcbio_1.1.5/anaconda/bin:"$PATH" && gatk --java-options '-Xms681m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm' Mutect2 --annotation ClippingRankSumTest --annotation DepthPerSampleHC --reference bcbio_1.1.5/genomes/Hsapiens/hs37d5/seq/hs37d5.fa --annotation MappingQualityRankSumTest --annotation MappingQualityZero --annotation QualByDepth --annotation ReadPosRankSumTest --annotation RMSMappingQuality --annotation FisherStrand --annotation MappingQuality --annotation DepthPerAlleleBySample --annotation Coverage --read-validation-stringency LENIENT -I bcbio_20180410/work_novoalign/align/Tumor_Sample/Tumor-sort.bam --tumor-sample TUMOR_SAMPLE -L bcbio_20180410/work_novoalign/mutect2/13/Tumor-regions.bed --interval-set-rule INTERSECTION -O bcbio_20180410/work_novoalign/mutect2/13/Tumor-raw.vcf.gz && sed -i 's/callable\t0.0/callable\t1.0/' bcbio_20180410/work_novoalign/mutect2/13/Tumor-raw.vcf.gz. stats && unset JAVA_HOME && export PATH=bcbio_1.1.5/anaconda/bin:"$PATH" && gatk --java-options '-Xms681m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm' FilterMutectCalls --reference bcbio_1.1.5/genomes/Hsapiens/hs37d5/seq/hs37d5.fa --variant bcbio_20180410/work_novoalign/mutect2/13/Tumor-raw.vcf.gz --output bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm/Tumor-raw-filt.vcf.gz

— You are receiving this because you modified the open/close state.

Reply to this email directly, view it on GitHub https://github.com/bcbio/bcbio-nextgen/issues/2829?email_source=notifications&email_token=AADFG6X7K3UOU64ETECICZTP2AIX5A5CNFSM4HOTY5K2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODXOQZBQ#issuecomment-501025926, or mute the thread https://github.com/notifications/unsubscribe-auth/AADFG6UQ2ZBJPXZV6JY4RD3P2AIX5ANCNFSM4HOTY5KQ .

mbootwalla commented 5 years ago

Hi Rory,

Is this what you were looking for -

[June 5, 2019 2:50:08 PM PDT] org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls done. Elapsed time: 0.09 minutes. Runtime.totalMemory()=691404800 java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN at org.broadinstitute.hellbender.utils.NaturalLogUtils.logSumExp(NaturalLogUtils.java:84) at org.broadinstitute.hellbender.utils.NaturalLogUtils.normalizeLog(NaturalLogUtils.java:51) at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.clusterProbabilities(SomaticClusteringModel.java:203) at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.learnAndClearAccumulatedData(SomaticClusteringModel.java:122) at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.learnParameters(Mutect2FilteringEngine.java:156) at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.afterNthPass(FilterMutectCalls.java:151) at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:44) at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1039) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210) at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162) at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205) at org.broadinstitute.hellbender.Main.main(Main.java:291) Using GATK jar bcbio_1.1.5/anaconda/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xms681m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm -jar bcbio_1.1.5/anaconda/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar FilterMutectCalls --reference bcbio_1.1.5/genomes/Hsapiens/hs37d5/seq/hs37d5.fa --variant bcbio_20180410/work_novoalign/mutect2/13/Tumor-raw.vcf.gz --output bcbio_20180410/work_novoalign/bcbiotx/tmp3gkzrbvm/Tumor-raw-filt.vcf.gz ' returned non-zero exit status 3.

mbootwalla commented 5 years ago

@roryk any updates on this issue? Kindly let me know if you need any more details.

chapmanb commented 5 years ago

Moiz; Thanks for following up on this. I was able to reproduce and it looks like the same error can happen in some cases with the callable reads in the stats file set at 1 (which was our workaround for 0 reads). Increasing to a bigger number seems to avoid it and the latest development now does this to try and work around the issue more generally. If you update to the latest development version of bcbio, remove the problematic Tumor-raw.vcf.gz (and .stats) files and re-run it should hopefully work cleanly now. Please let us know if you are still stuck. Thanks again for the help debugging.

mbootwalla commented 5 years ago

Hi Brad,

I tested this and confirm that it works fine now. I will let you know if this issue crops up again.

Thank you so much for fixing this!

roryk commented 5 years ago

Thank you for following up Moiz!

mbootwalla commented 5 years ago

Hello @chapmanb and @roryk . I have run into this issue again. Not sure what the issue is now. Kindly let me know if you guys have any ideas or if you need files to test this out.

Thanks,

Moiz

chapmanb commented 5 years ago

Moiz; Thanks for the report and apologies about the continued problems. I still have no clue about what causes the underlying issue and allows MuTect2 to make calls but report zero callable reads, but tried to keep going with the less than perfect solution we've been using and set it to a higher number. Hope this gets it working now and avoids the problem. Thanks again for your patience with this.