broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk
Other
1.69k stars 588 forks source link

FilterMutectCalls: errorRate must be good probability but got NaN #5821

Closed igordot closed 5 years ago

igordot commented 5 years ago

I ran GATK 4.1.0.0 Mutect2 on a small (~1Mb) targeted panel. I am using a normal control that is not the same individual (basically to exclude technical artifacts), so I do expect to see more variants than with a proper matched normal. I was getting around 100-300 variants per sample with GATK 4.0.6.0. I am still roughly in the same range for some samples GATK 4.1.0.0, but I am getting 0 for some.

The problem seems to be at the FilterMutectCalls stage where I am seeing the following error:

[March 19, 2019 10:43:17 PM EDT] org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=8851030016
java.lang.IllegalArgumentException: errorRate must be good probability but got NaN
    at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:730)
    at org.broadinstitute.hellbender.utils.QualityUtils.errorProbToQual(QualityUtils.java:227)
    at org.broadinstitute.hellbender.utils.QualityUtils.errorProbToQual(QualityUtils.java:211)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2FilteringEngine.applyContaminationFilter(Mutect2FilteringEngine.java:79)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2FilteringEngine.calculateFilters(Mutect2FilteringEngine.java:518)
    at org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls.firstPassApply(FilterMutectCalls.java:130)
    at org.broadinstitute.hellbender.engine.TwoPassVariantWalker.lambda$traverseVariants$0(TwoPassVariantWalker.java:76)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at org.broadinstitute.hellbender.engine.TwoPassVariantWalker.traverseVariants(TwoPassVariantWalker.java:74)
    at org.broadinstitute.hellbender.engine.TwoPassVariantWalker.traverse(TwoPassVariantWalker.java:27)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)

I see there was a previous similar issue (https://github.com/broadinstitute/gatk/issues/5553), but that one was apparently resolved. Curiously, I used several versions of GATK 4 and it never came up for me before.

davidbenjamin commented 5 years ago

@igordot I believe this is fixed in master and will be in the 4.1.1.0 release this Thursday.

igordot commented 5 years ago

I tried 4.1.1.0. Although that error is fixed, now I am getting a new one:

java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN
    at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:1023)
    at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:995)
    at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:999)
    at org.broadinstitute.hellbender.utils.MathUtils.normalizeLog10(MathUtils.java:1098)
    at org.broadinstitute.hellbender.utils.MathUtils.normalizeFromLog10ToLinearSpace(MathUtils.java:1074)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.posteriorProbabilityOfError(Mutect2FilteringEngine.java:91)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.posteriorProbabilityOfError(Mutect2FilteringEngine.java:76)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ContaminationFilter.calculateErrorProbability(ContaminationFilter.java:60)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
    at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
    at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:136)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:140)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:31)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:68)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:66)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:31)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:984)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)
davidbenjamin commented 5 years ago

@igordot Could you provide your command line? Also, could you check whether the error persists when you use a panel of normals, or nothing at all, instead of the unmatched normal?

davidbenjamin commented 5 years ago

Also, is this hg38?

igordot commented 5 years ago

This is the command:

/path/gatk-4.1.1.0/gatk --java-options "-Xms8G -Xms8G" FilterMutectCalls \
--verbosity WARNING \
--reference /path/ref/hg38/genome.fa \
--unique-alt-read-count 5 \
--variant /path/SERACARE-NS18-16-HAPMAP-NS18-17.unfiltered.vcf \
--contamination-table /path/SERACARE-NS18-16-HAPMAP-NS18-17.contamination.txt \
--output /path/SERACARE-NS18-16-HAPMAP-NS18-17.original.vcf.gz  

And yes, this is hg38. Do you think it's related to that?

davidbenjamin commented 5 years ago

A few recent bugs, which are all entirely my fault, came about because the liftover of gnomAD to hg38 (there is no official hg38 gnomAD yet) exposed some new edge cases, such as AF=. and AF=0, that caused errors. I suspect that you are seeing one that we hadn't found yet. Unfortunately, we do not have nearly validation on hg38.

Here's what I will do: 1) correct our hg38 gnomAD to fix liftover artifacts and put this new resource in the GATK bucket. 2) Create a Firecloud workspace with a few hg38 samples in order to reproduce the error and to make sure future changes don't create new problems 3) try to fix the error because even if 1) works it's sloppy to rely on the fact that gnomAD won't have these edge cases. I hope 1) succeeds because it will be available immediately without waiting for the next release.

igordot commented 5 years ago

Thank you for looking into it. I am curious if that is really the problem. If the reference files were causing problems, shouldn't that impact all samples? I am seeing this error with some, but not most of the samples. Even using a different matched control sample with the same tumor sample will cause or fix the error.

davidbenjamin commented 5 years ago

What we saw recently wasn't the reference itself, but rather our AF-only gnomAD resource lifted-over to the hg38 reference. The error only came up for sites that reached genotyping, which depends on the specific tumor sample as well as the lack of evidence in the normal. That's why it only appeared in some tumor-normal combinations.

That being said, it might be something else. If you are able to share the unfiltered vcf file and the vcf.stats file it would be the most direct way to debug.

davidbenjamin commented 5 years ago

@igordot I have not yet succeeded in reproducing the error with the few hg38 samples I have tested (2) and nothing obvious showed up in various grep regexes of our hg38 gnomAD (1). I am starting to think that we actually have solved all the hg38 issues and this is unrelated to my initial guess.

If you can share your unfiltered vcf input it would be very helpful, but if that's not possible could you post the contents of your contamination-table input? I have a hunch that the small size of the panel is causing CalculateContamination to give an unreliable result. For targeted panels we recommend running Mutect2 without CalculateContamination. If you're running from Terra/Firecloud or from the wdl, this means leaving the optional variants_for_contamination input empty.

davidbenjamin commented 5 years ago

By the way, I believe the same issue has come up recently on the GATK forum: https://gatkforums.broadinstitute.org/gatk/discussion/23685/issues-on-filtermutectcalls-log10-probability-must-be-0-or-less#latest.

@igordot If your contamination table shows a contamination of 1 or greater don't bother sending the vcf -- contamination would definitely be the issue in that case.

igordot commented 5 years ago

You are right. The error seems to be happening when contamination is 1 or NaN. That is probably due to a non-matched normal. The same panel with a true matched normal gives much more reasonable results (<0.01), so I don't know if the panel size is entirely at fault here.

Should FilterMutectCalls just ignore contamination if the error is sufficiently high?

davidbenjamin commented 5 years ago

@igordot Thanks very much for following up on this. Just to clarify, are you saying that the 1/NaN contamination occurs when you run GetPileupSummaries on both the tumor and a non-matched normal and then run CalculateContamination using both of these outputs? If so, that will definitely cause problems. Running CalculateContamination on just the tumor output from GetPileupSummaries should work much better. We went to great lengths to make CalculateContamination work in tumor-only mode, although I would still be wary if your target territory is less than a few megabases.

Also, I would not recommend using a non-matched normal anywhere in Mutect2. Unless your panel has unique technical artifacts that don't resemble those in an exome I would recommend you run tumor-only mode but use use of the publicly-available panels of normals in the GATK bucket. A worse alternative but in my opinion still better than a non-matched normal would be to run Mutect2 in tumor-only mode separately on the tumor samples and the unmatched normal, then use the unmatched normal vcf as a blacklist for the tumor calls with SelectVariants.

igordot commented 5 years ago

Thank you for the suggestions.

I realize using a non-matched normal is not ideal. I was using it to at least filter out any technical artifacts. It seems to work well for that.

Where do I find the panel of normals that you mentioned?

davidbenjamin commented 5 years ago

@igordot A few panel can be found in the GATK best practices bucket, for example: gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf

igordot commented 5 years ago

How do I access that? I thought that the GATK resources were located here: https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0/

Is there a reason this is not in the GATK resource bundle?

davidbenjamin commented 5 years ago

They're public, so just install Google Cloud gsutil and copy with gsutil cp gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf <local path to copy to>. Or, if you're running on the cloud, you don't even need to download anything, just run Mutect2 with the cloud paths eg

gatk Mutect2 -R ref.fasta -I tumor.bam -pon gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf -O calls.vcf

If you install gsutil this works when running locally as well, but for speed I would recommend downloading the pon.

Is there a reason this is not in the GATK resource bundle?

Not that I can think of.

igordot commented 5 years ago

Is there also a hg38 version?

davidbenjamin commented 5 years ago

There's gs://gatk-best-practices/somatic-hg38/1000g_pon.hg38.vcf.gz for hg38.

MikeWLloyd commented 4 years ago

I think this issue still persists: The Genome Analysis Toolkit (GATK) v4.1.3.0

2019-10-29T18:18:04.000640760Z java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN
2019-10-29T18:18:04.001018863Z  at org.broadinstitute.hellbender.utils.NaturalLogUtils.logSumExp(NaturalLogUtils.java:84)
2019-10-29T18:18:04.001194549Z  at org.broadinstitute.hellbender.utils.NaturalLogUtils.normalizeLog(NaturalLogUtils.java:51)
2019-10-29T18:18:04.001367357Z  at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.clusterProbabilities(SomaticClusteringModel.java:203)
2019-10-29T18:18:04.001518160Z  at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.probabilityOfSequencingError(SomaticClusteringModel.java:96)
2019-10-29T18:18:04.001673083Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.TumorEvidenceFilter.calculateErrorProbability(TumorEvidenceFilter.java:27)
2019-10-29T18:18:04.001846904Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
2019-10-29T18:18:04.002024760Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
2019-10-29T18:18:04.002140012Z  at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
2019-10-29T18:18:04.002232542Z  at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
2019-10-29T18:18:04.002242727Z  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
2019-10-29T18:18:04.002292461Z  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
2019-10-29T18:18:04.002301667Z  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
2019-10-29T18:18:04.002307019Z  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
2019-10-29T18:18:04.002311722Z  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-29T18:18:04.002316449Z  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
2019-10-29T18:18:04.002321526Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
2019-10-29T18:18:04.002358113Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:141)
2019-10-29T18:18:04.002377342Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:146)
2019-10-29T18:18:04.002383406Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)
2019-10-29T18:18:04.002431769Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)
2019-10-29T18:18:04.002441351Z  at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
2019-10-29T18:18:04.002446409Z  at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
2019-10-29T18:18:04.002493533Z  at java.util.Iterator.forEachRemaining(Iterator.java:116)
2019-10-29T18:18:04.002503311Z  at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
2019-10-29T18:18:04.002508016Z  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
2019-10-29T18:18:04.002512520Z  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
2019-10-29T18:18:04.002574562Z  at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
2019-10-29T18:18:04.002625341Z  at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
2019-10-29T18:18:04.002635077Z  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-29T18:18:04.002683298Z  at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
2019-10-29T18:18:04.002692496Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)
2019-10-29T18:18:04.002697751Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)
2019-10-29T18:18:04.002731707Z  at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
2019-10-29T18:18:04.002740306Z  at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
2019-10-29T18:18:04.002745164Z  at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
2019-10-29T18:18:04.002777218Z  at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
2019-10-29T18:18:04.002785268Z  at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
2019-10-29T18:18:04.002855927Z  at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
2019-10-29T18:18:04.002867030Z  at org.broadinstitute.hellbender.Main.main(Main.java:291)

I am using ExAC lifted to hg38 as a germline resource in mutect2 with only a tumor sample, and getting the above error in filtermutectcalls. I recently updated to v4.1.3.0 to have the latest changes to mutect2. I was not having this issue with v4.0.5.1.

Here is extracted information from the VCF which caused the issue.

DP=1;ECNT=2;FS=0.000;MBQ=0,20;MFRL=0,91;MMQ=60,46;MPOS=6;MQ=46.00;POPAF=5.08;TLOD=4.20  GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:0,1:0.667:1:0,1:0,0:0|1:11155815_C_T:11155815:0,0,1,0
DP=1;ECNT=2;FS=0.000;MBQ=0,20;MFRL=0,91;MMQ=60,46;MPOS=16;MQ=46.00;POPAF=5.08;TLOD=4.20 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:0,1:0.667:1:0,1:0,0:0|1:11155815_C_T:11155815:0,0,1,0
DP=1;ECNT=2;FS=0.000;MBQ=0,34;MFRL=0,272;MMQ=60,30;MPOS=25;MQ=30.00;POPAF=4.13;TLOD=4.20    GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:0,1:0.667:1:0,1:0,0:0|1:11350899_C_T:11350899:0,0,1,0
DP=1;ECNT=2;FS=0.000;MBQ=0,32;MFRL=0,272;MMQ=60,30;MPOS=15;MQ=30.00;POPAF=4.23;TLOD=4.20    GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:0,1:0.667:1:0,1:0,0:0|1:11350899_C_T:11350899:0,0,1,0
davidbenjamin commented 4 years ago

@MikeWLloyd can you try version 4.1.4.0?

MikeWLloyd commented 4 years ago

Yes, I will try. I didn't realize 4.1.4.0 was available.

davidbenjamin commented 4 years ago

No worries; fingers crossed.

MikeWLloyd commented 4 years ago

I loaded the docker repo GATK v4.1.4.0 and had the same (or similar) error result.

2019-10-30T13:35:51.791637449Z java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN
2019-10-30T13:35:51.792001654Z  at org.broadinstitute.hellbender.utils.NaturalLogUtils.logSumExp(NaturalLogUtils.java:84)
2019-10-30T13:35:51.792175325Z  at org.broadinstitute.hellbender.utils.NaturalLogUtils.normalizeLog(NaturalLogUtils.java:51)
2019-10-30T13:35:51.792358868Z  at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.clusterProbabilities(SomaticClusteringModel.java:203)
2019-10-30T13:35:51.792559803Z  at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.probabilityOfSequencingError(SomaticClusteringModel.java:96)
2019-10-30T13:35:51.792736667Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.TumorEvidenceFilter.calculateErrorProbability(TumorEvidenceFilter.java:27)
2019-10-30T13:35:51.792905235Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
2019-10-30T13:35:51.793072365Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
2019-10-30T13:35:51.793261944Z  at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
2019-10-30T13:35:51.793456807Z  at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
2019-10-30T13:35:51.793619935Z  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
2019-10-30T13:35:51.793810301Z  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
2019-10-30T13:35:51.794006885Z  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
2019-10-30T13:35:51.794191116Z  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
2019-10-30T13:35:51.794367593Z  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-30T13:35:51.794548129Z  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
2019-10-30T13:35:51.794722501Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
2019-10-30T13:35:51.794896154Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:141)
2019-10-30T13:35:51.795082090Z  at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:146)
2019-10-30T13:35:51.795253632Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)
2019-10-30T13:35:51.795448274Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)
2019-10-30T13:35:51.795607447Z  at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
2019-10-30T13:35:51.795775473Z  at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
2019-10-30T13:35:51.795944490Z  at java.util.Iterator.forEachRemaining(Iterator.java:116)
2019-10-30T13:35:51.796108757Z  at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
2019-10-30T13:35:51.796277399Z  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
2019-10-30T13:35:51.796441683Z  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
2019-10-30T13:35:51.796940319Z  at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
2019-10-30T13:35:51.797119562Z  at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
2019-10-30T13:35:51.797275911Z  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-30T13:35:51.797439525Z  at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
2019-10-30T13:35:51.797567816Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)
2019-10-30T13:35:51.797740910Z  at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)
2019-10-30T13:35:51.797896360Z  at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
2019-10-30T13:35:51.798060735Z  at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
2019-10-30T13:35:51.798212290Z  at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
2019-10-30T13:35:51.798375318Z  at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
2019-10-30T13:35:51.798498807Z  at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
2019-10-30T13:35:51.798643830Z  at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
2019-10-30T13:35:51.798803653Z  at org.broadinstitute.hellbender.Main.main(Main.java:292)
2019-10-30T13:35:51.902177042Z Using GATK jar /gatk/gatk-package-4.1.4.0-local.jar

Contents of *.vcf.stats

statistic   value
callable    0.0

Below is output from: grep -v '#' *.vcf | cut -f 8,9,10,11 | head -n 100. The VCF that failed in this case only has 3 calls. I am running Mutect2 in parallel by chromosome defined by interval files.

DP=1;ECNT=1;FS=0.000;MBQ=0,36;MFRL=0,143;MMQ=60,36;MPOS=24;MQ=36.00;POPAF=5.08;TLOD=3.78    GT:AD:AF:DP:F1R2:F2R1:SB    0/1:0,1:0.667:1:0,0:0,1:0,0,0,1
DP=4;ECNT=2;FS=0.000;MBQ=0,34;MFRL=0,275;MMQ=60,24;MPOS=7;MQ=24.78;POPAF=5.08;TLOD=17.30    GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:0,4:0.833:4:0,1:0,3:0|1:11155815_C_T:11155815:0,0,2,2
DP=4;ECNT=2;FS=0.000;MBQ=0,33;MFRL=0,275;MMQ=60,24;MPOS=17;MQ=24.78;POPAF=5.08;TLOD=17.30   GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:0,4:0.833:4:0,1:0,3:0|1:11155815_C_T:11155815:0,0,2,2
davidbenjamin commented 4 years ago

@MikeWLloyd It seems like you are running Mutect2 in parallel, which is totally fine, and then running FilterMutectCalls in parallel as well, which is not how the tool works. You need to run MergeMutectStats on the .vcf.stats files and run MergeVcfs on the scattered .vcfs, and then run FilterMutectCalls with the merged files as inputs. This is implemented in the mutect2 WDL: https://github.com/broadinstitute/gatk/blob/master/scripts/mutect2_wdl/mutect2.wdl and the featured workspace in Terra: https://app.terra.bio/#workspaces/help-gatk/Somatic-SNVs-Indels-GATK4.

The error seems to occur because the .stats file for the failing interval shows no callable depth. That is, every locus in the interval had a depth less than 10. Once you merge your files I would hope that somewhere in the genome there is a site with greater depth. (If not, you can adjust the threshold with the --callable-depth argument)