broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

IllegalArgumentException: beta must be greater than 0 in FilterMutectCalls #6202

Open bshifaw opened 5 years ago

bshifaw commented 5 years ago

FilterMutectCalls failed with

java.lang.IllegalArgumentException: beta must be greater than 0 but got -87566.7500301585

"This error only comes after the first pass of FilterMutectCalls completed."

ValidateVariants shows no errors when run on the VCF. "The stats file was created by Mutect2 for each shard and then joined with MergeMutectStats. Similarly, the read orientation model was built with the f1r2 files from all shards."

@davidbenjamin


Hi there,

I have a simulated dataset of related samples (10 tumor samples, WGS at 130x) and am currently running Mutect2 on it. I managed to run everything through, but now FilterMutectCalls crashes after the first pass through the variants with:

[October 1, 2019 12:16:16 PM UTC] org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls done. Elapsed time: 370.68 minutes.
Runtime.totalMemory()=20597702656
java.lang.IllegalArgumentException: beta must be greater than 0 but got -87566.7500301585
        at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:724)
        at org.broadinstitute.hellbender.tools.walkers.readorientation.BetaDistributionShape.<init>(BetaDistributionShape.java:14)
        at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.BinomialCluster.getFuzzyBinomial(BinomialCluster.java:42)
        at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.BinomialCluster.learn(BinomialCluster.java:33)
        at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.lambda$learnAndClearAccumulatedData$7(SomaticClusteringModel.java:131)
        at org.broadinstitute.hellbender.utils.IndexRange.forEach(IndexRange.java:116)
        at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.learnAndClearAccumulatedData(SomaticClusteringModel.java:131)
        at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.learnParameters(Mutect2FilteringEngine.java:156)
        at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.afterNthPass(FilterMutectCalls.java:151)
        at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:44)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1039)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
        at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /gatk/gatk-package-4.1.2.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.2.0-local.jar FilterMutectCalls --contamination-table /cromwell-executions/sebastian/f3c8dc32-7754-42c3-b0a7-9d667904c2e5/call-filtering/inputs/1230118915/generated-3208ebe8-e3ef-11e9-91de-005056b01e3e --tumor-segmentation /cromwell-executions/sebastian/f3c8dc32-7754-42c3-b0a7-9d667904c2e5/call-filtering/inputs/1230118915/generated-3208e648-e3ef-11e9-91de-005056b01e3e --stats /cromwell-executions/sebastian/f3c8dc32-7754-42c3-b0a7-9d667904c2e5/call-filtering/inputs/61973814/generated-58e77956-db7e-11e9-9da2-005056b01e3e.txt --orientation-bias-artifact-priors /cromwell-executions/sebastian/f3c8dc32-7754-42c3-b0a7-9d667904c2e5/call-filtering/inputs/-1768654832/generated-58e75278-db7e-11e9-9da2-005056b01e3e.tar.gz -V /cromwell-executions/sebastian/f3c8dc32-7754-42c3-b0a7-9d667904c2e5/call-filtering/inputs/164276910/generated-58e6fad0-db7e-11e9-9da2-005056b01e3e.vcf.gz -R /cromwell-executions/sebastian/f3c8dc32-7754-42c3-b0a7-9d667904c2e5/call-filtering/inputs/1500471319/human_g1k_v37.fasta -O generated-32095ea2-e3ef-11e9-91de-005056b01e3e.vcf.gz

I do not have any idea how to work around this. Any suggestions?

This Issue was generated from your forums

davidbenjamin commented 5 years ago

Thanks @bshifaw! I asked the user to try the new release. If that fails I will debug ASAP.

SebastianHollizeck commented 5 years ago

Hey, just wondering if there have been any updates on my problem.

davidbenjamin commented 4 years ago

@SebastianHollizeck I think #6337 fixes this. Could you try re-running FilterMutectCalls using this jar: gs://broad-dsde-methods-davidben/gatk-builds/clustering.jar

SebastianHollizeck commented 4 years ago

@davidbenjamin I tried, and this time it's a different error.

14:55:53.232 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/shollizeck/clustering.jar!/com/intel/gkl/native/libgkl_compression.so
Jan 09, 2020 2:55:53 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
14:55:53.432 INFO  FilterMutectCalls - ------------------------------------------------------------
14:55:53.433 INFO  FilterMutectCalls - The Genome Analysis Toolkit (GATK) v4.1.4.1-6-g6bb31a7-SNAPSHOT
14:55:53.433 INFO  FilterMutectCalls - For support and documentation go to https://software.broadinstitute.org/gatk/
14:55:53.433 INFO  FilterMutectCalls - Executing as shollizeck@stpr-res-compute02.unix.petermac.org.au on Linux v3.10.0-1062.4.3.el7.x86_64 amd64
14:55:53.433 INFO  FilterMutectCalls - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_232-b09
14:55:53.434 INFO  FilterMutectCalls - Start Date/Time: 9 January 2020 2:55:53 PM
14:55:53.434 INFO  FilterMutectCalls - ------------------------------------------------------------
14:55:53.434 INFO  FilterMutectCalls - ------------------------------------------------------------
14:55:53.434 INFO  FilterMutectCalls - HTSJDK Version: 2.21.0
14:55:53.435 INFO  FilterMutectCalls - Picard Version: 2.21.2
14:55:53.435 INFO  FilterMutectCalls - HTSJDK Defaults.COMPRESSION_LEVEL : 2
14:55:53.435 INFO  FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
14:55:53.435 INFO  FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
14:55:53.435 INFO  FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
14:55:53.435 INFO  FilterMutectCalls - Deflater: IntelDeflater
14:55:53.435 INFO  FilterMutectCalls - Inflater: IntelInflater
14:55:53.435 INFO  FilterMutectCalls - GCS max retries/reopens: 20
14:55:53.435 INFO  FilterMutectCalls - Requester pays: disabled
14:55:53.436 INFO  FilterMutectCalls - Initializing engine
14:55:53.835 INFO  FeatureManager - Using codec VCFCodec to read file file:///home/shollizeck/SHollizeckUploadGATK/variants.vcf.gz
14:55:54.068 INFO  FilterMutectCalls - Done initializing engine
14:55:54.743 INFO  IOUtils - Extracting data from archive: file:///home/shollizeck/SHollizeckUploadGATK/orientationBiasArtifactPriors.tar.gz
14:55:54.765 INFO  IOUtils - Extracting file: ./b_mutated.orientation_priors
14:55:54.767 INFO  IOUtils - Extracting file: ./g_mutated.orientation_priors
14:55:54.767 INFO  IOUtils - Extracting file: ./c_mutated.orientation_priors
14:55:54.768 INFO  IOUtils - Extracting file: ./f_mutated.orientation_priors
14:55:54.769 INFO  IOUtils - Extracting file: ./i_mutated.orientation_priors
14:55:54.769 INFO  IOUtils - Extracting file: ./d_mutated.orientation_priors
14:55:54.770 INFO  IOUtils - Extracting file: ./e_mutated.orientation_priors
14:55:54.770 INFO  IOUtils - Extracting file: ./a_mutated.orientation_priors
14:55:54.771 INFO  IOUtils - Extracting file: ./h_mutated.orientation_priors
14:55:54.771 INFO  IOUtils - Extracting file: ./j_mutated.orientation_priors
14:55:54.855 INFO  ProgressMeter - Starting traversal
14:55:54.856 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
14:55:54.857 INFO  FilterMutectCalls - Starting pass 0 through the variants
14:56:05.368 INFO  ProgressMeter -            1:2019484              0.2                 16000          91332.9
14:56:15.521 INFO  ProgressMeter -            1:4008750              0.3                 35000         101621.1
14:56:26.027 INFO  ProgressMeter -            1:5856032              0.5                 55000         105867.6
...
19:37:05.295 INFO  ProgressMeter -     GL000209.1:48811            281.2              30739000         109323.8
19:37:15.543 INFO  ProgressMeter -     GL000224.1:65537            281.3              30758000         109324.9
19:37:25.847 INFO  ProgressMeter -     GL000248.1:21736            281.5              30768000         109293.8
19:37:25.906 INFO  FilterMutectCalls - Finished pass 0 through the variants
19:50:04.590 INFO  FilterMutectCalls - Shutting down engine
[9 January 2020 7:50:04 PM] org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls done. Elapsed time: 294.19 minutes.
Runtime.totalMemory()=14966849536
java.lang.IllegalArgumentException: Values in probability array sum to a negative number NaN
    at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:731)
    at org.broadinstitute.hellbender.utils.MathUtils.normalizeSumToOne(MathUtils.java:731)
    at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.performEMIteration(SomaticClusteringModel.java:336)
    at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.learnAndClearAccumulatedData(SomaticClusteringModel.java:306)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.learnParameters(Mutect2FilteringEngine.java:158)
    at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.afterNthPass(FilterMutectCalls.java:159)
    at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:44)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
    at org.broadinstitute.hellbender.Main.main(Main.java:292)
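
The wording "sum to a negative number NaN" is what one would expect if the check is written as "sum >= 0": NaN fails every ordered comparison, so it is rejected and reported as negative. The Python sketch below illustrates that behavior only. The function is a hypothetical stand-in, and the form of the check is an assumption rather than the GATK source.

```python
def normalize_sum_to_one(values):
    """Hypothetical stand-in for a normalize-to-one helper; not GATK code."""
    total = sum(values)
    # Assumed guard: NaN fails every ordered comparison, so "total >= 0"
    # is False for NaN and the message labels it as negative.
    if not total >= 0:
        raise ValueError(
            f"Values in probability array sum to a negative number {total}"
        )
    if total == 0:
        raise ValueError("Values in probability array sum to zero")
    return [v / total for v in values]
```

A single NaN weight anywhere in the input is enough to trigger this, which points to NaN being produced before normalization rather than by it.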

davidbenjamin commented 4 years ago

@SebastianHollizeck Are you able to share the unfiltered VCF and .vcf.stats file from Mutect2 that caused this error?

davidbenjamin commented 4 years ago

@SebastianHollizeck Never mind; I see you already shared it in the forum thread.

davidbenjamin commented 4 years ago

@SebastianHollizeck I believe the bug is not in FilterMutectCalls but upstream in LearnReadOrientationModel in the edge case of 3-base contexts that have no data in some of the samples. It's strange because we have an integration test for this already, and I would appreciate getting your input files to LearnReadOrientationModel for debugging.

I think the following quick fix will work: untar your artifact priors, delete all but sample b, and re-tar, then run FilterMutectCalls as before.
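
The quick fix above could be scripted roughly as follows. This is a minimal sketch using Python's standard tarfile module. It assumes that "sample b" corresponds to the archive member b_mutated.orientation_priors seen in the extraction log, and the function name and output path are hypothetical.

```python
import tarfile
from pathlib import PurePosixPath


def keep_only_members(src_tar, dst_tar, keep_names):
    """Copy only the regular files whose base name is in keep_names.

    Writes a new gzipped tar at dst_tar and returns the kept member names.
    """
    kept = []
    with tarfile.open(src_tar, "r:gz") as src, tarfile.open(dst_tar, "w:gz") as dst:
        for member in src.getmembers():
            if not member.isfile():
                continue  # skip directory entries such as "./"
            # Members may be stored as "./name", so compare base names only.
            if PurePosixPath(member.name).name in keep_names:
                dst.addfile(member, src.extractfile(member))
                kept.append(member.name)
    return kept


# Example call (file names are assumptions; adjust to your own paths):
# keep_only_members(
#     "orientationBiasArtifactPriors.tar.gz",
#     "orientationBiasArtifactPriors.b_only.tar.gz",
#     {"b_mutated.orientation_priors"},
# )
```

The new archive would then be passed to FilterMutectCalls via --orientation-bias-artifact-priors in place of the original one. Writing to a separate file keeps the original priors intact for debugging.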

Is there a reason why all samples except b have very little data, and have no data at all for most 3-base contexts? To be clear, we want to fix the bug even if the data are weird, but I want to double-check that this is expected.

gbrandt6 commented 4 years ago

A user wrote in with a similar issue in the GATK forum. Running FilterMutectCalls after a successful Mutect2 run, they got the error java.lang.IllegalArgumentException: alpha must be greater than 0 but got NaN. Here is the post: https://gatk.broadinstitute.org/hc/en-us/community/posts/360073336352-Error-running-FilterMutectCalls-alpha-must-be-greater-than-0-but-got-NaN

@davidbenjamin do you know if these are related?

davidbenjamin commented 3 years ago

@gbrandt6 This is a different issue. It looks similar but it occurs in a different part of filtering.