Open: fleharty opened this issue 4 years ago
This is probably complicated by a bug in the htsjdk warning from previous versions, which should be fixed in the latest master now. There's probably still a bug, but the error will be more informative now.
There may be a ploidy-related bug since the somatic genotypes are a little funky that way. I don't like the fact that this is calling a biallelic method.
@fleharty if you still care about this, can you run it again with the latest master?
Hi everyone. I am trying to run GATK 4.2.5.0 VariantAnnotator with gnomAD data, but I get this error message: java.lang.IllegalStateException: Allele in genotype C not in the variant context [C*, CT]. Can you advise on what's going on?
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx30G -jar /run/media/riadh/One Touch1/Analysis/gatk-4.2.4.1/gatk-package-4.2.5.0-local.jar VariantAnnotator -V PE69_chr3.vcf -R /run/media/riadh/One Touch/Reference_data_b38/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta --resource:gnomad /run/media/riadh/One Touch/Reference_data_b38/gnomad.genomes.v3.1.2.sites.chr3.vcf.bgz -E gnomad.nhomalt -E gnomad.ALT -E gnomad.AF -O PE69_ch3_vep_cadd_gnomad.vcf --resource-allele-concordance
10:58:19.715 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/run/media/riadh/One%20Touch1/Analysis/gatk-4.2.4.1/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 17, 2022 10:58:19 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
10:58:19.796 INFO VariantAnnotator - ------------------------------------------------------------
10:58:19.796 INFO VariantAnnotator - The Genome Analysis Toolkit (GATK) v4.2.5.0
10:58:19.796 INFO VariantAnnotator - For support and documentation go to https://software.broadinstitute.org/gatk/
10:58:19.797 INFO VariantAnnotator - Executing as riadh@ikm-unix-1012.uio.no on Linux v5.16.12-200.fc35.x86_64 amd64
10:58:19.797 INFO VariantAnnotator - Java runtime: OpenJDK 64-Bit Server VM v11.0.14.1+1
10:58:19.797 INFO VariantAnnotator - Start Date/Time: March 17, 2022 at 10:58:19 AM CET
10:58:19.797 INFO VariantAnnotator - ------------------------------------------------------------
10:58:19.797 INFO VariantAnnotator - ------------------------------------------------------------
10:58:19.797 INFO VariantAnnotator - HTSJDK Version: 2.24.1
10:58:19.797 INFO VariantAnnotator - Picard Version: 2.25.4
10:58:19.798 INFO VariantAnnotator - Built for Spark Version: 2.4.5
10:58:19.798 INFO VariantAnnotator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:58:19.798 INFO VariantAnnotator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:58:19.798 INFO VariantAnnotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:58:19.798 INFO VariantAnnotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:58:19.798 INFO VariantAnnotator - Deflater: IntelDeflater
10:58:19.798 INFO VariantAnnotator - Inflater: IntelInflater
10:58:19.798 INFO VariantAnnotator - GCS max retries/reopens: 20
10:58:19.798 INFO VariantAnnotator - Requester pays: disabled
10:58:19.798 INFO VariantAnnotator - Initializing engine
10:58:19.942 INFO FeatureManager - Using codec VCFCodec to read file file:///run/media/riadh/One%20Touch/Reference_data_b38/gnomad.genomes.v3.1.2.sites.chr3.vcf.bgz
10:58:19.971 INFO FeatureManager - Using codec VCFCodec to read file file:///run/media/riadh/My%20Book_From%20Eiklid/Analysis/gatk-4.2.4.1/ensembl-vep/PE69_chr3.vcf
10:58:20.063 INFO VariantAnnotator - Done initializing engine
10:58:20.091 WARN VariantAnnotatorEngine - The requested expression attribute "gnomad.ALT" is missing from the header in its resource file gnomad
10:58:20.140 INFO ProgressMeter - Starting traversal
10:58:20.140 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
10:58:42.160 INFO VariantAnnotator - Shutting down engine
[March 17, 2022 at 10:58:42 AM CET] org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotator done. Elapsed time: 0.37 minutes.
Runtime.totalMemory()=17158897664
java.lang.IllegalStateException: Allele in genotype C not in the variant context [C*, CT]
at htsjdk.variant.variantcontext.VariantContext$Validation.validateGenotypes(VariantContext.java:382)
at htsjdk.variant.variantcontext.VariantContext$Validation.access$200(VariantContext.java:323)
at htsjdk.variant.variantcontext.VariantContext$Validation$2.validate(VariantContext.java:331)
at htsjdk.variant.variantcontext.VariantContext.lambda$validate$0(VariantContext.java:1384)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at htsjdk.variant.variantcontext.VariantContext.validate(VariantContext.java:1384)
at htsjdk.variant.variantcontext.VariantContext.
Update: this issue is still happening. User ran GATK 4.4: https://gatk.broadinstitute.org/hc/en-us/community/posts/15706942393371-Error-when-running-VariantAnnotator
Here is a PR to deploy a bugfix for a similar issue in HaplotypeCaller. https://github.com/broadinstitute/gatk/pull/5365
Bug Report
Affected tool(s) or class(es)
VariantAnnotator
Affected version(s)
Description
Throws an exception on a legal variant.
java.lang.IllegalStateException: Allele in genotype G not in the variant context [G*, G, GT]
at htsjdk.variant.variantcontext.VariantContext$Validation.validateGenotypes(VariantContext.java:382)
at htsjdk.variant.variantcontext.VariantContext$Validation.access$200(VariantContext.java:323)
at htsjdk.variant.variantcontext.VariantContext$Validation$2.validate(VariantContext.java:331)
at htsjdk.variant.variantcontext.VariantContext.lambda$validate$0(VariantContext.java:1384)
at java.lang.Iterable.forEach(Iterable.java:75)
at htsjdk.variant.variantcontext.VariantContext.validate(VariantContext.java:1384)
at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:489)
at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:647)
at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:638)
at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.trimAlleles(GATKVariantContextUtils.java:1329)
at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.trimAlleles(GATKVariantContextUtils.java:1285)
at org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine.getMinRepresentationBiallelics(VariantAnnotatorEngine.java:499)
at org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine.annotateExpressions(VariantAnnotatorEngine.java:440)
at org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:285)
at org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotator.apply(VariantAnnotator.java:230)
at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
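One possible reading of the message is that the genotype carries an allele with the right bases but the wrong reference status, so it no longer matches any allele in the context. The sketch below is a toy model of that kind of check, not htsjdk's code: the `Allele` class and `validate_genotype` function are hypothetical names, and the rule that a reference allele and a non-reference allele with the same bases compare as different is an assumption of the model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Allele:
    """Toy allele: bases plus a reference flag (printed with a trailing '*')."""
    bases: str
    is_ref: bool = False

    def __str__(self) -> str:
        return self.bases + ("*" if self.is_ref else "")


def validate_genotype(context_alleles: List[Allele],
                      genotype_alleles: List[Allele]) -> None:
    """Raise if a genotype allele is absent from the context.

    Equality is the dataclass default, so it compares the bases AND the
    reference flag. That strictness is an assumption of this model.
    """
    for allele in genotype_alleles:
        if allele not in context_alleles:
            listing = ", ".join(str(a) for a in context_alleles)
            raise ValueError(
                f"Allele in genotype {allele} not in the variant context [{listing}]"
            )


context = [Allele("C", is_ref=True), Allele("CT")]

# A genotype built from the context's own alleles passes.
validate_genotype(context, [Allele("C", is_ref=True), Allele("CT")])

# A genotype whose C has lost its reference flag fails.
try:
    validate_genotype(context, [Allele("C"), Allele("CT")])
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

If this reading applies, the thing to check is whether the genotype alleles are rebuilt consistently when the allele list is trimmed, since the trace shows the exception being raised from `trimAlleles`.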
I realize this is an open source project, but I've made a copy of the failing VCF available at: /dsde/working/fleharty/tmp/buggy.vcf /dsde/working/fleharty/tmp/buggy.vcf.idx
Steps to reproduce
gatk VariantAnnotator -V buggy.vcf --resource:gnomad af-only-gnomad.raw.sites.vcf -E gnomad.AF --resource-allele-concordance -O gnomad_annotated.vcf
Expected behavior
VariantAnnotator should annotate buggy.vcf with gnomad.AF and write gnomad_annotated.vcf without throwing.
Actual behavior
Throws the java.lang.IllegalStateException shown in the description.