zenglab-pku / 2024fall-CTI

computational tumor immunology, 2024 fall, Peking University

Error with gatk Funcotator output #3

Open pkuTrasond opened 3 days ago

pkuTrasond commented 3 days ago

Hi TAs, I encountered a problem when running the gatk Funcotator in the annotation step.

The command line I use is:

gatk Funcotator --data-sources-path /lustre1/share/references/funcotator_dataSources.v1.8.hg38.20230908s \
-O ~/analysis/5_funcotator/OC_funcotator.maf --output-file-format MAF \
-R /lustre1/share/references/hg38.fa \
-V ~/analysis/4_somMut/gatk/OC_filter.vcf --ref-version hg38 \
--intervals /lustre1/share/references/hg38.exon.interval_list

However, the output .maf file contains ONLY COMMENT LINES.

The run finished without reporting an error. Here are some of the warnings and the exception printed by the JVM:

16:07:50.190 INFO  Funcotator - Shutting down engine
[September 23, 2024 at 4:07:50 PM CST] org.broadinstitute.hellbender.tools.funcotator.Funcotator done. Elapsed time: 0.46 minutes.
Runtime.totalMemory()=2533359616
java.lang.IllegalArgumentException: Unexpected value: overlaps_pseudogene
        at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature$FeatureTag.getEnum(GencodeGtfFeature.java:1394)
        at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature.<init>(GencodeGtfFeature.java:197)
        at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfGeneFeature.<init>(GencodeGtfGeneFeature.java:19)
        at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfGeneFeature.create(GencodeGtfGeneFeature.java:23)
        at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature$FeatureType$1.create(GencodeGtfFeature.java:760)
        at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature.create(GencodeGtfFeature.java:327)
        at org.broadinstitute.hellbender.utils.codecs.gtf.AbstractGtfCodec.decode(AbstractGtfCodec.java:138)
        at org.broadinstitute.hellbender.utils.codecs.gtf.AbstractGtfCodec.decode(AbstractGtfCodec.java:23)
        at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.readNextRecord(TribbleIndexedFeatureReader.java:502)
        at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.<init>(TribbleIndexedFeatureReader.java:442)
        at htsjdk.tribble.TribbleIndexedFeatureReader.query(TribbleIndexedFeatureReader.java:298)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.refillQueryCache(FeatureDataSource.java:622)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.queryAndPrefetch(FeatureDataSource.java:591)
        at org.broadinstitute.hellbender.engine.FeatureManager.getFeatures(FeatureManager.java:363)
        at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:173)
        at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.queryFeaturesFromFeatureContext(DataSourceFuncotationFactory.java:314)
        at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.getFeaturesFromFeatureContext(DataSourceFuncotationFactory.java:229)
        at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:207)
        at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:182)
        at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.lambda$createFuncotationMapForVariant$0(FuncotatorEngine.java:152)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
        at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
        at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.createFuncotationMapForVariant(FuncotatorEngine.java:162)
        at org.broadinstitute.hellbender.tools.funcotator.Funcotator.enqueueAndHandleVariant(Funcotator.java:924)
        at org.broadinstitute.hellbender.tools.funcotator.Funcotator.apply(Funcotator.java:878)
        at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
        at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
        at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
        at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)

And

WARNING 2024-09-23 16:07:48     AsciiLineReader Creating an indexable source for an AsciiFeatureCodec using a stream that is neither a PositionalBufferedStream nor a BlockCompressedInputStream
16:14:21.944 WARN  GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 34) (given: 43): ##description: evidence-based annotation of the human genome (GRCh38), version 43 (Ensembl 109)   Continuing, but errors may occur.
16:14:21.945 WARN  GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 34) (given: 43): ##description: evidence-based annotation of the human genome (GRCh38), version 43 (Ensembl 109)   Continuing, but errors may occur.
16:14:21.946 INFO  FeatureManager - Using codec GencodeGtfCodec to read file file:///lustre1/share/references/funcotator_dataSources.v1.8.hg38.20230908s/gencode/hg38/gencode.v43.annotation.REORDERED.gtf
16:14:21.948 WARN  GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 34) (given: 43): ##description: evidence-based annotation of the human genome (GRCh38), version 43 (Ensembl 109)   Continuing, but errors may occur.

I can't figure out what's wrong with this step. The previous step, which produces the filtered .vcf, seems to have run correctly, and the filtered VCF file has content. The ANNOVAR annotation of the same filtered VCF file also ran fine.

Thanks for the help!

zenglab-pku commented 20 hours ago

Hi! Thanks for raising this question.

  1. For the `overlaps_pseudogene` exception, you can refer to: https://gatk.broadinstitute.org/hc/en-us/community/posts/19484785532315-Funcotator-Unexpected-value-Ensembl-canonical In short, this happens because the version of the resource bundle on the server has not been updated. If you want reasonable results when analyzing your own WES data, we recommend installing the newest version of the resource bundle.

  2. The warning is not a problem. GATK was just trying to index the VCF file and is notifying you of that step.

  3. As I mentioned in class, the WES data used here is only a small proportion of the official version. So most of the mutations detected in this case are synonymous mutations that cannot be annotated with any function, right?
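
One way to check the version issue from point 1 yourself is to read the GENCODE release from the GTF header and compare it with the release named in the warning ("maximum tested version (v 34)"). The following is only a minimal sketch on a small stand-in file: the helper names `gencode_version` and `count_tag` and the demo records are illustrative assumptions, not part of GATK, and the header format is assumed to match the line quoted in the log.

```shell
#!/usr/bin/env bash
# Sketch: compare the GENCODE release of a GTF with the release named in the GATK warning.
# Assumption: the header contains a "##description: ... version NN ..." line as in the log.

# Print the GENCODE release number found in the first "##description" header line.
gencode_version() {
    grep -m1 '^##description' "$1" | sed -n 's/.*version \([0-9][0-9]*\).*/\1/p'
}

# Count records that carry a given tag, e.g. overlaps_pseudogene.
count_tag() {
    grep -c "tag \"$2\"" "$1" || true
}

# Tiny stand-in GTF so the sketch runs anywhere; point these helpers at the real file instead.
demo_gtf=$(mktemp)
{
    printf '%s\n' '##description: evidence-based annotation of the human genome (GRCh38), version 43 (Ensembl 109)'
    printf 'chr1\tHAVANA\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; tag "overlaps_pseudogene";\n'
    printf 'chr1\tHAVANA\tgene\t200\t300\t.\t+\t.\tgene_id "G2"; tag "basic";\n'
} > "$demo_gtf"

max_tested=34   # copied from the warning text in the log
found=$(gencode_version "$demo_gtf")
n_tag=$(count_tag "$demo_gtf" overlaps_pseudogene)

echo "GENCODE release in GTF: $found (warning says tested up to: $max_tested)"
echo "records tagged overlaps_pseudogene: $n_tag"
if [ "$found" -gt "$max_tested" ]; then
    echo "the data source is newer than the release this GATK build was tested with"
fi
rm -f "$demo_gtf"
```

On the server, the same helpers could be pointed at the `gencode.v43.annotation.REORDERED.gtf` file named in the log, and `gatk --version` should print the installed GATK release for comparison.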

zenglab-pku commented 20 hours ago

By the way, if you look at the output .maf file produced by ANNOVAR, it seems fine, but all or most of the mutations in it are also synonymous.
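
To see how many mutations in a MAF file are synonymous, one option is to tally its `Variant_Classification` column. The sketch below assumes a tab-separated MAF with `#` comment lines followed by a header row, and that synonymous variants are labelled `Silent`, as is conventional in MAF. The function names and the demo rows are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: count data rows in a MAF and tally its Variant_Classification column.
# Assumption: '#' comment lines, then one tab-separated header row, then data rows.

# Number of data rows (non-comment lines minus the header row); 0 if only comments.
count_data_rows() {
    awk '!/^#/ { n++ } END { print (n > 1 ? n - 1 : 0) }' "$1"
}

# Print "classification<TAB>count" for every value of Variant_Classification.
tally_classification() {
    awk -F'\t' '
        /^#/ { next }                 # skip comment lines
        !col {                        # first non-comment line is the header row
            for (i = 1; i <= NF; i++) if ($i == "Variant_Classification") col = i
            if (!col) { print "no Variant_Classification column" > "/dev/stderr"; exit 1 }
            next
        }
        { n[$col]++ }
        END { for (k in n) print k "\t" n[k] }
    ' "$1" | sort
}

# Hypothetical three-variant MAF used only to demonstrate the helpers.
demo_maf=$(mktemp)
{
    printf '#version 2.4\n'
    printf 'Hugo_Symbol\tVariant_Classification\n'
    printf 'TP53\tMissense_Mutation\n'
    printf 'BRCA1\tSilent\n'
    printf 'KRAS\tSilent\n'
} > "$demo_maf"

rows=$(count_data_rows "$demo_maf")
result=$(tally_classification "$demo_maf")
echo "data rows: $rows"
printf '%s\n' "$result"
rm -f "$demo_maf"
```

Run against `~/analysis/5_funcotator/OC_funcotator.maf`, `count_data_rows` would print 0 for a file that contains only comment lines, which separates "nothing was written" from "everything written is synonymous".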