Open GATKSupportTeam opened 3 years ago
Thank you for taking a look into this. I followed recommendation of @gbrandt6 and reduced the --pruning-lod-threshold but this call is still unable to make it to the output of Mutect2. I tried different thresholds from 1.3, 0.7, 0.5 and even 0.1 but it did not lead to any difference in detecting this call.
Mutect2 Adaptive Pruning issue as discussed in GATK OH meeting. Here is the original post:
This request was created from a contribution made by Nabeel Ahmed on April 07, 2021 09:13 UTC.
Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/360077647812-Why-do-a-clear-expected-variant-not-show-up-in-the-Mutect2-vcf-file
--
I am running Mutect2 on a sample in tumor-only mode. This sample has mutations introduced and known to be true positive calls. However, I am unable to detect some of these calls in the vcf file after Mutect2 is run that have very clear read support as seen in IGV. I have used the –bam-output option to show the output bam and in IGV, it shows that there is no assembly in this region and no mutation event was detected. I am showing the IGV screenshot for one of such calls (chr12:25398285).
I am using the latest version GATK 4.2.0.0 and the following is the full Mutect2 command from the log file
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar Mutect2 -R ../resources/hg19.fa -L ../resources/coding_regions.bed -I bam_files/sample1.bam --pon ../resources/pon.vcf.gz --germline-resource ../resources/af-only-gnomad.raw.sites.hg19.vcf.gz --bam-output sample1.mutect2_out.bam --recover-all-dangling-branches true -min-pruning 1 --min-dangling-branch-length 2 --debug --max-reads-per-alignment-start 0 --genotype-pon-sites True --f1r2-tar-gz vcf_files/f1r2.sample1.tar.gz -O vcf_files/unfiltered.sample1.vcf
In the debug mode, the following log messages are generated for this region
08:01:26.086 INFO Mutect2Engine - Assembling chr12:25398242-25398320 with 14298 reads: (with overlap region = chr12:25398142-25398420)
I have another call with similar VAF that is detected in the vcf output(chr12:25380275).
chr12 25380275 . T G . . AS_SB_TABLE=3911,5343|26,21;DP=9485;ECNT=1;MBQ=36,36;MFRL=0,0;MMQ=42,42;MPOS=18;POPAF=7.30;TLOD=53.53 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:9254,47:4.970e-03:9301:5321,21:3867,26:3911,5343,26,21
The input and the output BAMs show this call with the variant.
In the logs, it shows the detection of an active region here:
08:01:23.642 INFO Mutect2Engine - Assembling chr12:25380238-25380327 with 19912 reads: (with overlap region = chr12:25380138-25380427)
08:01:24.119 INFO EventMap - >> Events = EventMap{chr12:25380275-25380275 [T*, G],}
08:01:24.154 INFO AssemblyResultSet - Trimming active region AssemblyRegion chr12:25380238-25380327 active?=true nReads=19912 with 2 haplotypes
08:01:24.154 INFO AssemblyResultSet - Trimmed region to chr12:25380255-25380295 and reduced number of haplotypes from 2 to only 2
08:01:25.383 INFO EventMap - >> Events = EventMap{chr12:25380275-25380275 [T*, G],}
I have tried troubleshooting with the steps stated in this [blog](/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant). However, it does not change the output vcf. I used the force-calling mode by giving the above call in an input vcf and the call did appear in the vcf file.
chr12 25398285 . C A . . AS_SB_TABLE=4312,3630|14,8;DP=8096;ECNT=1;MBQ=36,36;MFRL=0,0;MMQ=42,42;MPOS=22;POPAF=7.30;TLOD=14.69 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:7942,22:2.576e-03:7964:3609,8:4268,13:4312,3630,14,8
However, I cannot rely on force-calling mutations on a set of calls. I am unsure if I am missing out more calls that are not showing up. Are there any parameters I need to tune so that I do not miss calls like above?
(created from Zendesk ticket #136765)
gz#136765