broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

Somatic MNP called even though one base mutated in the normal #6476

Open cboursnell opened 4 years ago

cboursnell commented 4 years ago

Bug Report

Affected tool(s) or class(es)

Mutect2

Affected version(s)

Tested in 4.1.2 up to and including 4.1.5

Description

Here's an IGV screenshot of a variant I found. One of the bases in the MNP is already a variant in the normal. https://drive.google.com/file/d/1YwfXH4LQQmhZ3wdi9HD1n6m3Up684DV5/view?usp=sharing

Steps to reproduce

My Mutect2 command was:

gatk Mutect2 \
    --reference genome.fa \
    -I normal.bam \
    -I tumor.bam \
    -normal normal \
    -tumor tumor \
    -L chr3:4300000-4400000 \
    --annotation FisherStrand \
    --output output.vcf \
    --bam-output output.bamout.bam \
    --germline-resource af-only-gnomad.hg38.vcf.gz \
    --af-of-alleles-not-in-resource 0.0000025 \
    --disable-read-filter MateOnSameContigOrNoMappedMateReadFilter \
    --panel-of-normals pon.vcf \
    --f1r2-tar-gz output.f1r2.tar.gz \
    --f1r2-max-depth 600 \
    --dont-use-soft-clipped-bases true

Expected behavior

This variant should just be called as a SNV at the first position.

Actual behavior

An MNP spanning both positions is called instead.

cboursnell commented 4 years ago

After searching some more, I found that this isn't a rare occurrence. Over 200 of the MNP variants called by Mutect2 in one patient had one of their bases called as a germline variant by HaplotypeCaller.
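A count like this can be reproduced by intersecting the somatic MNP calls with the germline calls. The following is a minimal sketch, assuming both call sets are available as plain-text VCF lines and that an MNP is any record whose REF and ALT have the same length greater than one. The function names are hypothetical and nothing here is part of GATK.

```python
from typing import Iterable, Iterator, List, Set, Tuple


def parse_vcf_records(lines: Iterable[str]) -> Iterator[Tuple[str, int, str, List[str]]]:
    """Yield (chrom, pos, ref, alts) from VCF text lines, skipping headers."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        yield chrom, pos, ref, alt.split(",")


def germline_positions(lines: Iterable[str]) -> Set[Tuple[str, int]]:
    """Collect every reference position covered by a germline record."""
    covered = set()
    for chrom, pos, ref, _alts in parse_vcf_records(lines):
        for offset in range(len(ref)):
            covered.add((chrom, pos + offset))
    return covered


def mnps_overlapping_germline(somatic_lines, germline_lines):
    """Return somatic MNP records that span at least one germline position."""
    germline = germline_positions(germline_lines)
    hits = []
    for chrom, pos, ref, alts in parse_vcf_records(somatic_lines):
        # Assumption: an MNP has REF and ALT of equal length, longer than 1.
        is_mnp = len(ref) > 1 and any(len(a) == len(ref) for a in alts)
        if not is_mnp:
            continue
        if any((chrom, pos + i) in germline for i in range(len(ref))):
            hits.append((chrom, pos, ref, alts))
    return hits
```

Passing the lines of the Mutect2 output and of the HaplotypeCaller output (for example from open file handles) returns the overlapping MNP records, and `len()` of the result gives the count. The overlap is by position only, so it does not check that the germline allele matches the base inside the MNP.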

fleharty commented 4 years ago

@cboursnell Could you provide the BAM and the bamout, subset to this variant plus 150 bp on each side?
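One way to produce such a subset is to pad the variant coordinates and pass the resulting region to `samtools view`. This is a minimal sketch, assuming samtools is installed and the input BAM is coordinate-sorted and indexed. The helper names and file names are hypothetical.

```python
from typing import List


def padded_region(chrom: str, start: int, end: int, pad: int = 150) -> str:
    """Return a 1-based inclusive region string padded by `pad` bp per side."""
    if start > end:
        raise ValueError("start must not exceed end")
    left = max(1, start - pad)  # never go below the first base of the contig
    return f"{chrom}:{left}-{end + pad}"


def samtools_subset_command(bam_in: str, bam_out: str, region: str) -> List[str]:
    """Build an argv list for `samtools view` that writes `region` as BAM."""
    # -b selects BAM output, -o names the output file.
    return ["samtools", "view", "-b", "-o", bam_out, bam_in, region]
```

The returned list can be handed to `subprocess.run(..., check=True)`, once for the input BAM and once for the bamout. The subset file then needs its own index (`samtools index`) before it can be loaded in IGV.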

I suspect that when you look at the bam out, you might see that this in fact is an MNV that looks like an SNV in IGV due to alignment issues.

cboursnell commented 4 years ago

I have uploaded the bam files for the region:

normal bam: https://drive.google.com/open?id=1z0HcTzSoWXfiTw_FXs0m-nh7ORSFoAFl
tumour bam: https://drive.google.com/open?id=1sBN3-QBuE6sxhnya1bLLJG600RIth9TH
mutect2 bamout tumour bam: https://drive.google.com/open?id=1fjkteAPlSAmykHFO2DKUIjqMN8SgGCi8

It doesn't look like there are any alignment issues. It is a very cleanly aligned region. The allele frequency at position 4,317,584 is 38% in the normal and 51% in the tumour. This is not a few sequencing errors or an alignment issue. Mutect2 should be calling this as an SNV at position 4,317,583 (C>T) and nothing somatic at 4,317,584, because of the evidence in the normal at that position.

yl-h commented 4 years ago

I would like to add that this behaviour also occurs with nested SNVs that are in panel of normals.

Based on intersecting MNVs with the PoN and some quick testing (on 4.1.7.0), it seems that Mutect2 only checks whether the first ALT base of an MNV is in the normal (in which case the whole MNV is not emitted) or in the PoN (in which case the MNV is emitted by Mutect2 but filtered by FilterMutectCalls). The same check is not applied to the other positions of an MNV.
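The difference between checking only the first base and checking every base can be illustrated by decomposing an MNV into its single-base substitutions. This is a minimal sketch of that comparison, not Mutect2's or FilterMutectCalls' actual logic. `known` is a hypothetical set of `(chrom, pos, ref, alt)` tuples built from the normal calls or the PoN.

```python
from typing import List, Set, Tuple

Snv = Tuple[str, int, str, str]


def decompose_mnv(chrom: str, pos: int, ref: str, alt: str) -> List[Snv]:
    """Split an MNV into the single-base substitutions it contains."""
    if len(ref) != len(alt):
        raise ValueError("REF and ALT must have equal length for an MNV")
    return [
        (chrom, pos + i, r, a)
        for i, (r, a) in enumerate(zip(ref, alt))
        if r != a  # skip bases that match the reference inside the MNV
    ]


def known_component_snvs(chrom: str, pos: int, ref: str, alt: str,
                         known: Set[Snv]) -> List[Snv]:
    """Return every component SNV of the MNV that is present in `known`."""
    return [snv for snv in decompose_mnv(chrom, pos, ref, alt) if snv in known]
```

A first-base-only check corresponds to testing `decompose_mnv(...)[0] in known`, which misses an MNV whose second or later base is the known one. A non-empty result from `known_component_snvs` flags such an MNV for review or for splitting into separate calls.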

cboursnell commented 3 years ago

Any ideas on this? I'm still including a hacky kludge in my filters to remove these variants.