broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

Make sure NON_REF doesn't get called and doesn't get AD #5677

Closed ldgauthier closed 4 years ago

ldgauthier commented 5 years ago

In GATK3, the genotype at gross indels would sometimes be called with the NON_REF allele. This may be improved in GATK4.1 with the new QUAL model, but it's worth adding some checks, i.e., after https://github.com/broadinstitute/gatk/blob/89ea9e01225db5c9bbe262c888a0abb74509f94c/src/main/java/org/broadinstitute/hellbender/tools/walkers/genotyper/AlleleSubsettingUtils.java#L93, which I believe gets called from HaplotypeCaller and GenotypeGVCFs.

AD should be checked in that same method.
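
For illustration only, here is a minimal sketch of what an AD check could look like. It assumes the intent is that the NON_REF entry of AD ends up as zero, and it uses plain strings and arrays as stand-ins for the real allele and genotype types. The class name `NonRefAdCheck` and the method `zeroNonRefDepth` are hypothetical and are not part of GATK or AlleleSubsettingUtils.

```java
import java.util.Arrays;
import java.util.List;

public class NonRefAdCheck {
    // Symbolic allele as written in a GVCF; a plain-string stand-in for the real allele type.
    static final String NON_REF = "<NON_REF>";

    /**
     * Hypothetical helper: returns a copy of the AD array with the NON_REF entry set to 0.
     * The ad array is assumed to be parallel to the alleles list (one depth per allele).
     */
    static int[] zeroNonRefDepth(List<String> alleles, int[] ad) {
        if (alleles.size() != ad.length) {
            throw new IllegalArgumentException("AD must have one entry per allele");
        }
        int[] result = Arrays.copyOf(ad, ad.length);
        int nonRefIndex = alleles.indexOf(NON_REF);
        if (nonRefIndex >= 0) {
            result[nonRefIndex] = 0; // NON_REF should not carry any depth
        }
        return result;
    }

    public static void main(String[] args) {
        int[] fixed = zeroNonRefDepth(Arrays.asList("A", "ATT", NON_REF), new int[] {12, 7, 3});
        System.out.println(Arrays.toString(fixed)); // prints [12, 7, 0]
    }
}
```

Returning a copy rather than mutating the input keeps the sketch side-effect free; whether the real fix should zero the value, drop it, or fail loudly is a design decision for the actual patch.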

ldgauthier commented 5 years ago

I propose that the expected behavior be to no-call, with all-zero PLs, any genotype that would have had NON_REF. Likely someone will be grumpy about that, but how confident can you really be about a genotype that was most likely to contain an allele you haven't actually seen?
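
The proposed behavior can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not the GATK implementation: called alleles are plain strings, `.` stands for a no-call allele, and the names `NonRefNoCall`, `Call`, and `noCallIfNonRef` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class NonRefNoCall {
    static final String NON_REF = "<NON_REF>"; // symbolic allele, plain-string stand-in
    static final String NO_CALL = ".";         // no-call allele, plain-string stand-in

    /** Hypothetical minimal genotype holder: the called alleles plus their PLs. */
    static final class Call {
        final List<String> alleles;
        final int[] pls;

        Call(List<String> alleles, int[] pls) {
            this.alleles = alleles;
            this.pls = pls;
        }
    }

    /**
     * If any called allele is NON_REF, return a no-call of the same ploidy with all-zero PLs;
     * otherwise return the call unchanged (PLs are copied defensively).
     */
    static Call noCallIfNonRef(List<String> calledAlleles, int[] pls) {
        if (!calledAlleles.contains(NON_REF)) {
            return new Call(calledAlleles, pls.clone());
        }
        String[] noCalls = new String[calledAlleles.size()];
        Arrays.fill(noCalls, NO_CALL);
        // A freshly allocated int[] is zero-filled, which gives the all-zero PLs.
        return new Call(Arrays.asList(noCalls), new int[pls.length]);
    }

    public static void main(String[] args) {
        Call c = noCallIfNonRef(Arrays.asList("A", NON_REF), new int[] {90, 40, 120, 0, 35, 60});
        System.out.println(c.alleles + " " + Arrays.toString(c.pls));
    }
}
```

Keeping the PL array at its original length, rather than dropping it, means downstream code still sees one likelihood per possible genotype, just with no preference among them.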