broadinstitute / gatk-protected

Obsolete/Legacy GATK repository -- go to https://github.com/broadinstitute/gatk instead
BSD 3-Clause "New" or "Revised" License

FilterByOrientationBias will fail when it (erroneously) determines that it is running on a non-diploid organism. #1044

Closed LeeTL1220 closed 7 years ago

LeeTL1220 commented 7 years ago

This caused a failure in a tumor sample. In OrientationBiasFilterer.java, remove the following check (or change it to a warning), at roughly lines 60-62:

            if (genotype.getPloidy() != 2) {
                throw new UserException.BadInput("This tool will not run with non-diploid organisms.  Saw GT: " + genotype.getGenotypeString());
            }
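
As a rough illustration of the "change to a warning" option, something like the following could work. This is only a sketch, not the actual OrientationBiasFilterer code: the class name, `isDiploidOrWarn`, and the plain `int`/`String` parameters are hypothetical stand-ins for the `genotype.getPloidy()` and `genotype.getGenotypeString()` calls above, and `java.util.logging` is used here only to keep the example self-contained.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for the check in the filtering code; not the actual GATK implementation.
class PloidyWarningSketch {
    private static final Logger logger = Logger.getLogger(PloidyWarningSketch.class.getName());

    /**
     * Returns true if the genotype is diploid. A non-diploid genotype is logged
     * as a warning and reported to the caller instead of aborting the whole run.
     */
    static boolean isDiploidOrWarn(final int ploidy, final String genotypeString) {
        if (ploidy != 2) {
            logger.warning("Skipping non-diploid genotype.  Saw GT: " + genotypeString);
            return false;
        }
        return true;
    }

    public static void main(final String[] args) {
        System.out.println(isDiploidOrWarn(2, "0/1"));     // prints true
        System.out.println(isDiploidOrWarn(4, "0/1/2/3")); // logs a warning, then prints false
    }
}
```

The caller would then decide what to do with a `false` result, for example leaving that genotype unfiltered by this tool.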

LeeTL1220 commented 7 years ago

@davidbenjamin I'm a little worried that the getPloidy() is not behaving as expected.

LeeTL1220 commented 7 years ago

@davidbenjamin Actually, the getPloidy() code is fine. Does this variant seem reasonable coming out of M2? Note the GT value of 0/1/2/3.

1   237060945   .   CTTT    C,CT,CTTTT  .   .   DP=447;ECNT=1;IN_DBSNP;IN_PON;NLOD=23.13,7.21,2.86;N_ART_LOD=4.87,11.43,11.61;RPA=15,12,13,16;RU=T;STR;TLOD=12.85,41.18,10.06;TLOD_FWD=-2.504e-02;TLOD_REV=12.87;TUMOR_SB_POWER_FWD=0.197;TUMOR_SB_POWER_REV=0.989  GT:AD:AF:MBQ:MCL:MFRL:MMQ:MPOS  0/1/2/3:89,5,16,12:0.031,0.099,0.068:26,31,28,2:0,0,0,0:161,164,157,144:60,29,60,60:21,22,24,22 0/0:72,2,6,15:9.524e-03,0.029,0.048:29,29,29,20:0,0,0,0:178,219,158,167:60,45,60,60:23,27,27,22
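
To make the connection to the failing check explicit, here is a minimal sketch of why this record reports a ploidy of 4. It assumes that ploidy is simply the number of allele calls in the GT value; `GtPloidySketch` and `ploidyOf` are hypothetical names for illustration and are not part of GATK or htsjdk.

```java
// Hypothetical helper: counts allele calls in a VCF GT value.
class GtPloidySketch {
    /** Splits a GT value such as "0/1" or "0|1" on its separators and counts the calls. */
    static int ploidyOf(final String gt) {
        return gt.split("[/|]").length;
    }

    public static void main(final String[] args) {
        System.out.println(ploidyOf("0/1/2/3")); // tumor GT above: prints 4
        System.out.println(ploidyOf("0/0"));     // normal GT above: prints 2
    }
}
```

Under that assumption, the tumor genotype fails the `!= 2` check even though the organism is diploid.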

davidbenjamin commented 7 years ago

@LeeTL1220 Given the tumor lods, this is genotyped correctly -- there is evidence (albeit evidence that we will reject based on the filters) that all these alleles exist. Obviously there's an extremely strong prior that this can't happen, which we currently implement as the infinitely strong multiallelic filter. In addition to being multiallelic, this site will fail the pon filter, the normal artifact filter (every alt allele), the germline filter (the 2nd and 3rd alt alleles), the median base quality filter (3rd alt allele), and the median mapping quality filter (3rd alt allele).

Bottom line: there's nothing wrong in principle with M2 emitting a non-diploid somatic genotype, although in this case, and in the great majority of cases, these are artifacts.