ssadedin / bazam

A read extraction and realignment tool for next generation sequencing data
GNU Lesser General Public License v2.1

Swapping first and second mates when pair aligned to reverse strand #31

Open mikisvaz opened 3 years ago

mikisvaz commented 3 years ago

I noticed that Mutect2 filtered out a lot of variants based on orientation bias on a realigned pair of samples, which didn't happen when I used RevertSam instead. I seem to have tracked the problem down to this:

A pair of reads might originally be

H0KU0ADXX130516:2:2112:19067:52017      163     1       30257   0       76M     =       30269   88      AAAAAGAGCATCATCAGTCCAAAGTCCAGCAGTTGTCCCTCCTGGAATCCGTTGGCTTGCCTCCGGCATTTTTGGC    CCDFCH<HHEBGDAHDHAHGEFEJBHGEJHCFBDJBHGGFHHGIIBGCHG@BDJHHFEJIGCIH@DGBCCEECGJF       MC:Z:76M        MD:Z:76 PG:Z:MarkDuplicates.1A  RG:Z:RG_H0KU0ADXX130516_2_R1    NM:i:0  MQ:i:0  AS:i:76 XS:i:76
H0KU0ADXX130516:2:2112:19067:52017      83      1       30269   0       76M     =       30257   -88     ATCAGTCCAAAGTCCAGCAGTTGTCCCTCCTGGAATCCGTTGGCTTGCCTCCGGCATTTTTGGCCCTTGCCTTTTA    BCHFFBIDBDFICDIGIIDDDDEBEIICHHBGHDCCH?GGDGHJEFJHHCH=GIHCEFFEEGHIHHFDHGEDDF?9       MC:Z:76M        MD:Z:76 PG:Z:MarkDuplicates.1A  RG:Z:RG_H0KU0ADXX130516_2_R1    NM:i:0  MQ:i:0  AS:i:76 XS:i:76

And after realignment they become

H0KU0ADXX130516:2:2112:19067:52017      99      1       30257   0       76M     =       30269   88      AAAAAGAGCATCATCAGTCCAAAGTCCAGCAGTTGTCCCTCCTGGAATCCGTTGGCTTGCCTCCGGCATTTTTGGC    ABAFAF;GIEBHCAFCGAGGEFEIAIGEJHBDADIBGGFDHGEGGBFCIG@CDJGIFEJHFBGG@AEABBEDBDHE       MC:Z:76M        MD:Z:76 PG:Z:MarkDuplicates     RG:Z:H0KU0.2    NM:i:0  MQ:i:0  AS:i:76 XS:i:76
H0KU0ADXX130516:2:2112:19067:52017      147     1       30269   0       76M     =       30257   -88     ATCAGTCCAAAGTCCAGCAGTTGTCCCTCCTGGAATCCGTTGGCTTGCCTCCGGCATTTTTGGCCCTTGCCTTTTA    @CFEDBICCCEHBAIEHGDCDDDBDHICFGBGHDDCG@FFDGIIEFIGGCH>GIGCEFFDFGHHHGEEIFEDCA=9       MC:Z:76M        MD:Z:76 PG:Z:MarkDuplicates     RG:Z:H0KU0.2    NM:i:0  MQ:i:0  AS:i:76 XS:i:76

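So the flags 163 and 83 become 99 and 147, while the read name, positions, CIGAR and sequences stay the same. In other words, the first and second mates come out of bazam swapped: the forward-strand read goes from second-in-pair to first-in-pair, and the reverse-strand read goes from first-in-pair to second-in-pair.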
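
For reference, decoding the four flags makes the swap explicit. This is just a minimal sketch in Python: the helper name `decode_flag` and the label strings are my own, and only the bit values come from the SAM specification.

```python
# SAM FLAG bits as defined in the SAM specification.
# The label strings are informal names chosen for this example.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


for flag in (163, 83, 99, 147):
    print(flag, decode_flag(flag))

# 163 ['paired', 'proper_pair', 'mate_reverse', 'second_in_pair']
# 83 ['paired', 'proper_pair', 'reverse', 'first_in_pair']
# 99 ['paired', 'proper_pair', 'mate_reverse', 'first_in_pair']
# 147 ['paired', 'proper_pair', 'reverse', 'second_in_pair']
```

The strand bits (0x10, 0x20) are identical before and after; only the first/second-in-pair bits (0x40, 0x80) have been exchanged. As far as I understand, that is the information the orientation bias filter relies on, which would explain the difference in Mutect2 results.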