huishenlab / biscuit

BISulfite-seq CUI Toolkit
Other
16 stars 7 forks source link

[Issue] Incorrect calling of NM #50

Closed njspix closed 8 months ago

njspix commented 9 months ago

Biscuit appears to be incorrectly calling the following situation, which should be a mismatch:

Reference: ---C---
G>A read:  ---T---

Reprex:

echo $'>chr1\nACGATCGATCGATCGATCGATCGATGCGGGGGGGGCGGGGGGGGCGATCATCGATCGACGAC' > ref.fa
biscuit index ref.fa
biscuit align ref.fa -1 ACAATCAATCAATCAATCAATCAATACAAAAAAAATAAAAAAAACAATCATCAATCAACAAC
[main] Version: 1.4.0
...
inputread       0       chr1    1       60      62M     *       0       0       ACAATCAATCAATCAATCAATCAATACAAAAAAAATAAAAAAAACAATCATCAATCAACAAC  *       NM:i:0  MD:Z:2G3G3G3G3G3G2G1G0G0G0G0G0G0G0G0C0G0G0G0G0G0G0G0G1G6G3G2G2ZC:i:27 ZR:i:0  AS:i:59 XS:i:22 XL:i:62 MC:Z:*  MQ:i:0  YD:A:r
...

Note that the number of mismatches is 0; this appears to be incorrect, as a C or T on the daughter (G->A) converted strand can both be trusted as the 'real' sequenced base...

jamorrison commented 8 months ago

This issue has been resolved and will make it into the next release of BISCUIT.