Open holtgrewe opened 7 years ago
Sorry for my late reply😥
I'm currently working on adding several SAM tags, including MD and SA (supplementary alignments). The code still has some bugs, so please wait a few days.
(I expect I can finish up the tests for this branch by the middle of this week...)
Thanks,
Hajime Suzuki
On the NA12878 Sequel data set on DNAnexus, minialign seems to produce shorter aligned fragments than NGM-LR and bwa (pacbio). Do you have an idea of possible reasons?
I have confirmed that minialign tends to split alignments at short indels. If such an indel lies near the end of a read, the shorter fragment may be dropped from the resulting alignments. This is the most likely explanation.
I would like to examine this myself. Could you tell me the sample URL and the parameters you used? (Is it this one? http://www.pacb.com/blog/identifying-structural-variants-na12878-low-fold-coverage-sequencing-pacbio-sequel-system/ )
Thanks,
Hajime Suzuki
Though it is only my personal impression, I think NGM-LR currently performs best at detecting indels of 20 to 100 bases. minialign's performance seems to fall between NGM-LR and BWA-MEM, but much closer to the latter. I am now trying to rescue such alignments with an additional algorithm for collecting supplementary alignments. It differs from NGM-LR's convex gap-penalty approach, but I'm sure it will also work well for SV detection.
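For anyone who wants to compare the aligners on this size range, here is a rough sketch of how indels of 20 to 100 bases could be listed from a record's POS and CIGAR fields. The function name `indels_in_range` and its defaults are made up for illustration; this is not part of minialign or NGM-LR, and it only sees indels reported inside a single alignment, not ones implied by split (supplementary) alignments.

```python
import re

# One CIGAR operation: a length followed by an operation character (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def indels_in_range(pos, cigar, min_len=20, max_len=100):
    """List (ref_pos, op, length) for I/D operations within the size range.

    pos is the 1-based leftmost reference position (the SAM POS field).
    For a deletion, ref_pos is the first deleted reference base; for an
    insertion, it is the reference base right after the inserted bases.
    """
    ref = pos
    found = []
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "ID" and min_len <= n <= max_len:
            found.append((ref, op, n))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref += n
    return found


if __name__ == "__main__":
    for event in indels_in_range(100, "50M30D20M5I10M25I5M"):
        print(event)
```

Running the same function over the output of two aligners for the same reads gives a quick, if crude, view of which indels each one keeps inside a single alignment.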
I've just added an MD tag option in release 0.5.0 (enabled with the -TMD flag; passing -TNM,MD results in output equivalent to samtools calmd). Please try it out😄
https://github.com/ocxtal/minialign/tree/minialign-0.5.0
Hajime Suzuki
It would be useful to implement calculation of the MD tag, just as BWA does. This would allow simple variant calling and error-rate estimation from the BAM file alone, without a reference.
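To illustrate the reference-free use case, here is a minimal sketch that estimates a per-alignment error rate from the CIGAR string and MD tag only. The helper names `error_counts` and `error_rate` are hypothetical, and the definition of the rate (mismatches plus inserted and deleted bases, divided by alignment columns) is just one reasonable choice, not something BWA or minialign prescribes.

```python
import re

# One CIGAR operation: a length followed by an operation character (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# MD tokens: a run of matches, a deletion (^ plus reference bases), or one mismatch.
MD_RE = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def error_counts(cigar, md):
    """Return (mismatches, inserted bases, deleted bases, aligned bases)."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Insertions are absent from MD, so they are taken from the CIGAR.
    inserted = sum(n for n, op in ops if op == "I")
    aligned = sum(n for n, op in ops if op in "M=X")
    mismatches = 0
    deleted = 0
    for _matches, deletion, substitution in MD_RE.findall(md):
        if deletion:
            deleted += len(deletion) - 1  # do not count the leading '^'
        elif substitution:
            mismatches += 1
    return mismatches, inserted, deleted, aligned


def error_rate(cigar, md):
    """Errors per alignment column (aligned + inserted + deleted bases)."""
    mismatches, inserted, deleted, aligned = error_counts(cigar, md)
    columns = aligned + inserted + deleted
    return (mismatches + inserted + deleted) / columns if columns else 0.0


if __name__ == "__main__":
    print(error_counts("10M2I6M2D6M", "10A5^AC6"))
    print(error_rate("10M2I6M2D6M", "10A5^AC6"))
```

Because MD records the reference base at each mismatch and deletion, the same parsing could be extended to recover the reference sequence for simple variant calling.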