ocxtal / minialign

[IMPORTANT: not for real data analysis, only for algorithm evaluation] fast and accurate alignment tool for PacBio and Nanopore long reads
MIT License

Calculate MD tag #8

Open holtgrewe opened 7 years ago

holtgrewe commented 7 years ago

It would be useful to implement calculation of the MD tag, just as BWA does. This would allow simple variant calling and error-rate estimation from the BAM file alone, without a reference.
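To illustrate the error-rate use case, here is a minimal Python sketch that estimates a per-alignment error rate from the CIGAR string and the MD tag alone, with no reference sequence. The function names (`md_counts`, `inserted_bases`, `error_rate`) are hypothetical and not part of minialign, BWA, or samtools. The sketch assumes the MD tag follows the SAM specification and that the error rate is defined as (mismatches + inserted bases + deleted bases) divided by the number of alignment columns.

```python
import re

# MD tokens: a run of matches (digits), a deletion (^ followed by bases),
# or a single mismatched reference base.
MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def md_counts(md):
    """Return (matches, mismatches, deleted_bases) encoded in an MD string."""
    matches = mismatches = deleted = 0
    for num, deletion, mismatch in MD_TOKEN.findall(md):
        if num:
            matches += int(num)
        elif deletion:
            deleted += len(deletion) - 1  # do not count the leading '^'
        else:
            mismatches += 1
    return matches, mismatches, deleted


def inserted_bases(cigar):
    """Insertions are not recorded in MD, so take them from the CIGAR."""
    return sum(int(n) for n, op in CIGAR_TOKEN.findall(cigar) if op == "I")


def error_rate(cigar, md):
    """Errors per alignment column (assumed definition, see lead-in)."""
    matches, mismatches, deleted = md_counts(md)
    errors = mismatches + deleted + inserted_bases(cigar)
    columns = matches + errors
    return errors / columns if columns else 0.0


if __name__ == "__main__":
    # 21 matches, 1 mismatch, 2 deleted bases, 1 inserted base
    print(error_rate("16M2D3M1I3M", "10A5^AC6"))  # 4 errors / 25 columns = 0.16
```

Insertions have to come from the CIGAR because the MD tag only describes reference-consuming positions.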

ocxtal commented 7 years ago

Sorry for my late reply😥

I'm currently working on adding several SAM tags, including MD and SA (supplementary alignments). The code still has some bugs, so please wait a few days😭.

(I expect I can finish up the tests for this branch by the middle of this week...)

Thanks,

Hajime Suzuki

holtgrewe commented 7 years ago

Sounds good. On the na12787 Sequel data set on DNAnexus, minialign seems to produce shorter aligned fragments than NGM-LR and BWA (pacbio). Do you have any idea what the reasons might be?

ocxtal commented 7 years ago

I can confirm that minialign tends to split alignments at short indels. If such an indel lies near the end of a read, the shorter of the two fragments may be dropped from the resulting alignments. That is the most likely explanation.

I would like to reproduce this myself. Could you tell me the sample URL and the parameters you used? (Is it this one? http://www.pacb.com/blog/identifying-structural-variants-na12878-low-fold-coverage-sequencing-pacbio-sequel-system/ )

Thanks,

Hajime Suzuki

ocxtal commented 7 years ago

This is only my personal impression, but I think NGM-LR currently performs best at detecting indels of 20 to 100 bases. The performance of minialign seems to fall between NGM-LR and BWA-MEM, but much closer to the latter. I am now trying to rescue such alignments with an additional algorithm that collects supplementary alignments. It differs from NGM-LR's convex gap-penalty function approach, but I'm sure it will also work well for SV detection.

ocxtal commented 7 years ago

I've just added an MD tag option in release 0.5.0 (enabled with the -TMD flag; passing -TNM,MD results in output equivalent to samtools calmd). Please try it out😄 https://github.com/ocxtal/minialign/tree/minialign-0.5.0

Hajime Suzuki
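
For readers unfamiliar with what the MD and NM tags contain, the following Python sketch derives both from a CIGAR string, the read sequence, and the aligned reference segment, following the tag definitions in the SAM specification. It is not minialign's or samtools' implementation; the function name `calc_md_nm` and its arguments are hypothetical, and it assumes `ref` starts at the alignment's leftmost reference position and ignores ambiguity codes.

```python
import re

CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def calc_md_nm(ref, read, cigar):
    """Return (MD string, NM edit distance) for one alignment.

    ref  : reference bases starting at the alignment position (assumption)
    read : the full read sequence as stored in the SEQ field
    """
    md = []
    run = 0    # length of the current run of matching bases
    nm = 0     # mismatches + inserted bases + deleted bases
    r = q = 0  # cursors into ref and read
    for n, op in CIGAR_TOKEN.findall(cigar):
        n = int(n)
        if op in "M=X":
            for _ in range(n):
                if ref[r].upper() == read[q].upper():
                    run += 1
                else:
                    # close the match run, then record the reference base
                    md.append(str(run))
                    md.append(ref[r].upper())
                    run = 0
                    nm += 1
                r += 1
                q += 1
        elif op == "D":
            # deleted reference bases are written after a caret
            md.append(str(run))
            md.append("^" + ref[r:r + n].upper())
            run = 0
            nm += n
            r += n
        elif op == "I":
            nm += n  # insertions count toward NM but do not appear in MD
            q += n
        elif op == "S":
            q += n   # soft clip consumes read bases only
        elif op == "N":
            r += n   # skipped region consumes reference bases only
        # H and P consume neither sequence
    md.append(str(run))  # MD always ends with a (possibly zero) match count
    return "".join(md), nm


if __name__ == "__main__":
    print(calc_md_nm("ACGTACGTAC", "ACGAGACGC", "4M1I3M2D1M"))
    # ('3T3^TA1', 4)
```

Because the match count is emitted before every mismatch and deletion, adjacent events are separated by an explicit `0`, which keeps the MD string unambiguous.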