Closed LinXialab closed 5 years ago
You will have to parse the MD string for that because, if I remember correctly, NGMLR doesn't write the NM tag.
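In Python:
import re
import pysam

edit_distances = []
with pysam.AlignmentFile("yourfile.bam", "rb") as bam:
    for read in bam:
        # Unmapped reads have no CIGAR or MD tag, so skip them
        if read.is_unmapped or not read.has_tag("MD"):
            continue
        # Letters in the MD string are mismatched or deleted reference bases
        mismatches_deletions = len(re.findall("[A-Za-z]", read.get_tag("MD")))
        # CIGAR operation 1 is an insertion; sum the inserted bases
        insertions = sum(length for op, length in read.cigartuples if op == 1)
        # Edit distance normalized by the aligned length of the read
        edit_distances.append(
            (mismatches_deletions + insertions) / read.query_alignment_length)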
Thanks for your reply. The Python script you provided will help us a lot. Another question: could you please tell us what the tag "XI:f:" means? I found that the number following "XI:f:" represents identity (https://github.com/philres/ngmlr/blob/master/src/SAMWriter.cpp).
Based on that you can get the alignment identity from the XI tag then. Cool, was not aware that this tag was there.
Yes, the XI tag gives the alignment identity. Thanks Fritz
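For anyone who wants to pull that value out without extra libraries, here is a minimal sketch that reads the XI field from a raw SAM line. The helper name get_xi and the example record are made up for illustration; only the "XI:f:" tag name comes from this thread.

```python
from typing import Optional

def get_xi(sam_line: str) -> Optional[float]:
    """Return the XI:f: value of one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional TAG:TYPE:VALUE fields follow.
    for field in fields[11:]:
        if field.startswith("XI:f:"):
            return float(field[len("XI:f:"):])
    return None

# Made-up record for illustration only (not real NGMLR output)
example = "\t".join([
    "read1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
    "ACGTACGTAC", "*", "MD:Z:10", "XI:f:0.95",
])
print(get_xi(example))
```

With pysam, read.get_tag("XI") on an aligned read should give the same number, provided the tag is present.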
Excuse me, there are many kinds of identity, such as BLAST identity. Could you please tell me which kind of identity the 'XI' tag represents?
It's the number of differences in the alignment divided by the alignment length. Cheers Fritz
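To make the difference between these definitions concrete, here is a rough sketch that derives three numbers from a CIGAR string and an MD string: the fraction of differences per alignment column, a BLAST-style identity (matches divided by alignment columns), and a gap-excluded identity (matches divided by aligned bases). The function names are hypothetical, and counting alignment length as matched plus inserted plus deleted columns is an assumption; I have not checked that NGMLR computes XI in exactly this way.

```python
import re

def count_columns(cigar: str) -> dict:
    """Sum the lengths of each CIGAR operation, e.g. '5M1I4M' -> {'M': 9, 'I': 1}."""
    totals = {}
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        totals[op] = totals.get(op, 0) + int(length)
    return totals

def md_counts(md: str):
    """Return (mismatches, deleted_bases) from an MD string."""
    # Deleted reference bases are the letters that follow a '^'
    deleted = sum(len(run) for run in re.findall(r"\^([A-Za-z]+)", md))
    # Every other letter in the MD string is a mismatch
    mismatches = len(re.findall(r"[A-Za-z]", md)) - deleted
    return mismatches, deleted

def identities(cigar: str, md: str):
    """Return (difference_fraction, blast_identity, gap_excluded_identity)."""
    ops = count_columns(cigar)
    aligned = ops.get("M", 0) + ops.get("=", 0) + ops.get("X", 0)
    insertions = ops.get("I", 0)
    mismatches, deleted = md_counts(md)
    columns = aligned + insertions + deleted  # assumed alignment length
    differences = mismatches + insertions + deleted
    matches = aligned - mismatches
    return differences / columns, matches / columns, matches / aligned

# Made-up alignment: 15 aligned bases, 1 mismatch, 1 inserted and 2 deleted bases
diff_fraction, blast_identity, gap_excluded = identities("5M1I4M2D6M", "3A5^GT6")
print(diff_fraction, blast_identity, gap_excluded)
```

Under these assumptions the first two values always sum to 1, while the gap-excluded identity is higher whenever the alignment contains indels, so it matters which definition each aligner reports before comparing them.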
Hi,
We want to compare the identity of alignments produced by minimap2 and NGMLR. Could you please tell us how to calculate the identity from the SAM file generated by NGMLR with default parameters?
Looking forward to your reply!