Martinsos / edlib

Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
http://martinsos.github.io/edlib
MIT License

Wrong CIGAR op in alignment - 1X instead of 1= #57

Closed by isovic 7 years ago

isovic commented 7 years ago

Hi, I noticed a strange alignment reported for some simple sequences. Concretely:

>Seq1  
ATG  
>Seq2  
TTTTTTTTTTTTTTTTTTTTTTTTTTTTCATGAGACGCAACTATGGTGACGAA  

Results for NICE alignment:

$ edlib/build/bin/edlib-aligner -m SHW -p -l -f NICE seq1.fasta seq2.fasta
...
Query #0 (3 residues): score = 2  
T: -T- (0 - 0)  
Q: ATG (0 - 2)

Results for CIG_EXT alignment:

$ edlib/build/bin/edlib-aligner -m SHW -p -l -f CIG_EXT seq1.fasta seq2.fasta
...
Query #0 (3 residues): score = 2  
Cigar:  
1I1X1I  

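But the middle operation should be "=" instead of "X": the T in the query is aligned to a T in the target, and the score of 2 is already accounted for by the two insertions.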

Ivan
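For readers following along, here is a minimal sketch of how the two rows of the NICE output map to an extended CIGAR string, and how the CIGAR can be checked against the reported score. It is not edlib code: the helper names `extended_cigar` and `cigar_edits` are hypothetical, and the convention that a gap in the target row is written as "I" is taken from the output quoted in this thread.

```python
import re
from itertools import groupby


def extended_cigar(target_row: str, query_row: str, gap: str = "-") -> str:
    """Build an extended CIGAR string from two gapped alignment rows.

    Hypothetical helper for illustration, not part of edlib.
    """
    if len(target_row) != len(query_row):
        raise ValueError("alignment rows must have equal length")
    ops = []
    for t, q in zip(target_row, query_row):
        if t == gap and q == gap:
            raise ValueError("column with a gap in both rows")
        if t == gap:
            ops.append("I")  # query character with no partner in the target
        elif q == gap:
            ops.append("D")  # target character with no partner in the query
        elif t == q:
            ops.append("=")  # identical characters
        else:
            ops.append("X")  # different characters
    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


def cigar_edits(cigar: str) -> int:
    """Count the edits (everything except '=') encoded in an extended CIGAR."""
    return sum(int(n) for n, op in re.findall(r"(\d+)([=XID])", cigar) if op != "=")


print(extended_cigar("-T-", "ATG"))
print(cigar_edits("1I1X1I"), cigar_edits("1I1=1I"))
```

The second helper gives a quick sanity check: a CIGAR whose edit count differs from the reported score cannot describe the same alignment.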

Martinsos commented 7 years ago

Thanks @isovic! It was a bug in the alignment traceback: one of the cells "outside of the matrix" was not always calculated correctly when moving through the very first column of the matrix. I fixed it in commit af94e5350f68825b1b60ac8f0b5d7bef15638159.
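To illustrate why the first column matters during traceback, the sketch below uses a plain dynamic-programming table with explicit boundary cells. It is an assumption-laden illustration, not edlib's implementation or the code changed by the commit: the function name `prefix_align` is hypothetical, and the mode is modelled as "align the whole query to the best prefix of the target", picking the earliest end position on ties.

```python
from itertools import groupby


def prefix_align(query: str, target: str):
    """Align all of `query` to the best-scoring prefix of `target`.

    Returns (score, end index in target, extended CIGAR). The end index
    is -1 if the best prefix is empty. Illustrative sketch only.
    """
    m, n = len(query), len(target)
    # dist[i][j] = edit distance between query[:i] and target[:j].
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i  # boundary column: i query characters inserted
    for j in range(1, n + 1):
        dist[0][j] = j  # boundary row: j target characters deleted
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = dist[i - 1][j - 1] + (query[i - 1] != target[j - 1])
            dist[i][j] = min(diag, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Best prefix: the smallest j that minimises the last row.
    end = min(range(n + 1), key=lambda j: dist[m][j])
    score = dist[m][end]

    # Traceback. The guards on i and j keep the walk inside the table,
    # which is exactly what matters once it reaches the first column.
    ops = []
    i, j = m, end
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (
            query[i - 1] != target[j - 1]
        ):
            ops.append("=" if query[i - 1] == target[j - 1] else "X")
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append("I")
            i -= 1
        else:
            ops.append("D")
            j -= 1
    ops.reverse()
    cigar = "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))
    return score, end - 1, cigar


print(prefix_align("ATG", "TTTTTTTTTTTTTTTTTTTTTTTTTTTTCATGAGACGCAACTATGGTGACGAA"))
```

In this sketch, the last step of the traceback for the sequences above runs through column 0, where only the insertion move is valid; reading a neighbour that lies outside the table there would yield a wrong operation.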

isovic commented 7 years ago

Thanks for the quick fix! :-)

Ivan