Martinsos / edlib

Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
http://martinsos.github.io/edlib
MIT License

Do alignment without mismatch? #111

Closed moold closed 6 years ago

moold commented 6 years ago

Hi, is it possible to do an alignment without any mismatches, allowing only gaps? For example:

query:  TCGTGTCCCAGCT
target: TCGTTATTGCT

edlib output: 4=1I1=3X1I3=

TCGTGTcccAGCT
TCGT-Tatt-GCT

The lowercase letters are mismatches.

But I expect to get the following alignment:

TCGTGTCCCA--GCT
TCGT-T---ATTGCT

Thank you!

Martinsos commented 6 years ago

Hi, no, I am afraid that is not possible. Edlib implements edit distance, which means mismatches and gaps (indels) have the same penalty (they are equally bad). The algorithm could be adapted so that it does not allow mismatches, but Edlib does not support that, nor is there a plan to support it in the near future. You could consider some other algorithm, for example an implementation of Needleman-Wunsch (NW) with the mismatch penalty set to some very high number.
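
As a rough illustration of the NW suggestion (a sketch only, not part of Edlib and not its API): the hypothetical helper `indel_only_align` below runs a plain dynamic-programming global alignment in which a gap costs 1, a match costs 0, and a mismatch is never allowed, which is what an arbitrarily high mismatch penalty amounts to. It uses quadratic time and memory and makes no attempt at Edlib's speed. When several optimal alignments exist, the one returned depends on the traceback order, so it may differ from the one written in the question.

```python
def indel_only_align(query, target):
    """Global alignment that allows only matches and gaps (no mismatches).

    Illustrative helper, not an Edlib function. Returns a tuple
    (gap_count, aligned_query, aligned_target).
    """
    n, m = len(query), len(target)

    # dp[i][j] = minimum number of gap columns needed to align
    # query[:i] with target[:j] when mismatches are forbidden.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(dp[i - 1][j], dp[i][j - 1]) + 1  # open a gap
            if query[i - 1] == target[j - 1]:
                best = min(best, dp[i - 1][j - 1])      # free match
            dp[i][j] = best

    # Trace back from the bottom-right corner, preferring matches.
    aligned_q, aligned_t = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and query[i - 1] == target[j - 1]
                and dp[i][j] == dp[i - 1][j - 1]):
            aligned_q.append(query[i - 1])
            aligned_t.append(target[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            aligned_q.append(query[i - 1])  # gap in target
            aligned_t.append("-")
            i -= 1
        else:
            aligned_q.append("-")           # gap in query
            aligned_t.append(target[j - 1])
            j -= 1

    return dp[n][m], "".join(reversed(aligned_q)), "".join(reversed(aligned_t))


if __name__ == "__main__":
    gaps, q_row, t_row = indel_only_align("TCGTGTCCCAGCT", "TCGTTATTGCT")
    print(gaps)
    print(q_row)
    print(t_row)
```

For the two sequences in the question this yields 6 gap columns, one more than the edit distance of 5 implied by Edlib's CIGAR (1I + 3X + 1I), because every position that was a mismatch now has to be covered by gaps instead.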

moold commented 6 years ago

Ok, thank you.