veg / tn93

TN93 fast distance calculator
MIT License
15 stars 7 forks source link

Improperly specified branch making all match modes (except GAPMM) equivalent #13

Closed AABoyles closed 6 years ago

AABoyles commented 6 years ago

I was looking over your source while doing some optimizations on my Javascript port. My version had a branch that caused all match modes except GAPMM to return the same value, regardless of the presence or distribution of gaps. I believe yours has it too. I'm sorry I don't have the resources to patch it myself, but if true, this could be a critical bug and I thought you should know about it.

spond commented 6 years ago

Dear @AABoyles,

Thanks for looking at our code and taking the time to bring a possible issue to our attention.

This branch is only going to be taken only if either characters being compared is a gap if (c1 == GAP || c2 == GAP).

In that case the desired behavior for all modes except GAPMM is to simply skip over this pair (GAPMM will skip only over pairs that are BOTH gaps). It is also worth noting that the enclosing statement (if (sequence_descriptor1 && sequence_descriptor2 && matchMode != GAPMM)) is only going to be taken if the mode is GAPMM or (in very rare cases), the sequences are not represented in their slightly compressed form.

In other words, I think the code (at least the C++ version, not sure about JS) is behaving correctly.

Best, Sergei

AABoyles commented 6 years ago

Ah, I see. Thank you for elaborating, and I'm sorry to have bothered you!