ezorita / seeq

DNA/RNA pattern matching algorithm
GNU General Public License v3.0
21 stars 5 forks source link

Improve alignment update algorithm. #10

Open ezorita opened 9 years ago

ezorita commented 9 years ago

This could be done much faster using a precomputed table of state transitions. Depending on what is the old i-th differential value, the updated i-th differential value, the old i+1-th differential value and whether the i+1-th position is in match or mismatch. This is a graph with 5 states (old transition and updated transition): [old]-[updated] up-same same-same same-down down-same down-down

Each state has six possible transitions: [match/mismatch]-[old i+1-th differential value] match-up match-same (...) mismatch-down which yields an emission (the updated i+1-th differential value) and connects to one of the 5 states. This way a pattern of length m could be updated in m operations without reverting the differential encoding.