Meet in the middle - Githubissues

RagnarGrootKoerkamp commented 2 years ago

For Dijkstra, a common technique to reduce explored states is to use meet in the middle, where you search both from the start and end and stop as you expand a state from both ends.

WFA is basically just a more efficient implementation of Dijkstra, since it also expands in order of increasing g (distance from start). Thus, meet in the middle should just work here.

All you'd need to do is run is from both sides to the middle of the string (or alternate and run both sides to equal score). It shouldn't be hard to detect when the two searches overlap. Probably a check that the M from the start and M from the end on the same diagonal add to more than the sequence length is sufficient.

The benefit here is that now the runtime is 2*(s/2)^2 = s^2/2, so around twice as fast as the original,. assuming the implementation doesn't slow down too much.

(For non-global alignment this may be more tricky. Not sure yet.)

I don't know if it's worth the effort right now and how much a 2x speedup matters compared to other optimizations you may be thinking of, but anyway at some point this could be nice.

smarco commented 2 years ago

:)

You just spoiled the surprise. Not only you get better execution time, but you can hit a better lower memory bound O(s).

Let me complete this comment later. We can continue this via email.

RagnarGrootKoerkamp commented 2 years ago

Ha cool! I see where you're getting the linear memory from ;) Good stuff 👍

smarco commented 2 years ago

FYI, related discussion moved to https://github.com/jeizenga/wfalm/issues/4

smarco / WFA2-lib

Meet in the middle #8