Daniel-Liu-c0deb0t / block-aligner

SIMD-accelerated library for computing global and X-drop affine gap penalty sequence-to-sequence or sequence-to-profile alignments using an adaptive block-based algorithm.
https://crates.io/crates/block_aligner
MIT License
124 stars 7 forks source link

Thoughts on relation to / comparison with WFA #13

Open rob-p opened 2 years ago

rob-p commented 2 years ago

The WFA (and related BiWFA) algorithm have provided some really nice asymptotic and practical advances for pairwise alignment under a set of realistic scoring functions. Do you have any thoughts on the relation of the block alignment idea with WFA? Could ideas from these two approaches be combined fruitfully? How would we expect their performance to compare?

Daniel-Liu-c0deb0t commented 1 year ago

Very late response (sorry!), but block aligner fills the niche of aligning sequences with complex scoring matrices (eg., amino acid or PSSMs). For these tasks, it's difficult for WFA and related approaches to work well. Also, it seems that WFA's performance falls off quite a bit with high error rates. I recently saw a 3rd party benchmark here: https://github.com/rchikhi/rust-alignbench. I think generally speaking, for sufficiently high error rates, optimized SIMD DP seems to win over careful pruning.

I think there's room for something of a mix between WFA for guiding the alignment, and optimized DP for computing small blocks within the larger DP matrix, kinda like wfmash/wflign. Block aligner currently uses heuristics for guiding the placement of blocks within the DP matrix, which seems to work (surprisingly) well in practice, but obviously lacks any guarantees, which is probably a problem for aligning difficult regions (repeats, large gaps, etc.). Currently, this issue is only able to be addressed with using a larger block size (slower).