Refinements of the WFA alignment algorithm with better complexity
This repository contains implementations of the WFA algorithm as well as a few variations on it.
The three memory-reducing variations are eminently practical, but the suffix tree algorithm has large constants that make it take more time in the majority of common use cases.
Because many users will not need the suffix tree algorithm, the library is implemented in two headers. The non-suffix tree algorithm is implemented in wfa_lm.hpp
. For environments with modern x86 processors, this header is entirely self-contained and depends on nothing except the C++14 standard library. Simply copy wfa_lm.hpp
into the include directory of your project. You may need to add -msse4.1
to your compiler invocation.
The library is usable in environments with non-x86 processors (such as many Apple notebooks), but it depends on SIMDE for portability. SIMDE is a header-only library as well. To use it, first check out the SIMDE submodule:
git submodule update --init --recursive
Next, copy wfa_lm.hpp
and simde/simde/
into the include directory of your project.
The suffix tree algorithm is implemented in wfa_lm_st.hpp
. It is also single-header, but it depends on SDSL v3 for its implementation of a suffix tree. To use it, check out the submodules as described above and copy the headers in external/sdsl-lite/include
to your project's include directory.
Eizenga, JM, and Paten, B (2022) Improving the time and space complexity of the WFA algorithm and generalizing its scoring. bioRxiv. DOI 10.1101/2022.01.12.476087