Martinsos / edlib

Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
http://martinsos.github.io/edlib
MIT License

Generic Edlib #147

Closed. mobinasri closed this 4 years ago

mobinasri commented 4 years ago

This is the first PR on the path to modifying edlib to support generic sequences. Here is a brief summary of what has changed:

  1. The constructor of EqualityDefinition is commented out, and its method areEqual() now returns the simple equality (==) of the two given symbols. As you said, we can discuss later how to generalize this feature for generic alphabets.

  2. The library's main function, edlibAlign, is replaced by a function template. To call it, the symbol type of the input sequences has to be specified; this type parameter is named AlphabetType. The type used for indexing symbols is changed to uint32_t, which limits the alphabet size (across the two given sequences) to about 4 billion (2^32). We don't expect to see such a large alphabet, so this should work for most cases.

  3. "runTests.cpp" is modified: it now runs all the previous character-based tests to make sure that everything still works at least for those cases. (Only one test fails, because it uses additionalEqualities.)
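
To illustrate points 1 and 2, here is a minimal sketch of the idea, not the code from this PR. It assumes the symbol type is hashable with std::hash and comparable with ==. The names toIndices and editDistance are hypothetical, and the distance is computed with a plain dynamic-programming recurrence rather than edlib's actual algorithm. Only AlphabetType, areEqual and the uint32_t index type are taken from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified equality as described in point 1: plain operator==.
template <typename AlphabetType>
bool areEqual(const AlphabetType& a, const AlphabetType& b) {
    return a == b;
}

// Hypothetical helper: maps every distinct symbol to a dense uint32_t index.
// The alphabet map is shared between both sequences, so equal symbols in
// either sequence receive the same index.
template <typename AlphabetType>
std::vector<uint32_t> toIndices(const std::vector<AlphabetType>& seq,
                                std::unordered_map<AlphabetType, uint32_t>& alphabet) {
    std::vector<uint32_t> out;
    out.reserve(seq.size());
    for (const AlphabetType& symbol : seq) {
        auto it = alphabet.find(symbol);
        if (it == alphabet.end()) {
            // uint32_t indices cap the alphabet at 2^32 distinct symbols.
            assert(alphabet.size() < std::numeric_limits<uint32_t>::max());
            it = alphabet.emplace(symbol, static_cast<uint32_t>(alphabet.size())).first;
        }
        out.push_back(it->second);
    }
    return out;
}

// Hypothetical generic entry point: Levenshtein distance via two-row DP.
// This is a stand-in for illustration, not edlib's implementation.
template <typename AlphabetType>
int editDistance(const std::vector<AlphabetType>& query,
                 const std::vector<AlphabetType>& target) {
    std::unordered_map<AlphabetType, uint32_t> alphabet;
    const std::vector<uint32_t> q = toIndices(query, alphabet);
    const std::vector<uint32_t> t = toIndices(target, alphabet);

    std::vector<int> prev(t.size() + 1), curr(t.size() + 1);
    for (std::size_t j = 0; j <= t.size(); ++j) prev[j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= q.size(); ++i) {
        curr[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= t.size(); ++j) {
            const int cost = areEqual(q[i - 1], t[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // match or substitution
        }
        std::swap(prev, curr);
    }
    return prev[t.size()];
}
```

With this sketch, a call such as editDistance<int>({1, 2, 3}, {1, 3}) works on integer symbols just as the character-based tests work on char, which is the kind of generality the template change is aiming for.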

Martinsos commented 4 years ago

Closing this one since #148 replaced it.