Martinsos / edlib

Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
http://martinsos.github.io/edlib
MIT License

Speed up calculation of Peq #13

Open Martinsos opened 10 years ago

Martinsos commented 10 years ago

Although calculating Peq normally takes a negligible amount of time compared to the DP calculation, in some cases, such as when using NW for very similar proteins, it takes about half of the execution time! It would be interesting to speed it up in all cases, or at least in those.

Martinsos commented 10 years ago

One idea: precalculate small portions of Peq. For example, if we have an alphabet of size 10, we can precalculate Peq for all portions of length 6; there are 10^6 of them, so the table takes about 1 MB at one byte per entry. Then, when building Peq for one 64-bit word, we do not have to calculate it bit by bit but can insert 6 bits at once, which would speed up Peq calculation by up to about 6 times! However, I am no longer sure how much time Peq construction takes, or whether it really takes half of the execution time.
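
A minimal sketch of the lookup-table idea, to make it concrete. It is not edlib's code: every name here (`buildPeqBitByBit`, `buildChunkTable`, `buildPeqWithTable`, the `k...` constants) is hypothetical. It assumes symbols are the integers 0..9, a query of at most 64 characters (one word), and that bit i of `peq[s]` is set exactly when `query[i] == s`. The table stores one byte per (portion, symbol) pair, so this particular layout takes about 10 MB rather than the 1 MB estimated above. Nothing is timed here, so any speedup is still a guess.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parameters for illustration; none of these names come from edlib.
constexpr int kAlphabetSize = 10;    // symbols are the integers 0..9
constexpr int kChunkLen = 6;         // query characters covered by one lookup
constexpr int kNumChunks = 1000000;  // kAlphabetSize ^ kChunkLen

// Peq for a query of at most 64 characters: bit i of peq[s] is set
// exactly when query[i] == s.
using Peq = std::array<std::uint64_t, kAlphabetSize>;

// Baseline: test every (symbol, position) pair and set one bit at a time.
Peq buildPeqBitByBit(const std::vector<std::uint8_t>& query) {
    assert(query.size() <= 64);
    Peq peq{};
    for (int s = 0; s < kAlphabetSize; ++s)
        for (std::size_t i = 0; i < query.size(); ++i)
            if (query[i] == s) peq[s] |= std::uint64_t{1} << i;
    return peq;
}

// table[code * kAlphabetSize + s] holds the kChunkLen-bit mask of symbol s
// inside the chunk whose characters are the base-10 digits of `code`
// (least significant digit = first character of the chunk).
std::vector<std::uint8_t> buildChunkTable() {
    std::vector<std::uint8_t> table(
        static_cast<std::size_t>(kNumChunks) * kAlphabetSize, 0);
    for (int code = 0; code < kNumChunks; ++code) {
        int rest = code;
        for (int j = 0; j < kChunkLen; ++j) {
            int symbol = rest % kAlphabetSize;
            rest /= kAlphabetSize;
            table[static_cast<std::size_t>(code) * kAlphabetSize + symbol] |=
                static_cast<std::uint8_t>(1u << j);
        }
    }
    return table;
}

// Table-based build: one lookup per symbol inserts kChunkLen bits at once.
// Assumes every query symbol is in 0..kAlphabetSize-1.
Peq buildPeqWithTable(const std::vector<std::uint8_t>& query,
                      const std::vector<std::uint8_t>& table) {
    assert(query.size() <= 64);
    Peq peq{};
    std::size_t i = 0;
    for (; i + kChunkLen <= query.size(); i += kChunkLen) {
        // Encode the chunk as a table index, first character least significant.
        std::size_t code = 0;
        for (int j = kChunkLen - 1; j >= 0; --j)
            code = code * kAlphabetSize + query[i + j];
        for (int s = 0; s < kAlphabetSize; ++s)
            peq[s] |= std::uint64_t{table[code * kAlphabetSize + s]} << i;
    }
    // Leftover characters (fewer than kChunkLen) are inserted bit by bit.
    for (; i < query.size(); ++i)
        peq[query[i]] |= std::uint64_t{1} << i;
    return peq;
}
```

The table would be built once with `buildChunkTable()` and reused for every query. With 64 characters per word, ten lookups cover 60 of them and the last 4 fall back to the bit-by-bit path. Whether this beats the baseline in practice probably depends on how well the table fits in cache, which would have to be measured.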