matthieugomez / StringDistances.jl

String Distances in Julia
Other
134 stars 18 forks source link

Bit parallel Levenshtein distance #24

Open lesshaste opened 4 years ago

lesshaste commented 4 years ago

I was wondering whether this library uses the bit parallel speed up tricks from:

https://www.win.tue.nl/~jfg/educ/bit.mat.pdf https://users.dcc.uchile.cl/~gnavarro/ps/jea06.pdf

They seem very worthwhile.

matthieugomez commented 4 years ago

Not at all! Feel free to try a pull request.

lesshaste commented 4 years ago

I am not sure I am competent to code it well in Julia (first I would have to learn Julia :) ). However this series of answers https://codegolf.stackexchange.com/a/197716/9207 seems to contain optimized code in Java, Python and Rust. For example:

@jit(nopython=True, fastmath=True, nogil=True)
def LevenshteinDistance(n, p, t):
        np=~p
        HMASK = (1 << (n - 1))
        VP = (1 << n) - 1
        VN = 0
        score = n
        for j in range(0,n):
            if (t & (1<<j)) != 0:
                Bj = p
            else:
                Bj = np
            D0 = ((VP + (Bj & VP)) ^ VP) | Bj | VN
            HN = VP & D0
            HP = VN | ~(VP | D0)

            if ((HP & HMASK) != 0):
             score += 1;
            elif ((HN & HMASK) != 0):
             score -= 1;
            X = (HP << 1) | 1
            VN = X & D0
            VP = (HN << 1) | ~(X | D0)
        return score

Maybe one of those implementations can be translated directly into Julia? Of course a Julia answer to that question would also be awesome!