Entomy / LibLangly

The combined Langly runtime
https://entomy.github.io/LibLangly/
33 stars 7 forks source link

Damerau-Levenshtein #152

Open Entomy opened 4 years ago

Entomy commented 4 years ago

Damerau-Levenshtein is an extension of the Levenshtein edit-distance algorithm to additionally support basic transposition. As such, it gives noticeably better results, and should be implemented and preferentially used.

Entomy commented 4 years ago

I'm copying these over from #153

Damerau-Levenshtein edit-distance is probably the best compromise between accuracy and performance for general purpose use (other edit-distance algorithms should still be implemented). However there's two pitfalls when finding a good implementation for this. As a result, lifting an existing implementation isn't likely.

  1. The overwhelming majority of implementations actually implement Optimal String Alignment, which doesn't return the same results. OSA is not the same algorithm, although it is very similar.

  2. The few that implement DL implement it incorrectly. I've emailed the authors about this, and have received no response within a week, so I'm not particularly hopeful about that getting fixed.

So pretty par-for-the-norm with Comp. Sci. stuff.

These may be useful as far as a good implementation goes.

Hyyrö, H. A Bit-Vector Algorithm for Computing Levenshtein and Damerau Edit Distances. Nord. J. Comput. 2002.

Balhaf, K. Alsmirat, M. et al. Accelerating Levenshtein and Damerau edit distance algorithms using GPU with unified memory. ICICS. 2017.