Open tang-hi opened 2 years ago
It would be nice to see some benchmarks comparing them.
I have implemented the Damerau–LevenshteinDistance(Optimal string alignment distance), and do some benchmark on them. It seems like Damerau–LevenshteinDistance is 2x slower than LevenshteinDistance.Do you think it's an acceptable change? If the performance is ok, I will clean the code and raise a pr
Thinking about this a bit more, I'm reluctant to replace the current implementation with something twice as slow with vague real-world advantages. I think a better approach would be to make the edit distance function pluggable (with the current LevenshteinDistance
function as a default).
I will investigate how this can be done without breaking changes (hopefully). And afterwards maybe you can contribute a DamerauLevenshteinDistance
function? What do you think?
Thinking about this a bit more, I'm reluctant to replace the current implementation with something twice as slow with vague real-world advantages. I think a better approach would be to make the edit distance function pluggable (with the current
LevenshteinDistance
function as a default).I will investigate how this can be done without breaking changes (hopefully). And afterwards maybe you can contribute a
DamerauLevenshteinDistance
function? What do you think?
I think it's good idea.I'm looking forward that
Hi, Damerau–LevenshteinDistance is the minimum number of edit operations ( insertions, deletions ,substitutions and transposition) required to change one word to another.It has one more action(transposition) compared to LevenshteinDistance. I think Damerau–LevenshteinDistance is more practical than LevenshteinDistance. And it's time complexity is also O(N^2), but unfortunately it's space complexity is O(N^2) while LevenshteinDistance's space complexity is O(N)
Do you think it's an acceptable change? If so, I would like to raise a PR to change LevenshteinDistance to Damerau–Levenshtein distance.