GlobalNamesArchitecture / damerau-levenshtein

Calculates edit distance using Damerau-Levenshtein algorithm
MIT License

Weighted Levenshtein #18

Closed: prebours closed this issue 3 years ago

prebours commented 4 years ago

Most existing Levenshtein libraries are not very flexible: every edit operation has a fixed cost of 1.

However, sometimes not all edits are created equal. For instance, if you are doing OCR correction, maybe substituting '0' for 'O' should have a smaller cost than substituting 'X' for 'O'. If you are doing human typo correction, maybe substituting 'X' for 'Z' should have a smaller cost, since they are located next to each other on a QWERTY keyboard.
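To make the idea concrete, here is a minimal sketch in Ruby of a Levenshtein distance with per-pair substitution weights. It is not part of this gem: the method name `weighted_levenshtein`, the `substitution_costs` hash, and the `default_cost` parameter are all made up for illustration. It also implements plain Levenshtein (no transpositions), so it is not a weighted Damerau-Levenshtein.

```ruby
# Hypothetical helper, NOT part of the damerau-levenshtein gem.
# substitution_costs maps [from_char, to_char] pairs to a cost; pairs that are
# missing fall back to default_cost. Insertions and deletions always cost
# default_cost. Transpositions are not handled (plain Levenshtein only).
def weighted_levenshtein(source, target, substitution_costs = {}, default_cost = 1.0)
  s = source.chars
  t = target.chars

  # previous[j] is the cost of turning the processed prefix of s
  # into the first j characters of t.
  previous = (0..t.length).map { |j| j * default_cost }

  s.each_with_index do |s_char, i|
    current = [(i + 1) * default_cost]
    t.each_with_index do |t_char, j|
      substitution =
        if s_char == t_char
          0.0
        else
          substitution_costs.fetch([s_char, t_char], default_cost)
        end
      current << [
        previous[j + 1] + default_cost, # delete s_char
        current[j] + default_cost,      # insert t_char
        previous[j] + substitution      # substitute s_char with t_char (or match)
      ].min
    end
    previous = current
  end

  previous.last
end

# OCR-style weights: confusing '0' and 'O' is cheap, everything else costs 1.
ocr_costs = { ["0", "O"] => 0.25, ["O", "0"] => 0.25 }
puts weighted_levenshtein("C0DE", "CODE", ocr_costs) # prints 0.25
puts weighted_levenshtein("CXDE", "CODE", ocr_costs) # prints 1.0
```

A keyboard-distance table for typo correction would plug in the same way, as a different `substitution_costs` hash. Keying the hash on ordered pairs means asymmetric costs are possible, at the price of having to list both directions when the cost is symmetric.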

There is an implementation in Python here. It would be great to support a way to add weights.

Is it something you would consider?

dimus commented 3 years ago

It would be useful functionality, but I guess it should be a different gem, because the name of this gem pretty much defines its scope. I hope someone will make such a library in Ruby.