jamesturk / jellyfish

🪼 a python library for doing approximate and phonetic matching of strings.
https://jamesturk.github.io/jellyfish/
MIT License

Similarity functions for damerau_levenshtein, levenshtein and hamming #60

Closed J535D165 closed 7 years ago

J535D165 commented 8 years ago

Hello Jamesturk,

What are your thoughts on adding similarity functions for damerau_levenshtein, levenshtein and hamming? Currently, only distance functions are available for these algorithms. I think most applications use similarity functions.

What about: damerau_levenshtein_similarity, levenshtein_similarity and hamming_similarity?

This would be comparable to the R package stringdist: https://github.com/markvanderloo/stringdist.

I can write the Python versions if you like. The C versions do not look that hard either.

Kind regards, Jonathan

jamesturk commented 8 years ago

Is there an explanation somewhere of how those similarities are used/calculated? I'm not super familiar w/ R but it looks like they're doing (score / max_score)?
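
For reference, one common way to turn an edit distance into a similarity is to divide by the largest possible distance and subtract from 1, which appears to match the (score / max_score) reading above. The sketch below assumes the maximum distance for Levenshtein-style and Hamming-style metrics is the length of the longer string. The helper name normalized_similarity is hypothetical and not part of jellyfish, and the local levenshtein_distance is a pure-Python stand-in so the example runs without jellyfish installed.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Stand-in edit distance (insert, delete, substitute), not jellyfish's code."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_similarity(distance_fn, a: str, b: str) -> float:
    """Hypothetical helper: 1 - distance / max_distance, in the range [0, 1].

    Assumes the largest possible distance is the length of the longer string.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - distance_fn(a, b) / longest


if __name__ == "__main__":
    print(normalized_similarity(levenshtein_distance, "kitten", "sitting"))
```

Because the helper takes the distance function as an argument, the same wrapper could in principle be applied to jellyfish's existing levenshtein_distance, damerau_levenshtein_distance and hamming_distance, provided the longer-string bound holds for each.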

jamesturk commented 7 years ago

Going to close this out; feel free to reopen if there's still interest or progress.