luozhouyang / python-string-similarity

A library implementing different string similarity and distance measures using Python.
MIT License
991 stars 127 forks

Convert distance to probability #6

Closed jtlz2 closed 5 years ago

jtlz2 commented 5 years ago

This is a broader question, but do you have any insight on how to convert, e.g., Levenshtein distance to a probability? I want to combine the edit distance with prior information on the background population, and it's not clear to me how to combine the two metrics. Thanks!

sangmoon commented 5 years ago

Hi, I am a GitHub traveler and saw your question by chance. I became curious too and looked it up on SO.

https://stackoverflow.com/questions/955110/similarity-string-comparison-in-java/16018452#16018452

It says there are several ways to convert edit distance to a probability. A common way (used in many libraries) is to measure how much (in %) you'd have to change the longer string to turn it into the shorter one.
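A minimal sketch of that idea in plain Python, in case it helps. The function names `levenshtein` and `similarity` are my own for illustration and are not this library's API. Note that the result is a normalized score in [0, 1], not a calibrated probability, so combining it with a prior would still need some modelling assumption on your side.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Fraction of the longer string that does NOT need to change."""
    longer = max(len(a), len(b))
    if longer == 0:
        return 1.0  # two empty strings are treated as identical
    return (longer - levenshtein(a, b)) / longer


if __name__ == "__main__":
    print(similarity("kitten", "sitting"))
```

Here "kitten" and "sitting" are 3 edits apart and the longer string has 7 characters, so the score is (7 - 3) / 7.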

I hope this helps!

luozhouyang commented 5 years ago

Thanks @sangmoon. @jtlz2, you can give it a try.