Open Talon24 opened 5 years ago
I have also run into this inconsistency. It seems like they should use the same cost.
I am just a maintainer, not the original author. But please see the discussion here: https://stackoverflow.com/questions/14260126/how-python-levenshtein-ratio-is-computed
`ratio` is based on the InDel distance (only insertions and deletions are allowed), while `distance` is based on the uniform Levenshtein distance. I suppose this was done so that the results of `ratio` are closer to the results of difflib's ratio function, while `distance` still allows the use of the normal uniform Levenshtein distance. I agree that this can be surprising, and the documentation should probably include a note on this difference in behavior.
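To make the difference concrete, here is a minimal pure-Python sketch. It is not the library's implementation: `edit_distance` and `ratio_from` are hypothetical helper names, and the normalization `(lensum - dist) / lensum` is assumed from the linked Stack Overflow discussion. Setting the substitution cost to 2 makes a replace as expensive as a delete plus an insert, which gives the InDel distance.

```python
def edit_distance(a, b, sub_cost=1):
    """Edit distance with unit insert/delete cost and a configurable
    substitution cost (hypothetical helper, not part of the library).
    sub_cost=1 is the uniform Levenshtein distance; sub_cost=2 is the
    InDel distance."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                               # delete ca
                cur[j - 1] + 1,                            # insert cb
                prev[j - 1] + (0 if ca == cb else sub_cost),  # match or replace
            ))
        prev = cur
    return prev[-1]


def ratio_from(dist, a, b):
    """Normalize a distance by the summed lengths (assumed formula)."""
    lensum = len(a) + len(b)
    return (lensum - dist) / lensum if lensum else 1.0


a, b = "ab", "ac"
by_hand = ratio_from(edit_distance(a, b, sub_cost=1), a, b)      # 0.75
indel_based = ratio_from(edit_distance(a, b, sub_cost=2), a, b)  # 0.5
print(by_hand, indel_based)
```

Under these assumptions, a ratio computed by hand from the uniform distance (0.75) does not match the InDel-based one (0.5), which is the kind of mismatch reported in this issue.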
Calling Levenshtein.ratio generates a different result than calculating the ratio by hand. Shown in this example
After reading through the code, I noticed that you call `levenshtein_common` for the ratio and increase the cost of the replace operation. Is there a special reason why the functions should calculate this differently?