jamesturk / jellyfish

🪼 a python library for doing approximate and phonetic matching of strings.
https://jamesturk.github.io/jellyfish/
MIT License

damerau_levenshtein_distance fails occasionally #25

Closed by scrimshander 10 years ago

scrimshander commented 10 years ago

I'm getting an index out of range error on one specific pair of strings, and it is hard to reproduce with other pairs. I tried briefly to debug it, but I couldn't wrap my head around the algorithm quickly enough. The same call works in v0.2.2.

damerau_levenshtein_distance('cape sand recycling ', 'edith ann graham') --> list index out of range error, line 34, in _levenshtein_distance

Note the space at the end of string #1. If the strings are reversed, no error is thrown.
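
For cross-checking on an affected version, here is a minimal pure-Python sketch of the textbook unrestricted Damerau-Levenshtein distance. This is not jellyfish's implementation, and the name `damerau_levenshtein_reference` is hypothetical. It tracks the last-seen row for each character in a dict rather than a fixed-size table, so no index depends on the characters in the input.

```python
def damerau_levenshtein_reference(s1, s2):
    """Unrestricted Damerau-Levenshtein distance (adjacent transpositions allowed).

    Hypothetical reference helper for cross-checking; not part of jellyfish.
    """
    len1, len2 = len(s1), len(s2)
    infinite = len1 + len2  # larger than any achievable distance

    # da maps a character to the last row of s1 (1-based) where it was seen.
    da = {}

    # (len1 + 2) x (len2 + 2) matrix with a sentinel border of `infinite`.
    score = [[0] * (len2 + 2) for _ in range(len1 + 2)]
    score[0][0] = infinite
    for i in range(len1 + 1):
        score[i + 1][0] = infinite
        score[i + 1][1] = i
    for j in range(len2 + 1):
        score[0][j + 1] = infinite
        score[1][j + 1] = j

    for i in range(1, len1 + 1):
        db = 0  # last column in this row where s1[i - 1] matched s2
        for j in range(1, len2 + 1):
            i1 = da.get(s2[j - 1], 0)
            j1 = db
            cost = 1
            if s1[i - 1] == s2[j - 1]:
                cost = 0
                db = j
            score[i + 1][j + 1] = min(
                score[i][j] + cost,      # substitution (or match)
                score[i + 1][j] + 1,     # insertion
                score[i][j + 1] + 1,     # deletion
                # transposition, plus edits between the swapped characters
                score[i1][j1] + (i - i1 - 1) + 1 + (j - j1 - 1),
            )
        da[s1[i - 1]] = i

    return score[len1 + 1][len2 + 1]
```

Calling `damerau_levenshtein_reference('cape sand recycling ', 'edith ann graham')` should return an integer without raising, and swapping the arguments should give the same value, since the distance is symmetric.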

jamesturk commented 10 years ago

Thanks for this, I'll try to look into it ASAP.

On Tue, Aug 5, 2014 at 5:16 PM, scrimshander wrote:

I'm getting an index out of range error on a specific combination of two strings that is hard to reproduce with other string pairs. I've tried quickly to debug it, but I just can't wrap my head around the algorithm quickly enough. This works in v0.2.2.

damerau_levenshtein_distance('cape sand recycling ', 'edith ann graham') --> list index out of range error, line 34, in _levenshtein_distance

Note the space at the end of string #1. If the strings are reversed, no error is thrown.


jamesturk commented 10 years ago

This is fixed in source; sorry for the long delay.