Closed grudzien closed 7 years ago
After a bit of debugging it is hitting the max_distance check in the C code. I am trying to figure out why.
-- edit I tried to set max_distance manually to 15 (it is usually set to 10). When I set it to 15 all checks complete correctly. I am currently investigating why max_distance gets set to 10 and why current_distance goes to 11 on my examples, causing the code to return erroneously.
-- edit my fault. I didn't read the docs closely enough.
I was just looking at this module and I noticed that for larger strings there appears to be an issue calculating the dl distance. For example:
DamerauLevenshtein.distance("aaliyahmarie", "rvolnwkxcqefqbthwxfrskwchnmqoibgbvosrdyyuswdyqtiez") => 11
This is incorrect as shows on: http://fuzzy-string.com/Compare/Transform.aspx?r=aaliyahmarie&q=rvolnwkxcqefqbthwxfrskwchnmqoibgbvosrdyyuswdyqtiez
It's supposed to be 44.
Here are some other examples: DamerauLevenshtein.distance("aaliyahmarie", "mfhdfvbfykinhnhkhcubdjjqgjdcaazjvuijpdodsulveprmmh") => 11 (should be 45) DamerauLevenshtein.distance("aaliyahmarie", "hxswuwpeahyulhwnjxsdtopawjqbcytjkrszhfonapovxyyazr") => 11 (should be 45)
Here is an example where "aaliyahmarie" doesn't break at long strings: DamerauLevenshtein.distance("aaliyahmarie", "ukcmvhgxewuliefeinkftyjbqgvymzzzqmlzeuwjjquxvrtjic") => 44 (Correct)
Another example with a different first string: DamerauLevenshtein.distance("constantine", "qzmwsgylrorhimtpsmbesmzerbkiteyhityhkmimtvysdhuize") => 11 (Should be 44)
Not sure why all of the wrong answers are 11, maybe it is a coincidence from the string size I picked for the second string.
I was able to replicate this bug across multiple machines (Both Mac)