rljacobson / Levenshtein

A Blazingly Fast Damerau–Levenshtein Edit Distance Function (UDF) for MySQL
MIT License

Wrong calculation (not Damerau) #6

Closed: CoE2013 closed this issue 2 years ago

CoE2013 commented 4 years ago

I just spent some hours getting it working, and then the first test failed. It looks like the logic does not compute a real Damerau distance, e.g.:

[image]

Another, correct implementation shows the distance correctly: [image]

rljacobson commented 4 years ago

Thanks, will check it out and add it to my test suite when I get a chance.

stiivo commented 4 years ago

Hi,

What are the chances of a fix for this issue in the near future?

Validark commented 3 years ago

I can't for the life of me get this test suite working. Could someone try running damlev("pension", "penitence")? I believe that this library gives an answer of 6, but the answer should be 5.
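For checking a case like this independently of the UDF, a small reference implementation can help. The following is a minimal sketch of the textbook unrestricted Damerau-Levenshtein distance (insertions, deletions, substitutions, and adjacent transpositions). The name `damerau_levenshtein` is hypothetical and this is not the library's code; it favors clarity over speed.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Unrestricted Damerau-Levenshtein distance (textbook DP, O(len(a)*len(b)))."""
    max_dist = len(a) + len(b)
    # Table is shifted by one row and one column so that a sentinel border
    # holding max_dist makes impossible transpositions lose every comparison.
    d = [[0] * (len(b) + 2) for _ in range(len(a) + 2)]
    d[0][0] = max_dist
    for i in range(len(a) + 1):
        d[i + 1][0] = max_dist
        d[i + 1][1] = i          # delete i characters of a
    for j in range(len(b) + 1):
        d[0][j + 1] = max_dist
        d[1][j + 1] = j          # insert j characters of b

    last_row = {}                # last 1-based index in a where each character occurred
    for i in range(1, len(a) + 1):
        last_match_col = 0       # last 1-based column in this row where a[i-1] == b[j-1]
        for j in range(1, len(b) + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_match_col
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_match_col = j
            else:
                cost = 1
            d[i + 1][j + 1] = min(
                d[i][j] + cost,                           # substitution or match
                d[i + 1][j] + 1,                          # insertion
                d[i][j + 1] + 1,                          # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),  # transposition
            )
        last_row[a[i - 1]] = i
    return d[len(a) + 1][len(b) + 1]


print(damerau_levenshtein("pension", "penitence"))  # 5
```

This sketch also returns 2 for "ca" vs "abc", where the restricted (optimal string alignment) variant returns 3, so that pair is a quick way to tell the two variants apart.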

sjlevy commented 2 years ago

I just tried using this implementation, and damlev("pension", "penitence") does correctly return 5.

If you are having trouble compiling, try removing everything under ### Testing and Benchmarking ### in CMakeLists.txt. That way you don't have to download the large Boost library (Boost isn't clearly listed as a requirement).

arhyneRWU commented 2 years ago

We found the issues and corrected them. @sjlevy, the damlev in the old version worked for most words, but a few issues with the trimming and the resetting of the vector size caused errors in some cases.

Once @rljacobson accepts my pull request, the issues with the calculations will be fixed.
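To illustrate the kind of trimming step mentioned above: edit-distance routines commonly strip the shared prefix and suffix of the two strings before filling the table. The sketch below assumes that is what "trimming" refers to here; `trim_common_affixes` is a hypothetical helper, not a function from this library. The easy mistake is letting the suffix scan run back into the already-trimmed prefix.

```python
def trim_common_affixes(a: str, b: str):
    """Return (a, b) with their shared prefix and shared suffix removed."""
    # Scan forward over the common prefix.
    start = 0
    limit = min(len(a), len(b))
    while start < limit and a[start] == b[start]:
        start += 1

    # Scan backward over the common suffix, but never past the prefix:
    # without the "> start" guards, inputs like "aa" vs "a" would count
    # the same character as both prefix and suffix.
    end_a, end_b = len(a), len(b)
    while end_a > start and end_b > start and a[end_a - 1] == b[end_b - 1]:
        end_a -= 1
        end_b -= 1

    return a[start:end_a], b[start:end_b]


print(trim_common_affixes("pension", "penitence"))  # ('sion', 'itence')
```

Any buffer sized from the string lengths has to be sized from the trimmed lengths rather than the original ones, which is one place where trimming and buffer resizing can get out of step.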

rljacobson commented 2 years ago

Thanks to a herculean intellectual effort by @arhyneRWU, we think that we sorted out all the bugs. We also discovered bugs in some online calculators, so be careful comparing results across websites of unknown accuracy. Find a tool or library you know you can trust.

@CoE2013, @stiivo, @sjlevy, and @Validark: Thanks for the reports. This is why peer review is so important. If you still have the energy and the will, try it again and see if all your problems are fixed.