datatonic / duke

Automatically exported from code.google.com/p/duke
0 stars 0 forks source link

Optimize Levenshtein comparison #67

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
We can in many cases shortcut the Levenshtein comparison by working out what 
the maximum allowed edit distance is before we simply consider the strings 
different. If the length difference between the strings is greater than this we 
can just return without doing any comparison at all. Similarly, while computing 
the matrix we can stop at any time if we reach this number.

Need to do some profiling first to get an estimate of what the possible 
performance gain might be.

Original issue reported on code.google.com by lar...@gmail.com on 1 Mar 2012 at 9:42

GoogleCodeExporter commented 8 years ago
Now done, and testing shows significant performance gain.

Original comment by lar...@gmail.com on 5 Mar 2012 at 8:08