daneran / duke

Automatically exported from code.google.com/p/duke
0 stars 0 forks source link

Weighted Levenshtein #44

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
We need to be able to treat some edits as larger than others. Particularly 
edits involving numbers are important.

Original issue reported on code.google.com by lar...@gmail.com on 1 Oct 2011 at 6:46

GoogleCodeExporter commented 8 years ago

Original comment by lar...@gmail.com on 5 Nov 2011 at 10:08

GoogleCodeExporter commented 8 years ago
This turns out to be a well-known variation on Levenshtein: 
http://nlp.stanford.edu/IR-book/html/htmledition/edit-distance-1.html

Original comment by lar...@gmail.com on 22 Nov 2011 at 8:04

GoogleCodeExporter commented 8 years ago
The Needleman-Wunsch algorithm is what we need: 
http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm 
http://alias-i.com/lingpipe/demos/tutorial/stringCompare/read-me.html

A good description of the algorithm is here: 
http://www.cs.toronto.edu/~brudno/csc2427/Lec7Notes.pdf

It seems to have high complexity, but we ignore that.

Original comment by lar...@gmail.com on 11 Mar 2012 at 12:29

GoogleCodeExporter commented 8 years ago
Implemented now.

Original comment by lar...@gmail.com on 25 Jul 2012 at 12:24