jeroendv / ComicUpdater

Comic Updater is a Firefox and Chrome extension that allows you to easily update your web comic bookmarks

improved matching algorithm #2

Closed jeroendv closed 11 years ago

jeroendv commented 11 years ago

Simply counting the number of matching characters, starting from the very first one, is not always satisfactory.

http://somesite.com/1234/HarryPotter/page1
http://somesite.com/1235/HarryPotter/page2

The match between these two URLs is much more than just the initial

http://somesite.com/123

What if some characters are added or deleted?

http://somesite.com/1234/HarryPotter/page1
http://www.somesite.com/12358/HarryPotter/page2

This would even fail, because fewer than 10 initial characters match, while it is obvious to any human that these two URLs are very similar.
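To make the problem concrete, here is a minimal sketch of prefix-only matching applied to the two URL pairs above. The function name `commonPrefixLength` and the constant `MIN_PREFIX` are hypothetical and not taken from the extension's code; the value 10 is only assumed from the threshold mentioned in this comment.

```javascript
// Hypothetical helper (not the extension's actual code):
// counts how many leading characters two strings share.
function commonPrefixLength(a, b) {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a[i] === b[i]) {
    i++;
  }
  return i;
}

// Assumed threshold, based on the "10 initial characters" mentioned above.
const MIN_PREFIX = 10;

// Same host, small changes in the path: the prefix stops at "http://somesite.com/123".
const samePrefix = commonPrefixLength(
  "http://somesite.com/1234/HarryPotter/page1",
  "http://somesite.com/1235/HarryPotter/page2"
); // 23

// Added "www." near the start: only "http://" is shared.
const shiftedPrefix = commonPrefixLength(
  "http://somesite.com/1234/HarryPotter/page1",
  "http://www.somesite.com/12358/HarryPotter/page2"
); // 7

console.log(samePrefix >= MIN_PREFIX);    // true
console.log(shiftedPrefix >= MIN_PREFIX); // false, although the URLs are clearly related
```

In both cases the prefix length ignores the long shared `/HarryPotter/page` part, which is why it undersells the similarity.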

/cc @ianhanniballake

jeroendv commented 11 years ago

The levenshteinDistance branch adds a fuzzyMatch algorithm that does exactly this.
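For readers unfamiliar with the idea, the following is a rough sketch of how a Levenshtein-based comparison could score the URLs from this issue. It is not the code from the branch: the function bodies and the `similarity` helper are assumptions written for illustration, using the standard dynamic-programming formulation of edit distance.

```javascript
// Illustrative only: standard Levenshtein distance, keeping one row at a time.
// Counts the minimum number of single-character insertions, deletions,
// and substitutions needed to turn `a` into `b`.
function levenshteinDistance(a, b) {
  // prev[j] = distance between the empty prefix of a and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // deleting the first i characters of a
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical normalization: 1 means identical, 0 means nothing in common.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  return 1 - levenshteinDistance(a, b) / longest;
}

const base = "http://somesite.com/1234/HarryPotter/page1";

// Two substituted characters ("4" -> "5" and "1" -> "2").
console.log(levenshteinDistance(base, "http://somesite.com/1235/HarryPotter/page2")); // 2

// Added "www." and an extra digit: the score stays high where prefix matching fails.
console.log(similarity(base, "http://www.somesite.com/12358/HarryPotter/page2"));
```

A score like this still needs a cutoff to decide what counts as "the same comic"; choosing that cutoff is a separate design decision that this sketch does not address.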