improved matching algorithm

Simply counting the number of matching characters from the very first character onward is not always satisfactory.

http://somesite.com/1234/HarryPotter/page1
http://somesite.com/1235/HarryPotter/page2

The match between these two URLs is much more than just the initial

http://somesite.com/123

And what if some characters are added or deleted?

http://somesite.com/1234/HarryPotter/page1
http://www.somesite.com/12358/HarryPotter/page2

This would even fail, because fewer than 10 initial characters match (only `http://`), while it is obvious to any human that these two URLs are very similar.
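To make the difference concrete, here is a minimal sketch in Python. It is not the project's code, just an illustration. It compares the prefix-only count with a similarity score that tolerates insertions and deletions. The function names `prefix_match_length` and `similarity` are made up for this example, and `difflib.SequenceMatcher` is only one possible choice of measure (an edit distance or longest common subsequence would be alternatives).

```python
from difflib import SequenceMatcher
from os.path import commonprefix


def prefix_match_length(a: str, b: str) -> int:
    """Number of identical characters counted from the very first one."""
    # commonprefix compares character by character, not by path segment.
    return len(commonprefix([a, b]))


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]; stays high when characters are added or deleted."""
    # autojunk=False disables difflib's popularity heuristic so that
    # every character is taken into account.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


old = "http://somesite.com/1234/HarryPotter/page1"
new_simple = "http://somesite.com/1235/HarryPotter/page2"
new_shifted = "http://www.somesite.com/12358/HarryPotter/page2"

print(prefix_match_length(old, new_simple))   # 23 ("http://somesite.com/123")
print(prefix_match_length(old, new_shifted))  # 7  ("http://")
print(similarity(old, new_shifted))           # still close to 1
```

With a score like this, the acceptance rule could be a threshold on the ratio instead of a minimum number of leading characters. The actual threshold would have to be tuned on real URLs.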

/cc @ianhanniballake