Closed. pellcorp closed this 4 years ago
This appears to be a good reference:
https://ilyankou.files.wordpress.com/2015/06/ib-extended-essay.pdf
Yep, Ratcliff/Obershelp scores better than Jaro-Winkler in many cases.
There is a library which has an accurate implementation, but it's based on Scala.
Where is the link?
In a PoC I did, I used the Maven coordinates com.rockymadden.stringmetric:stringmetric-core:0.26.1
This corresponds to the GitHub project: https://github.com/rockymadden/stringmetric
I tested the Ratcliff/Obershelp impl in the stringmetric-core project against known good test data.
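For anyone following along, here is a minimal Java sketch of how Ratcliff/Obershelp is usually described: find the longest common substring, recurse on the pieces to its left and right, and score 2 × (matched characters) / (total length of both strings). This is not the code from stringmetric or from any port mentioned in this thread; the class and method names are hypothetical, and returning 1.0 for two empty strings is an assumption.

```java
// Illustrative sketch only; names are hypothetical and not taken from any library.
final class RatcliffObershelpSketch {

    private RatcliffObershelpSketch() {}

    /** Returns 2 * matched / (|a| + |b|), a value in [0, 1]. */
    static double similarity(String a, String b) {
        int total = a.length() + b.length();
        if (total == 0) {
            return 1.0; // assumption: two empty strings count as identical
        }
        return 2.0 * matches(a, 0, a.length(), b, 0, b.length()) / total;
    }

    /** Counts matched characters between a[aLo, aHi) and b[bLo, bHi). */
    private static int matches(String a, int aLo, int aHi,
                               String b, int bLo, int bHi) {
        if (aLo >= aHi || bLo >= bHi) {
            return 0;
        }
        int bestA = 0, bestB = 0, bestLen = 0;
        // prev[k + 1] holds the length of the common suffix ending at
        // a[i - 1] and b[bLo + k] (classic longest-common-substring DP).
        int[] prev = new int[bHi - bLo + 1];
        for (int i = aLo; i < aHi; i++) {
            int[] cur = new int[bHi - bLo + 1];
            for (int j = bLo; j < bHi; j++) {
                if (a.charAt(i) == b.charAt(j)) {
                    int len = prev[j - bLo] + 1;
                    cur[j - bLo + 1] = len;
                    if (len > bestLen) { // strict '>' keeps the first longest match
                        bestLen = len;
                        bestA = i - len + 1;
                        bestB = j - len + 1;
                    }
                }
            }
            prev = cur;
        }
        if (bestLen == 0) {
            return 0; // nothing in common, so the recursion stops here
        }
        // The anchor counts in full; then recurse on what lies to its left and right.
        return bestLen
                + matches(a, aLo, bestA, b, bLo, bestB)
                + matches(a, bestA + bestLen, aHi, b, bestB + bestLen, bHi);
    }
}
```

Worked by hand, `RatcliffObershelpSketch.similarity("WIKIMEDIA", "WIKIMANIA")` matches "WIKIM" (5 characters) and then "IA" (2 characters), giving 2 × 7 / 18 ≈ 0.778. Note that tie-breaking between equally long substrings can differ between implementations, so scores may not agree in every case.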
Sorry for bumping this thread. I ported the .NET implementation of Ratcliff-Obershelp (by Ligi, a patch to fuzzystring) in my fork. I'm sorry, I'm a novice to both Java and GitHub, so I haven't made a pull request yet. I'd be glad if somebody could help test it. Thank you.
@denmase If you can submit a PR, I'd be happy to help review it. BTW, I help run the port of this to .NET at https://github.com/feature23/StringSimilarity.NET - but we are a 100% port only, so we do not add new features that aren't added here first.
@paulirwin I'll submit a PR, but pardon me if the code is not up to an acceptable standard yet. I know StringSimilarity.NET as well, I use both actually, so thank you for SS.Net. If you want, I can try to add Ratcliff-Obershelp to it too. Thank you for the help.
Fixed in 4946f586712e4c91d12d766c62ae495db6506733 and PR #55