maxharlow / csvmatch

🔎 Finds fuzzy matches between CSV files

Add other fuzzy matching algorithms #8

Open maxharlow opened 7 years ago

maxharlow commented 7 years ago

There's a list here: http://ntz-develop.blogspot.co.uk/2011/03/fuzzy-string-search.html

maxharlow commented 6 years ago

Sources:

maxharlow commented 6 years ago

Includes Jaro as of 1.14
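
For reference, a minimal sketch of the textbook Jaro similarity in plain Python. This is an illustration only: the function name `jaro` is hypothetical, and csvmatch's actual implementation may differ in normalisation and edge cases.

```python
def jaro(a: str, b: str) -> float:
    """Textbook Jaro similarity in [0, 1]; illustrative, not csvmatch's code."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = True
                b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order, counted in pairs.
    a_seq = [ch for ch, m in zip(a, a_matched) if m]
    b_seq = [ch for ch, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3
```

The classic example pair `MARTHA` / `MARHTA` has six matching characters and one transposition, so it scores 17/18.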

maxharlow commented 5 years ago

Locality sensitive hashing? Was used here: https://artificialinformer.com/issue-one/dissecting-a-machine-learning-powered-investigation.html
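
A rough sketch of how MinHash-based locality sensitive hashing could shortlist candidate pairs before a more expensive comparison. Everything here is an assumption for illustration: the helper names (`shingles`, `minhash`, `candidate_pairs`), the trigram size, and the band/row counts are hypothetical and are not taken from csvmatch or the linked article.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, n=3):
    """Set of lowercase character n-grams (hypothetical helper)."""
    text = text.lower()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


def minhash(tokens, num_hashes):
    """MinHash signature: for each seeded hash, the minimum over all tokens."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(t.encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for t in tokens
        ))
    return signature


def candidate_pairs(values, bands=16, rows=2):
    """Index pairs whose signatures agree on at least one whole band."""
    buckets = defaultdict(set)
    for idx, value in enumerate(values):
        sig = minhash(shingles(value), bands * rows)
        for band in range(bands):
            key = (band, tuple(sig[band * rows:(band + 1) * rows]))
            buckets[key].add(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs
```

Only the pairs returned by `candidate_pairs` would then need scoring with a slower measure such as Levenshtein or Jaro. More bands with fewer rows each makes the shortlist more permissive.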

maxharlow commented 4 years ago

Fellegi-Sunter? https://github.com/moj-analytical-services/sparklink

maxharlow commented 4 years ago

Worth evaluating methods from here: https://github.com/J535D165/recordlinkage

maxharlow commented 4 years ago

Also: https://github.com/Bergvca/string_grouper (and associated blog post)

maxharlow commented 4 years ago

https://github.com/bradhackinen/nama

maxharlow commented 4 years ago

https://github.com/Living-with-machines/DeezyMatch

maxharlow commented 3 years ago

https://github.com/robertknight/approx-string-match-js

maxharlow commented 3 years ago

https://github.com/anhaidgroup/deepmatcher
https://github.com/anhaidgroup/py_entitymatching
https://github.com/anhaidgroup/py_stringmatching
https://github.com/anhaidgroup/py_stringsimjoin

maxharlow commented 3 years ago

https://github.com/RaRe-Technologies/gensim/pull/3146

maxharlow commented 3 years ago

https://github.com/lukewhyte/textpack

maxharlow commented 2 years ago

https://opensanctions.org/articles/2021-11-11-deduplication/

maxharlow commented 1 year ago

https://github.com/dgrtwo/fuzzyjoin/issues/86

maxharlow commented 1 year ago

https://github.com/stanfordnlp/string2string