RobinL / fuzzymatcher

Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4
MIT License
281 stars, 60 forks

use rapidfuzz instead of fuzzywuzzy #53

Closed (maxbachmann closed this issue 3 years ago)

maxbachmann commented 4 years ago

FuzzyWuzzy is GPLv2-licensed, which would force you to license the whole project under GPLv2. I had the same problem on one of my projects, so I wrote rapidfuzz. It implements the same algorithm but is based on a version of fuzzywuzzy that was MIT-licensed, and is therefore MIT-licensed as well, so it can be used here without forcing a license change. As a nice bonus, it is fully implemented in C++ and comes with a few algorithmic improvements, making it between 5 and 100 times faster than FuzzyWuzzy.
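
To illustrate what the swap could look like in practice, here is a minimal sketch. It assumes the calling code only needs a 0-100 similarity score from `fuzz.ratio`, which rapidfuzz exposes under the same name as fuzzywuzzy. The helper names `similarity` and `best_match` are hypothetical and not taken from fuzzymatcher, and the `difflib` fallback is only there so the sketch runs where rapidfuzz is not installed; its scores will differ from rapidfuzz's.

```python
# Sketch only: replace `from fuzzywuzzy import fuzz` with the rapidfuzz import.
# `similarity` and `best_match` are hypothetical helpers, not fuzzymatcher code.
try:
    from rapidfuzz import fuzz  # MIT-licensed, same `fuzz.ratio` name as fuzzywuzzy

    def similarity(a: str, b: str) -> float:
        """Return a similarity score between 0 and 100 using rapidfuzz."""
        return fuzz.ratio(a, b)

except ImportError:
    # Assumption for this sketch: fall back to the standard library so the
    # example still runs. Scores are not identical to rapidfuzz's.
    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        """Return a similarity score between 0 and 100 using difflib."""
        return 100.0 * SequenceMatcher(None, a, b).ratio()


def best_match(query: str, choices: list) -> tuple:
    """Return the (choice, score) pair most similar to `query`."""
    scored = ((choice, similarity(query, choice)) for choice in choices)
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    names = ["Jane Doe", "Jon Smyth", "John Smith"]
    print(best_match("John Smith", names))
```

Because only the import line changes in the rapidfuzz branch, code that already calls `fuzz.ratio` should need little or no further modification, although any scores compared against hard-coded thresholds are worth re-checking after the switch.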