perrette / papers

Command-line tool to manage bibliography (pdfs + bibtex)
MIT License

use rapidfuzz instead of fuzzywuzzy #15

Closed (maxbachmann closed 4 years ago)

maxbachmann commented 4 years ago

FuzzyWuzzy and Python-Levenshtein are both GPLv2 licensed, which would force you to license the whole project under GPLv2. I had the same problem on one of my projects, so I wrote rapidfuzz, which implements the same algorithm but is based on a version of fuzzywuzzy that was MIT licensed and is therefore MIT licensed as well, so it can be used here without forcing a license change. As a nice bonus, it is fully implemented in C++ and comes with a few algorithmic improvements, making it between 5 and 100 times faster than FuzzyWuzzy.
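For illustration, here is a minimal sketch of what the swap could look like on the calling side. It assumes the project only needs a 0-100 similarity score such as `fuzz.ratio`, which rapidfuzz provides under the same name as fuzzywuzzy. The helper names `similarity` and `is_probable_duplicate`, the threshold of 90, and the `difflib` fallback are hypothetical and not taken from papers; the fallback is only there so the sketch runs when rapidfuzz is not installed, and it does not compute the same score.

```python
import difflib

try:
    # rapidfuzz is MIT licensed and exposes fuzz.ratio like fuzzywuzzy does.
    from rapidfuzz import fuzz

    def similarity(a: str, b: str) -> float:
        """Return a similarity score between 0 and 100."""
        return float(fuzz.ratio(a, b))

except ImportError:
    # Assumption for this sketch only: fall back to the standard library.
    # difflib uses a different algorithm, so scores will not match exactly.
    def similarity(a: str, b: str) -> float:
        """Return a similarity score between 0 and 100."""
        return 100.0 * difflib.SequenceMatcher(None, a, b).ratio()


def is_probable_duplicate(title_a: str, title_b: str, threshold: float = 90.0) -> bool:
    """Hypothetical helper: compare two titles case-insensitively."""
    return similarity(title_a.lower(), title_b.lower()) >= threshold
```

With this shape, switching libraries only touches the import, since the rest of the code calls `similarity` rather than a specific package.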

perrette commented 4 years ago

Thanks for the contribution. I was not very careful with the license. Sounds like a neat package. Any idea why the build fails on Travis? https://travis-ci.com/github/maxbachmann/papers/jobs/300908367

perrette commented 4 years ago

I suppose that has to do with: https://github.com/rhasspy/rapidfuzz/issues/3 ?

maxbachmann commented 4 years ago

I have absolutely no idea why it fails on Travis. I have not really used Travis lately, but it builds fine on my desktop and in GitHub Actions running pip install. Apparently it fails to find pybind11, but pybind11 should generally be installed automatically (it is a required package in the setup.py of rapidfuzz).

maxbachmann commented 4 years ago

I added some prebuilt wheels for macOS, Linux and Windows and Python 3.5-3.8 to PyPI, so Travis does not have to compile it anymore and therefore no longer has problems installing it.

Is it required to keep supporting Python 2.7? https://travis-ci.com/github/maxbachmann/papers/jobs/300957767 I suppose rapidfuzz could be made compatible with Python 2.7, but since it reached end of life in January, I am not sure whether Python 2.7 support is really worth much.

perrette commented 4 years ago

Hi @maxbachmann, that is excellent. I was already considering removing 2.7 support for the reason you state, so that won't be a problem. Let me see over the next few days how to proceed. Thanks again.

maxbachmann commented 4 years ago

As a note, newer versions of rapidfuzz now support Python 2.7 as well ;)