We just found that fuzz.WRatio() gives different results depending on whether python-Levenshtein is installed.
Consider the warning shown when importing fuzzywuzzy.fuzz while python-Levenshtein is not installed:
/usr/local/lib/python3.6/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
From this message, the user will assume that the difference is purely one of speed, which it is not.
Versions used:
fuzzywuzzy==0.18.0
python-Levenshtein==0.12.0
Example of score differences (first session without python-Levenshtein installed, second session with it):
>>> from fuzzywuzzy import fuzz
/usr/local/lib/python3.6/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
>>> fuzz.WRatio('Copia', 'electronica')
50
>>> from fuzzywuzzy import fuzz
>>> fuzz.WRatio('Copia', 'electronica')
54
We strongly suggest stating in the warning message that results can differ between the "pure-python SequenceMatcher" and the python-Levenshtein version:
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning. Results can be different between SequenceMatchers')
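Until the message changes, a caller-side check may help make the score difference visible. The following is only a minimal sketch, not part of fuzzywuzzy: the helper names `levenshtein_backend_available` and `describe_backend` are hypothetical, and it assumes that fuzzywuzzy picks its matcher according to whether the `Levenshtein` module (the import name of python-Levenshtein) can be imported.

```python
import importlib.util


def levenshtein_backend_available() -> bool:
    """Return True if the `Levenshtein` module can be found on this interpreter.

    Hypothetical helper: it only looks the module up, it does not import it.
    """
    return importlib.util.find_spec("Levenshtein") is not None


def describe_backend() -> str:
    """Name the matcher we expect fuzzywuzzy to use (an assumption, not an API)."""
    if levenshtein_backend_available():
        return "python-Levenshtein"
    return "difflib.SequenceMatcher (pure Python)"


# Log the expected backend next to any stored scores so that results
# computed in different environments are not compared blindly.
print("Expected fuzzywuzzy backend:", describe_backend())
```

Recording this value alongside saved scores, or pinning python-Levenshtein in the project requirements, would keep scores from being compared across environments that use different matchers.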