seatgeek / fuzzywuzzy

Fuzzy String Matching in Python
http://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
GNU General Public License v2.0

Fix #307: inconsistent preprocessing #308

Open Thijsvandepoll opened 3 years ago

Thijsvandepoll commented 3 years ago

This PR fixes issue #307, in which preprocessing steps were applied inconsistently. As a result, comparing a string with itself could return a score other than 100. For example,

from fuzzywuzzy import fuzz, process
list(process.extractWithoutOrder("杰弗里·S·布里特", ["杰弗里·S·布里特"], scorer=fuzz.token_set_ratio))

[('杰弗里·S·布里特', 50)]

This is unexpected: comparing a string with itself should return 100.
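To illustrate the general failure mode, here is a minimal standard-library sketch. It is not fuzzywuzzy's actual code path: `preprocess` and `score` are hypothetical helpers, and `difflib.SequenceMatcher` stands in for the real scorer. It assumes the inconsistency takes the form of one side being forced to ASCII while the other is not, which is only one way such a mismatch could arise.

```python
from difflib import SequenceMatcher


def preprocess(s, force_ascii):
    """Hypothetical preprocessor: optionally drop non-ASCII characters,
    then lowercase and strip surrounding whitespace."""
    if force_ascii:
        s = "".join(ch for ch in s if ord(ch) < 128)
    return s.lower().strip()


def score(a, b):
    """Similarity on a 0-100 scale, using difflib as a stand-in scorer."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


name = "杰弗里·S·布里特"

# Inconsistent: the query is forced to ASCII, the choice is not,
# so the two sides of a self-comparison no longer hold the same text.
inconsistent = score(
    preprocess(name, force_ascii=True),
    preprocess(name, force_ascii=False),
)

# Consistent: both sides go through identical preprocessing.
consistent = score(
    preprocess(name, force_ascii=False),
    preprocess(name, force_ascii=False),
)

print(inconsistent, consistent)
```

In this sketch the consistent self-comparison scores 100, while the inconsistent one scores lower because most of the string is discarded on only one side.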