not good for Chinese?

SeekPoint commented 6 years ago

suggestions = fuzzyfinder('可大讯飞', ['仅就第三季度而言，虽然科大讯飞管理费用与研发费用都在大幅提升，但两项之和与营收的比例为 24% ，去年同期的 25% 还要低一个百分点。因此，将>三季度扣非净利润降低，归咎于管理费用与研发费用的提升，显然不太恰当。'])
print(list(suggestions))
[]

Expected: [科大讯飞]. Only one Chinese character is different.

amjith commented 6 years ago

This is brilliant!! I never tested this outside of English, but I'm glad you did.

Here's the problem: this library does not try to find the closest matches to the word you typed by tolerating typos. It is meant for narrowing a list of long strings as you type bits and pieces of a substring from the long string.

There is a blog post that explains how this works: https://blog.amjith.com/fuzzyfinder-in-10-lines-of-python
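To illustrate the narrowing behaviour described above, here is a minimal sketch of a subsequence-style filter. It is not the library's actual source; the function name `subsequence_filter` and the sample strings are assumptions made for this example. Each typed character must appear in the candidate, in order, so a single wrong character (such as '可' in place of '科') rules the candidate out entirely.

```python
import re


def subsequence_filter(user_input, collection):
    """Hypothetical helper: keep items containing the typed characters in order."""
    # Join the escaped characters with a lazy wildcard, e.g. '科.*?讯'.
    pattern = ".*?".join(map(re.escape, user_input))
    regex = re.compile(pattern)
    ranked = []
    for item in collection:
        match = regex.search(item)
        if match:
            # Prefer shorter matched spans, then earlier match positions.
            ranked.append((len(match.group()), match.start(), item))
    return [item for _, _, item in sorted(ranked)]


candidates = ["科大讯飞管理费用", "研发费用"]

# Every typed character occurs in order, so the first candidate is kept.
print(subsequence_filter("科讯", candidates))      # ['科大讯飞管理费用']

# '可' occurs in neither candidate, so nothing survives the filter.
print(subsequence_filter("可大讯飞", candidates))  # []
```

This works on Chinese text exactly as it does on English text; the empty result comes from the typo, not from the language.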

What you're looking for is the fuzzywuzzy library, which finds the closest matches to what you have typed based on Levenshtein distance.
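For readers unfamiliar with Levenshtein distance, the following is a rough standard-library sketch of typo-tolerant matching. It does not use or reproduce the fuzzywuzzy API; the helper names `levenshtein` and `closest_window` and the shortened sample text are assumptions made for this example. It slides a window the length of the query across the text and keeps the window with the smallest edit distance.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def closest_window(query, text):
    """Hypothetical helper: best query-sized slice of text and its distance."""
    width = len(query)
    last_start = max(1, len(text) - width + 1)
    windows = [text[i:i + width] for i in range(last_start)]
    best = min(windows, key=lambda window: levenshtein(query, window))
    return best, levenshtein(query, best)


# Shortened sample text, assumed for illustration.
print(closest_window("可大讯飞", "虽然科大讯飞管理费用"))  # ('科大讯飞', 1)
```

A distance of 1 reflects the single differing character, which is the kind of tolerance the original question was asking for.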

SeekPoint commented 6 years ago

OK, I'll check, thanks.

amjith / fuzzyfinder

not good for Chinese? #14