Open PalaChen opened 5 years ago
The default scorer selected by process.extract is fuzz.WRatio, which by default converts all non-ASCII characters to whitespace and then trims it. So in your case you're comparing empty strings, which is why every score is 0. To keep the Unicode characters, use:
process.extract(u"星球大战", choices, scorer=fuzz.UWRatio)
Or, since you mentioned fuzz.ratio:
process.extract(u"星球大战", choices, scorer=fuzz.ratio)
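To make the explanation above concrete, here is a minimal standard-library sketch of why ASCII-only preprocessing produces all-zero scores for Chinese strings. It does not call the library discussed in this thread: `ascii_only_preprocess` and `simple_ratio` are hypothetical stand-ins written for illustration, and `difflib.SequenceMatcher` is used as an assumed approximation of a ratio-style scorer. It is written for Python 3, so the `u` prefixes are dropped.

```python
import difflib


def ascii_only_preprocess(s):
    """Hypothetical stand-in for ASCII-forcing preprocessing (not the library's code)."""
    # Replace every non-ASCII character with a space, then lowercase and trim.
    cleaned = "".join(ch if ord(ch) < 128 else " " for ch in s)
    return cleaned.lower().strip()


def simple_ratio(a, b):
    """Similarity from 0 to 100 based on difflib; an approximation for illustration."""
    # Treat a comparison involving an empty string as no match at all.
    if not a or not b:
        return 0
    return int(round(100 * difflib.SequenceMatcher(None, a, b).ratio()))


choices = ["星球大战", "5月4日星球大战", "星球大戰", "战大球星", "星球大战游戏下"]
query = "星球大战"

# With ASCII-only preprocessing the query collapses to "", so nothing can match.
with_ascii = [
    (c, simple_ratio(ascii_only_preprocess(query), ascii_only_preprocess(c)))
    for c in choices
]

# Comparing the raw Unicode strings gives meaningful scores.
without_ascii = [(c, simple_ratio(query, c)) for c in choices]

print(repr(ascii_only_preprocess(query)))
print(with_ascii)
print(without_ascii)
```

The first list mirrors the all-zero result reported in this issue, while the second shows that the same strings compare sensibly once the non-ASCII characters are kept, which is what passing a Unicode-aware scorer achieves.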
Using Python 2, for example:
`choices = [u"星球大战", u"5月4日星球大战", u"星球大戰", u"战大球星", u"星球大战游戏下"]`
`process.extract(u"星球大战", choices)`
`[(u'星球大战', 0), (u'5月4日星球大战', 0), (u'星球大戰', 0), (u'战大球星', 0), (u'星球大战游戏下', 0)]`
but calling fuzz.ratio directly gives a sensible score:
fuzz.ratio(u"星球大战", u"星球大战1")
89