MaartenGr / PolyFuzz

Fuzzy string matching, grouping, and evaluation.
https://maartengr.github.io/PolyFuzz/
MIT License
733 stars, 67 forks

What about Chinese matching? Would you be willing to add support for it? #8

Closed: YINGPENGZH closed this issue 3 years ago

MaartenGr commented 3 years ago

I am not familiar with the nuances of fuzzy string matching for Chinese. That said, you can use pre-trained embeddings to perform the string matching:

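from polyfuzz import PolyFuzz
from polyfuzz.models import Embeddings
from flair.embeddings import WordEmbeddings

# Load flair's pre-trained Chinese word embeddings
chinese_embedding = WordEmbeddings('zh')

# min_similarity=0 keeps every match, however weak
matcher = Embeddings(chinese_embedding, min_similarity=0)

# Build a PolyFuzz model that matches strings with this embedding matcher
model = PolyFuzz(matcher)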
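To give a rough idea of what an embedding-based matcher does with those vectors, here is a minimal standard-library sketch. It assumes matching is done by cosine similarity with a `min_similarity` cutoff, and it is not PolyFuzz's actual implementation. The helper names `cosine` and `best_matches` are hypothetical, and the 2-dimensional vectors are made-up toy values rather than real flair embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_matches(from_vecs, to_vecs, min_similarity=0.0):
    """For each source string, return (best candidate, score).

    The candidate is None when the best score is below min_similarity.
    Hypothetical helper for illustration only.
    """
    results = {}
    for name, vec in from_vecs.items():
        # Score every candidate and keep the highest-scoring one
        scored = [(cosine(vec, cand_vec), cand) for cand, cand_vec in to_vecs.items()]
        score, cand = max(scored)
        results[name] = (cand, score) if score >= min_similarity else (None, score)
    return results


# Toy vectors (assumed values, not real embeddings)
from_vecs = {"北京大学": [1.0, 0.0], "上海": [0.0, 1.0]}
to_vecs = {"北京大學": [0.9, 0.1], "上海市": [0.1, 0.9]}

matches = best_matches(from_vecs, to_vecs, min_similarity=0.0)
for source, (target, score) in matches.items():
    print(source, "->", target, round(score, 3))
```

With `min_similarity=0`, as in the snippet above, every source string keeps its best candidate. Raising the threshold leaves weak matches unmatched instead.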