AmenRa / retriv

A Python Search Engine for Humans 🥸
MIT License

Does this support Chinese? #14

Closed phellonchen closed 1 year ago

AmenRa commented 1 year ago

Hi,

The dense retriever should work out of the box if you provide a multilingual model, such as sentence-transformers/distiluse-base-multilingual-cased-v1, or a model trained specifically for Chinese.

I think the sparse retriever works if you provide a custom stemmer and tokenizer (if needed). A Chinese stop-word list is already provided.

Let me know if you have any luck!

Elias

AmenRa commented 1 year ago

Closing for inactivity.