xhluca / bm25s

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
https://bm25s.github.io
MIT License

How to apply bm25s to languages such as Chinese? #34

Closed. AlanLu0808 closed this issue 3 months ago

AlanLu0808 commented 3 months ago

I think the bm25s library is great and very efficient. I would like to use it in my project.

But how can I apply bm25s to languages such as Chinese? Could you provide some examples?

bm777 commented 3 months ago

It works for the Chinese language too. Just use a Chinese tokenizer and stemmer.
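A minimal sketch of that idea, assuming bm25s accepts pre-tokenized documents (lists of string tokens) in `index` and `retrieve`. The helper `tokenize_zh` is hypothetical and not part of bm25s: it splits Chinese text into overlapping character bigrams, a simple stand-in for a real word segmenter. A library such as jieba (`jieba.lcut`) could be swapped in instead. The corpus and query are made-up illustrations.

```python
import re


def tokenize_zh(text):
    """Hypothetical helper (not part of bm25s): character-bigram tokenizer."""
    tokens = []
    # Pull out runs of CJK ideographs and runs of ASCII letters/digits;
    # punctuation and whitespace are dropped.
    for run in re.findall(r"[\u4e00-\u9fff]+|[A-Za-z0-9]+", text):
        if run.isascii():
            tokens.append(run.lower())  # keep Latin words/numbers whole
        elif len(run) == 1:
            tokens.append(run)  # a lone character is its own token
        else:
            # Overlapping bigrams: "可爱的猫" -> 可爱, 爱的, 的猫
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


corpus = ["我喜欢猫", "今天天气很好", "猫是可爱的动物"]
corpus_tokens = [tokenize_zh(doc) for doc in corpus]
query_tokens = [tokenize_zh("可爱的猫")]

try:
    import bm25s  # optional here; the tokenizer above runs without it
except ImportError:
    bm25s = None

if bm25s is not None:
    # Assumption: index() and retrieve() take lists of token lists.
    retriever = bm25s.BM25()
    retriever.index(corpus_tokens)
    results, scores = retriever.retrieve(query_tokens, k=2)
    print(results[0], scores[0])  # document indices and their BM25 scores
```

Bigrams avoid the need for a dictionary but produce a larger vocabulary than word segmentation. Stemming is usually unnecessary for Chinese, so the sketch skips it.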

bm777 commented 3 months ago

@AlanLu0808 Have a look at #33.