xhluca / bm25s

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
https://bm25s.github.io
MIT License
920 stars, 39 forks

Add saving and loading corpus/stopwords to `Tokenizer` and add integration to HF Hub via `bm25s.hf.TokenizerHF` (save/load) #59

Closed: xhluca closed this 2 months ago

xhluca commented 2 months ago