huggingface / huggingface.js

Utilities to use the Hugging Face Hub API
https://hf.co/docs/huggingface.js
MIT License

[`integration`] Add `bm25s` library for bm25 retrieval models/indices #763

Status: Closed (tomaarsen closed this 3 months ago)

tomaarsen commented 3 months ago

Hello!

Pull Request overview

Links

Details

BM25S is a very new library for efficient BM25, an important algorithm for full-text search. It can be combined with vector search (e.g. Sentence Transformers' domain) for hybrid search, which is commonly used and very powerful. I'm considering mentioning this library more in the Sentence Transformers examples and docs, so it would be nice to get the "Use in Library" button and download counts for these models/indices. I've set `filter` to `false` as it's not a big library currently.

cc @xhluca for context: this should add a "Use this model" button at the top right of every model page on the Hub that has a `bm25s` tag in its model card metadata. The button will then show the following snippet:

from bm25s.hf import BM25HF

retriever = BM25HF.load_from_hub("${model.id}")
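
For illustration, here is a rough sketch of how the `${model.id}` placeholder above resolves to a concrete snippet for a given model page. It is written in Python with `string.Template` purely as a stand-in: the actual integration in huggingface.js is TypeScript, and the names `SNIPPET_TEMPLATE`, `render_snippet`, and the repository ID below are hypothetical, not part of this PR.

```python
from string import Template

# Hypothetical stand-in for the snippet template added by this PR.
# The real template lives in huggingface.js (TypeScript) and uses ${model.id};
# Python's Template needs a plain identifier, so this sketch uses ${model_id}.
SNIPPET_TEMPLATE = Template(
    "from bm25s.hf import BM25HF\n"
    "\n"
    'retriever = BM25HF.load_from_hub("${model_id}")'
)


def render_snippet(model_id: str) -> str:
    """Return the snippet with the placeholder replaced by a model ID."""
    return SNIPPET_TEMPLATE.substitute(model_id=model_id)


if __name__ == "__main__":
    # "your-username/your-bm25s-index" is a made-up repository ID.
    print(render_snippet("your-username/your-bm25s-index"))
```

Each model page would show the same two-line snippet, differing only in the repository ID passed to `load_from_hub`.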

xhluca commented 3 months ago

Lgtm! Thanks for the pr.