xhluca / bm25s

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
https://bm25s.github.io
MIT License
920 stars 39 forks

Get sparse embedding functionality #68

Closed lspataroG closed 1 month ago

lspataroG commented 1 month ago

Hello @xhluca, first of all, thank you so much for releasing this great library! I have a question: is it currently possible to retrieve the transformed vectors for tokens (or words), in the style of scikit-learn's TfidfVectorizer transform? I am asking because I would like to use the feature vectors directly, and in particular I would like to infer sparse vectors for new queries and documents.
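
For reference, a minimal sketch of the scikit-learn workflow mentioned above. The corpus and query are made-up examples, and this shows TfidfVectorizer only; it does not assume that bm25s exposes an equivalent method.

```python
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer

# Illustrative corpus (an assumption, not data from this issue).
corpus = [
    "a cat is a feline and likes to purr",
    "a dog is the human's best friend",
    "a bird is a beautiful animal that can fly",
]

# Fit the vocabulary and IDF weights once, on the indexed documents.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)  # sparse matrix, one row per document

# Later, infer sparse vectors for unseen text with the same fitted vocabulary.
# Terms that were not seen during fitting are ignored.
query_vectors = vectorizer.transform(["does the cat purr?"])

print(sparse.issparse(query_vectors))  # the result is a SciPy sparse matrix
print(query_vectors.shape)             # (number of queries, vocabulary size)
```

The desired behaviour would be the same two-step pattern, fitting on a corpus and then transforming new text, but with BM25 weights in place of TF-IDF weights.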

Thank you very much!