Closed xhluca closed 2 months ago
I wonder if it is possible to do inverted indexing here, by creating an array that tracks start and end: https://github.com/xhluca/bm25s/blob/daf29ceaa2fd77ca8601920502b7b8f05eb82be2/bm25s/scoring.py#L329-L352
Deprecate selection_backend in retrieve so that it happens at the object init time
In retrospect, it seems that selection_backend remains useful for testing purposes, as well as for using the jax backend. Let's not deprecate it in 0.2.0.
Make tqdm work in _retrieve_internal_numba_parallel
Unfortunately tqdm won't work, so we can't add a progress bar to retrieve when the backend is set to numba.
Create a tokenizer class (perhaps in a separate PR? also should handle https://github.com/xhluca/bm25s/issues/31 at the same time)
Will do that in a separate PR
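To make the inverted-indexing idea from the first item above concrete, here is a minimal pure-Python sketch. It is not the code at the linked scoring.py lines; the names build_inverted_index, docs_containing, indptr and doc_ids are hypothetical and chosen only for illustration. The "array that tracks start and end" is modeled as a single offsets array, similar in spirit to the indptr array of a compressed sparse column matrix: the postings for token t live in doc_ids[indptr[t]:indptr[t + 1]].

```python
def build_inverted_index(doc_tokens, vocab_size):
    """Group document ids by token id and record where each token's run starts.

    doc_tokens: list of documents, each a list of integer token ids.
    Returns (indptr, doc_ids); both names are illustrative assumptions.
    """
    # Count how many distinct documents contain each token id.
    counts = [0] * vocab_size
    for tokens in doc_tokens:
        for t in set(tokens):
            counts[t] += 1

    # Prefix sum: indptr[t] is the start and indptr[t + 1] the end of token t's run.
    indptr = [0] * (vocab_size + 1)
    for t in range(vocab_size):
        indptr[t + 1] = indptr[t] + counts[t]

    # Fill the flat postings array, using one write cursor per token.
    doc_ids = [0] * indptr[-1]
    cursor = indptr[:-1].copy()
    for doc_id, tokens in enumerate(doc_tokens):
        for t in set(tokens):
            doc_ids[cursor[t]] = doc_id
            cursor[t] += 1
    return indptr, doc_ids


def docs_containing(token_id, indptr, doc_ids):
    """Return the ids of all documents that contain token_id."""
    start, end = indptr[token_id], indptr[token_id + 1]
    return doc_ids[start:end]


corpus = [[0, 1], [1, 2], [0, 2, 2]]
indptr, doc_ids = build_inverted_index(corpus, vocab_size=3)
print(docs_containing(0, indptr, doc_ids))
```

With this layout, scoring a query only touches the slices of the query's tokens rather than every document, which is the kind of access pattern a numba loop could handle with plain integer arrays.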
This is a work in progress!
This PR will make numba mode faster by rewriting the entire retrieve process into a numba JIT-able function (see _retrieve_internal_numba_parallel).
TODO:
retrieve_numba to make it compatible with retrieve when the BM25 object is initialized with backend="numba"
Deprecate selection_backend in retrieve so that it happens at the object init time
_retrieve_internal_numba_parallel
_retrieve_internal_numba_parallel
selection and numba.selection modules