issues
xhluca/bm25s
Fast lexical search library implementing BM25 in Python using Numpy and Scipy
https://bm25s.github.io
MIT License
761 stars
29 forks
Add tests for BM25.retrieve in different scenarios (tokenized, ids/vocab tuple, object with ids and vocab attributes, ids, strings)
#52
xhluca
opened
1 day ago
0
Improve tokenizer
#51
xhluca
closed
1 day ago
0
Add weight mask that is applied to scores during retrieval
#50
xhluca
closed
5 days ago
1
Replace ujson with orjson, add load and close for `jsonlcorpus`
#49
xhluca
closed
5 days ago
0
Refactor retrieval to make it faster to run in numba mode
#47
xhluca
closed
6 days ago
4
Refactor tests to be run in different jobs
#45
xhluca
closed
3 weeks ago
0
Add type hint for `texts` argument in `tokenize` function, use `time.monotonic` instead of `time.time`
#44
dantetemplar
closed
3 weeks ago
1
Use `time.monotonic` instead of `time.time`
#43
dantetemplar
closed
3 weeks ago
1
Maybe use `time.monotonic` instead of `time.time`?
#42
dantetemplar
closed
3 weeks ago
1
Add numba integration to allow for faster scoring and retrieval
#41
xhluca
closed
3 weeks ago
0
[feature request] Implement BMX algorithm
#40
logan-markewich
opened
3 weeks ago
3
Consider orjson as faster and more robust alternative to ujson
#39
xhluca
closed
5 days ago
1
Thread safe search
#37
okhat
closed
1 month ago
3
[Feature request] Document metadata and filtering
#35
dl423
closed
5 days ago
3
How to apply bm25s to languages such as Chinese?
#34
AlanLu0808
closed
1 month ago
2
Add stopwords for 10 new languages
#33
bm777
closed
3 weeks ago
14
Languages other than English for the stopwords list
#32
bm777
closed
3 weeks ago
2
On-the-fly stemming
#31
xhluca
closed
1 day ago
1
Bug fix and add link
#28
xhluca
closed
2 months ago
0
🚨Before submitting an issue, read this 🚨
#27
xhluca
closed
1 month ago
0
Update dev-0.1 branch
#24
xhluca
closed
2 months ago
0
Update branch
#23
xhluca
closed
2 months ago
0
Can the index be updated incrementally?
#22
bojone
closed
2 months ago
0
Can you query without a tokenization step?
#21
snewcomer
closed
2 months ago
0
How to dynamically add/delete documents
#19
luoyangen
closed
2 months ago
0
[Feature Request] Support attaching metadata to the corpus
#18
logan-markewich
closed
2 months ago
3
Not Working for langchain Documents
#16
pradhandebasish2046
closed
2 months ago
0
Minor bug: `show_progress` not propagated in `BM25.index`
#15
ValeKnappich
closed
2 months ago
1
Pre-computed TF-IDF
#9
celsofranssa
closed
2 months ago
0
Capability Inquiry: Retrieving Specific JSON Records Based on Text
#8
RakshitKhajuria
closed
2 months ago
4
Using with postgres?
#7
Tejaswgupta
closed
2 months ago
3
Improve readme generated when saving to huggingface
#6
xhluca
closed
2 months ago
0
Updating an index for batch indexing
#5
fabiannagel
closed
2 months ago
14
Order-based matching of corpus metadata to tokens
#4
fabiannagel
closed
2 months ago
2
[`compat`] Allow for local install on Windows
#3
tomaarsen
closed
2 months ago
1
[`hf`] Add "library_name" metadata to avoid confusion about what the primary library is
#2
tomaarsen
closed
2 months ago
0
Initial Release
#1
xhluca
closed
2 months ago
0