-
https://github.com/HKUNLP/instructor-embedding
https://simonwillison.net/2023/Jan/13/semantic-search-answers/
-
FastEmbed should (and can) support creating sparse vectors based on bag-of-words models, e.g. TF-IDF and Okapi BM25. We can launch with existing Python implementations, e.g. https://pypi.org/project/rank-bm25…
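
To make the idea concrete, here is a minimal standard-library sketch of how BM25 weights could be turned into sparse vectors. It is not FastEmbed's API and does not use `rank-bm25`. The class name `BM25SparseEncoder`, the whitespace tokenizer, the smoothed IDF variant, and the defaults `k1=1.5`, `b=0.75` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; real systems use better ones).
    return text.lower().split()


class BM25SparseEncoder:
    """Hypothetical encoder: maps text to {vocab_index: weight} sparse vectors."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        docs = [tokenize(d) for d in corpus]
        self.n_docs = len(docs)
        self.avgdl = sum(len(d) for d in docs) / self.n_docs
        # Document frequency: in how many documents each token appears.
        df = Counter()
        for d in docs:
            df.update(set(d))
        self.vocab = {tok: i for i, tok in enumerate(sorted(df))}
        # Smoothed IDF variant that is always positive.
        self.idf = {
            tok: math.log(1 + (self.n_docs - n + 0.5) / (n + 0.5))
            for tok, n in df.items()
        }

    def encode_document(self, text):
        tokens = tokenize(text)
        tf = Counter(tokens)
        # Length normalization term of the BM25 formula.
        norm = self.k1 * (1 - self.b + self.b * len(tokens) / self.avgdl)
        return {
            self.vocab[t]: self.idf[t] * f * (self.k1 + 1) / (f + norm)
            for t, f in tf.items()
            if t in self.vocab
        }

    def encode_query(self, text):
        # Binary query vector: the dot product with a document vector
        # then equals the BM25 score of that document for the query.
        return {self.vocab[t]: 1.0 for t in set(tokenize(text)) if t in self.vocab}


def sparse_dot(a, b):
    # Iterate over the smaller vector for efficiency.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b[i] for i, w in a.items() if i in b)


corpus = ["the cat sat on the mat", "dogs chase the cat", "bm25 ranks documents"]
encoder = BM25SparseEncoder(corpus)
query_vec = encoder.encode_query("cat mat")
scores = [sparse_dot(query_vec, encoder.encode_document(d)) for d in corpus]
```

Putting the IDF and term-frequency saturation into the document vector and keeping the query vector binary means any engine that supports sparse dot products can reproduce the ranking without BM25-specific logic.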
-
**Describe the bug**
A recently merged PR set `scale_scores` to `False` by default: https://github.com/deepset-ai/haystack/pull/6717
Tagging you here @shadeMe since you were involved in th…
-
https://github.com/nmslib/hnswlib/issues/442
https://github.com/castorini/pyserini
-
When searching in Tribler, search results (which can consist of torrents and channels) are ranked according to several criteria. Channels and torrents are sorted separately from each other. The associ…
-
`Weaviate` built-in [bm25 + vector hybrid](https://weaviate.io/developers/weaviate/search/hybrid#weight-keyword-vs-vector-results)
In short, the results are ranked by a weighted composite score of:
…
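
As an illustration of a weighted composite of keyword and vector scores, the sketch below blends the two with a single weight. It is not claimed to be Weaviate's exact formula. The min-max normalization, the parameter name `alpha`, and the function names are assumptions.

```python
def min_max_normalize(scores):
    # Rescale scores to [0, 1] so the two signals are comparable.
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(keyword_scores, vector_scores, alpha=0.5):
    """Blend scores: alpha=0 is keyword only, alpha=1 is vector only (assumed convention)."""
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    doc_ids = set(kw) | set(vec)
    combined = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Toy scores, invented for the example.
keyword_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
vector_scores = {"a": 0.1, "b": 0.9, "c": 0.5}
ranking = hybrid_rank(keyword_scores, vector_scores, alpha=0.5)
```

Moving `alpha` toward 0 or 1 shows how the ranking shifts between the keyword order and the vector order.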
-
For the different retrievers, we use BM25 (https://en.wikipedia.org/wiki/Okapi_BM25); gpt-index simply uses `Davinci v1` from OpenAI to embed all the documents and do simple cosine simil…
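
The embedding-based retrieval mentioned here can be sketched as ranking documents by cosine similarity to the query vector. The vectors below are toy values invented for the example, not real model output, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    # cos(u, v) = (u . v) / (|u| * |v|); returns 0.0 for zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_cosine(query_vec, doc_vecs):
    # Score every document against the query and sort best-first.
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional "embeddings" (assumed; real embeddings have far more dimensions).
doc_vecs = {"d1": [1.0, 0.0, 0.0], "d2": [0.6, 0.8, 0.0], "d3": [0.0, 0.0, 1.0]}
query_vec = [1.0, 0.1, 0.0]
results = rank_by_cosine(query_vec, doc_vecs)
```

Because document embeddings can be computed once and stored, only the query needs to be embedded at search time.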
-
Hi!
Thanks for the wonderful work! While reading your paper, I got confused about the document retrievers it mentions. You mentioned several of them, such as gpt and oracle. I cannot fin…
-
I have used BERT NextSentencePredictor to find similar sentences or similar news. However, it's super slow, even on a Tesla V100, which is the fastest GPU so far. It takes around 10 secs for a query tit…
-
In the course of considering the list question at , I took a slightly-deeper look at `gensim.summarization` than before.
From that look, my opinion is that its presence is more likely to waste peo…