chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0
14.64k stars, 1.22k forks

[New Feature][Accuracy] Reranker #2283

Open atroyn opened 3 months ago

atroyn commented 3 months ago

Reranker

Re-ranking is a quick accuracy win, and it is computationally as cheap as sentence transformers.

Re-ranking can take into account the query as well as additional metadata when evaluating the relevance of results. For example, it’s fairly straightforward to add a weighted term to account for data recency.
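To make the recency idea concrete, here is a minimal sketch of one way a weighted recency term could be blended into a relevance score. The function name `recency_weighted_score`, the exponential half-life decay, and the default `weight` and `half_life_days` values are all assumptions chosen for illustration; none of this is part of Chroma.

```python
import time


def recency_weighted_score(relevance, timestamp, now=None,
                           weight=0.2, half_life_days=30.0):
    """Blend a relevance score in [0, 1] with a recency term in (0, 1].

    `timestamp` and `now` are Unix times in seconds (hypothetical metadata).
    """
    now = time.time() if now is None else now
    age_days = max(0.0, (now - timestamp) / 86400.0)
    # Recency is 1.0 for brand-new data and halves every `half_life_days`.
    recency = 0.5 ** (age_days / half_life_days)
    return (1.0 - weight) * relevance + weight * recency
```

With `weight=0.2`, a document that is one half-life old loses at most 0.1 of its blended score relative to a brand-new document with the same relevance.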

It also provides a normalized measure of relevancy, allowing users to more easily filter on the relevancy of results.
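As a rough illustration of filtering on a normalized score, the sketch below squashes raw scores into [0, 1] with a sigmoid and drops anything under a threshold. It assumes the re-ranker emits unbounded raw scores (as many cross-encoders do); `normalize`, `filter_by_relevance`, and the `0.5` default threshold are hypothetical names and values, not an existing API.

```python
import math


def normalize(raw_score):
    """Map an unbounded raw score to [0, 1] with a numerically stable sigmoid."""
    if raw_score >= 0:
        return 1.0 / (1.0 + math.exp(-raw_score))
    z = math.exp(raw_score)
    return z / (1.0 + z)


def filter_by_relevance(documents, raw_scores, threshold=0.5):
    """Return (document, score) pairs at or above `threshold`, best first."""
    scored = [(doc, normalize(raw)) for doc, raw in zip(documents, raw_scores)]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Because the scores share one scale across queries, a single threshold can be reused, which is harder to do with raw embedding distances.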

API Design

# Get a re-ranker
reranker = chromadb.utils.rerankers.Reranker()

# API 1.

# Pass it as an additional argument to query:
result = collection.query(..., reranker=reranker)

# Return a new field, 'reranker_scores'
result.reranker_scores: List[List[float]]  # Score per retrieved document per query

# API 2.
reranker.rerank(results, query_text) # return the query back? 

# API 3. 
results.rerank(reranker) 
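For discussion, here is a minimal sketch of what the API 2 shape might look like over a query result with list-of-lists `ids` and `documents` fields. Everything here is hypothetical: `SketchReranker`, `overlap_score`, and the `reranker_scores` output key are illustration only, the method takes a list of query texts (one per result row) as an assumption, and the token-overlap scorer is a toy stand-in for a real re-ranking model.

```python
def overlap_score(query, document):
    """Toy relevance score: Jaccard overlap of lowercase tokens, in [0, 1]."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0


class SketchReranker:
    """Hypothetical re-ranker; swap `score_fn` for a real model's scorer."""

    def __init__(self, score_fn=overlap_score):
        self.score_fn = score_fn

    def rerank(self, results, query_texts):
        """Reorder each query's documents by score, highest first."""
        out = {"ids": [], "documents": [], "reranker_scores": []}
        rows = zip(results["ids"], results["documents"], query_texts)
        for ids, docs, query in rows:
            scores = [self.score_fn(query, doc) for doc in docs]
            order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
            out["ids"].append([ids[i] for i in order])
            out["documents"].append([docs[i] for i in order])
            out["reranker_scores"].append([scores[i] for i in order])
        return out
```

Returning a new result rather than mutating the input keeps the original ordering available, which sidesteps the "return the query back?" question above.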

[Complexity] Subtask

Yimi81 commented 2 months ago

any update?

atroyn commented 2 months ago

@Yimi81 We expect to release this as part of the current (v.0.6) milestone