snexus / llm-search

Querying local documents, powered by LLM
MIT License

Plans for RRF? #82

Open saswat0 opened 5 months ago

saswat0 commented 5 months ago

Hey @snexus, are there any plans to include an option for RRF (Reciprocal Rank Fusion) alongside Marco and BGE for reranking?

snexus commented 5 months ago

Hi @saswat0

It crossed my mind as well. Are you aware of any benefits of RRF compared to a cross-encoder (besides speed)? RRF also introduces manual hyper-parameters that the user needs to tune, such as the weights given to sparse vs dense results.

saswat0 commented 5 months ago

@snexus I've found hardly any upside besides computational efficiency. In staging setups it gives us an additional degree of control (weighting) over the various retrievers. But when using a non-embedding-based sparse retriever (BM25, TF-IDF), I found RRF to be the better bet.
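For context, RRF itself is only a few lines: each retriever contributes `1 / (k + rank)` per document, optionally scaled by a per-retriever weight (the manual knob discussed above). A minimal sketch, not llm-search's implementation; the function name, `k=60` default, and the `weights` extension are illustrative:

```python
def reciprocal_rank_fusion(rankings, k=60, weights=None):
    """Fuse ranked lists of document IDs via weighted RRF.

    rankings: list of ranked lists (e.g. [dense_ids, sparse_ids]),
    best document first in each list.
    k: smoothing constant; larger values flatten the rank contribution.
    weights: optional per-retriever weights (defaults to 1.0 each).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each retriever adds a reciprocal-rank contribution.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


# Example: a dense and a sparse (e.g. BM25) ranking over the same corpus.
dense = ["d1", "d2", "d3"]
sparse = ["d3", "d1", "d4"]
print(reciprocal_rank_fusion([dense, sparse]))
```

Documents ranked highly by both retrievers (here `d1` and `d3`) float to the top without any score normalization, which is the main appeal over score-based fusion when the sparse retriever's scores aren't comparable to cosine similarities.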

snexus commented 5 months ago

Makes sense. I think it is worth implementing for the sake of feature completeness, though not as a priority.