langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Configurable Weighting for Hybrid Search (Full-Text and Vector Search) #4466

Open sunxichen opened 2 months ago

sunxichen commented 2 months ago


1. Is this request related to a challenge you're experiencing?

Yes. I am running into a problem with the project's hybrid search. Full-text (keyword) search influences the hybrid search results more than I would like. In some cases I would prefer to give more weight to the vector search component to improve the relevance of the results. Currently, there is no feature that lets me adjust the weighting between full-text search and vector search.

2. Describe the feature you'd like to see

I would like to propose the addition of a feature that allows users to configure the weights assigned to full-text search and vector search within the hybrid search function. This feature would enable users to fine-tune how much each search component (full-text and vector) contributes to the final search results. Ideally, this could be implemented as a simple interface where users can input numerical values or use a slider to adjust the weights for each search type.
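To make the request concrete, here is a minimal sketch of what such weighting could look like. It is not Dify's implementation. The function names, the `vector_weight` parameter, and the min-max normalization step are all assumptions made for illustration; the sketch only assumes that each retriever returns a score per document ID.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two retrievers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal: treat every hit as equally strong.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_hybrid(
    keyword_scores: Dict[str, float],
    vector_scores: Dict[str, float],
    vector_weight: float = 0.5,  # hypothetical user-facing setting (slider value)
) -> List[Tuple[str, float]]:
    """Fuse full-text and vector scores with a configurable weight.

    A document missing from one retriever contributes 0 for that component.
    """
    if not 0.0 <= vector_weight <= 1.0:
        raise ValueError("vector_weight must be between 0 and 1")
    keyword_weight = 1.0 - vector_weight
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    fused = {
        doc_id: keyword_weight * kw.get(doc_id, 0.0)
        + vector_weight * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    keyword_hits = {"a": 10.0, "b": 2.0}   # e.g. BM25-style scores
    vector_hits = {"b": 0.9, "c": 0.1}     # e.g. cosine similarities
    for doc_id, score in weighted_hybrid(keyword_hits, vector_hits, vector_weight=0.8):
        print(doc_id, round(score, 3))
```

With `vector_weight=0.8`, document `b` (the strongest vector hit) ranks above `a` (the strongest keyword hit); lowering the weight to `0.2` reverses that order. Normalizing before mixing matters because raw keyword scores and similarity scores are usually on different scales.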

3. How will this feature improve your workflow or experience?

Being able to adjust the weights between full-text search and vector search would improve the precision and relevance of search results. In my case, it would let me reduce the impact of full-text search when necessary and prioritize the vector search results, which are often more pertinent to the context I am working with. This flexibility would give users more control over the results and help ensure they match the user's intent.

4. Additional context or comments

No response

5. Can you help us with this feature?

ifsheldon commented 1 month ago

+1. Hybrid search currently requires reranking, which adds a lot of latency. With configurable weighting, we would be able to drop the rerank model.