LlamaEdge / rag-api-server

A RAG API server written in Rust following OpenAI specs
https://llamaedge.com/docs/user-guide/server-side-rag/quick-start
Apache License 2.0
31 stars, 8 forks

Support multi-pass RAG search #25

Closed (juntao closed this issue 3 days ago)

juntao commented 1 month ago

The current approach of searching only the last user message for RAG content is too simplistic, especially in multi-turn conversations or in agentic apps where the agent automatically adds to or rephrases the last user message.

I think we need to combine the last 3 to 5 user messages and perform a second search pass on the combined text. The highest-scoring vectors from both searches would then be selected for the context.
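
To make the proposal concrete, here is a rough sketch of the two-pass idea in Rust. It is not the server's actual code, and the fix shipped later may work differently. `ScoredChunk`, `combined_query`, `merge_top_k`, and `multi_pass_search` are hypothetical names, and the `search` closure stands in for whatever embeds a query and calls the vector store. The sketch also assumes each stored chunk has a stable id, so a chunk returned by both passes is counted once with its better score.

```rust
use std::collections::HashMap;

/// A retrieved chunk and its similarity score (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct ScoredChunk {
    pub id: u64,
    pub text: String,
    pub score: f32,
}

/// Builds the second-pass query by joining the last `window` user messages.
pub fn combined_query(user_messages: &[String], window: usize) -> String {
    let start = user_messages.len().saturating_sub(window);
    user_messages[start..].join("\n")
}

/// Merges the hits from all passes, keeps the best score per chunk id,
/// and returns at most `limit` chunks ordered by descending score.
pub fn merge_top_k(passes: Vec<Vec<ScoredChunk>>, limit: usize) -> Vec<ScoredChunk> {
    let mut best: HashMap<u64, ScoredChunk> = HashMap::new();
    for hit in passes.into_iter().flatten() {
        // Replace the stored hit only if this one scores strictly higher.
        let replace = best.get(&hit.id).map_or(true, |seen| hit.score > seen.score);
        if replace {
            best.insert(hit.id, hit);
        }
    }
    let mut merged: Vec<ScoredChunk> = best.into_values().collect();
    // Sort by score, highest first; break ties by id so the order is deterministic.
    merged.sort_by(|a, b| b.score.total_cmp(&a.score).then(a.id.cmp(&b.id)));
    merged.truncate(limit);
    merged
}

/// Pass 1 searches with the last user message only; pass 2 searches with the
/// last `window` user messages combined. `search` is an assumed callback that
/// embeds the query and returns scored hits from the vector store.
pub fn multi_pass_search<F>(
    user_messages: &[String],
    window: usize,
    limit: usize,
    mut search: F,
) -> Vec<ScoredChunk>
where
    F: FnMut(&str) -> Vec<ScoredChunk>,
{
    let Some(last) = user_messages.last() else {
        return Vec::new();
    };
    let first_pass = search(last);
    let second_pass = search(&combined_query(user_messages, window));
    merge_top_k(vec![first_pass, second_pass], limit)
}
```

With `window` set somewhere between 3 and 5, the second pass covers the recent conversation while the first pass still favors the latest message. Comparing raw scores across the two passes assumes both queries are scored with the same embedding model and distance metric.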

apepkuss commented 3 days ago

Fixed in 0.9.14