wassname opened 9 months ago
Here's an example from my draft at https://github.com/wassname/stampy-chat
NOTE: I'm using GPT-4 in the second screenshot, so please compare the references, not the writing.
What's happening behind the scenes in the screenshot?
The user put in an initial query:
> Whats the differences between Inverse Reinforcement Learning, reward modelling, RLHF, and recursive reward modelling?
It's transformed into a better query and an example answer using these prompts:
> Please draft an academic search query with synonyms and alternative phrases that will find documents to answer the following question: {query}

and

> Please draft a concrete and concise example answer to the following question: {query}
Then the three parts are joined into a new query (user query, improved query, example answer), roughly as sketched below.
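A minimal sketch of that flow (`llm` here is a hypothetical "prompt in, text out" callable, not any particular client):

```python
def expand_query(query: str, llm) -> str:
    """Build the 3-part retrieval query: original + improved + example answer.

    `llm` is a hypothetical callable that takes a prompt string and returns
    the model's text; swap in whatever LLM client you already use.
    """
    improved = llm(
        "Please draft an academic search query with synonyms and alternative "
        "phrases that will find documents to answer the following question: "
        + query
    )
    example = llm(
        "Please draft a concrete and concise example answer to the following "
        "question: " + query
    )
    # Concatenate all three parts into the string that goes to the retriever
    return "\n\n".join([query, improved, example])
```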
There are nicer and better ways to do this, but hopefully it shows how improving retrieval can de-bottleneck Stampy. There's lots of low-hanging fruit here, especially compared to the excellent dataset and UI work you've already done.
Impressive work; it's efficient and potent. Here's a suggestion.
The search is the critical component! It's the bottleneck for answering all queries, given you already possess a robust corpus.
Currently, you're using a standard vectordb search on the query. However, this approach has significant limitations: a single dense query can miss exact keyword matches, and the way users phrase questions often differs from the way the answering passages are written.
EnsembleRetriever
Fortunately, LangChain offers modules for various retriever enhancements, and all you need to do is test them out. You can bundle multiple retrievers in an EnsembleRetriever.
In this scenario you already have the vectordb match, but you might also want to add a plain keyword search such as BM25Retriever: it's cheap and can drastically improve your retrieval. A MultiVector retriever is potentially useful too, since a document's content may differ from the questions that document could answer.
A concrete example:
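Here's a minimal sketch, assuming a list of corpus chunks `doc_list`; FAISS and OpenAIEmbeddings stand in for whatever store and embeddings you actually use (needs `rank_bm25` and `faiss-cpu` installed):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import BM25Retriever, EnsembleRetriever
from langchain.vectorstores import FAISS

doc_list = ["chunk one ...", "chunk two ..."]  # your corpus chunks

# Sparse keyword retriever: cheap, catches exact term matches
bm25_retriever = BM25Retriever.from_texts(doc_list)
bm25_retriever.k = 4

# Dense retriever over the same chunks
faiss_retriever = FAISS.from_texts(doc_list, OpenAIEmbeddings()).as_retriever(
    search_kwargs={"k": 4}
)

# Blend the two ranked result lists
ensemble = EnsembleRetriever(
    retrievers=[bm25_retriever, faiss_retriever], weights=[0.5, 0.5]
)
docs = ensemble.get_relevant_documents("What is reward modelling?")
```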
Advanced Pinecone features
You might also want to consider using Pinecone's other retrieval feature, hybrid (sparse + dense) search: https://www.pinecone.io/learn/hybrid-search-intro/
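A minimal sketch with LangChain's PineconeHybridSearchRetriever (index name and keys are placeholders; the index must be created with `metric="dotproduct"` for hybrid search to work):

```python
import pinecone
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import PineconeHybridSearchRetriever
from pinecone_text.sparse import BM25Encoder

pinecone.init(api_key="YOUR_KEY", environment="YOUR_ENV")
index = pinecone.Index("stampy-hybrid")  # hypothetical index, metric="dotproduct"

# Sparse encoder: use the default (fitted on MS MARCO) or .fit() on your corpus
bm25_encoder = BM25Encoder().default()

retriever = PineconeHybridSearchRetriever(
    embeddings=OpenAIEmbeddings(),
    sparse_encoder=bm25_encoder,
    index=index,
    alpha=0.5,  # 0 = pure keyword, 1 = pure dense
)
docs = retriever.get_relevant_documents("What is reward modelling?")
```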
Better embedding
You might also consider using the best embedding per the retrieval leaderboard at https://huggingface.co/spaces/mteb/leaderboard. The strongest ones there are the `e5` series of embeddings, because they specifically tie queries to answer passages rather than just text to text.
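The practical wrinkle with `e5` is its asymmetric prefixes; a minimal sketch with sentence-transformers (the passage text is just illustrative):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large-v2")

# e5 models are trained with explicit "query: " / "passage: " prefixes;
# this asymmetry is what ties questions to the passages that answer them.
query_emb = model.encode(
    "query: What is reward modelling?", normalize_embeddings=True
)
passage_embs = model.encode(
    ["passage: Reward modelling trains a model to predict human preferences ..."],
    normalize_embeddings=True,
)

# Dot product equals cosine similarity here, since embeddings are normalized
scores = query_emb @ passage_embs.T
```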
Result re-ranking
In continue.dev they use LLM re-ranking of results. I'm not sure about this one, but it's worth considering; a rough sketch is below.
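This is not continue.dev's actual implementation, just a naive pointwise sketch using OpenAI's chat API (any model would do):

```python
import openai

def llm_rerank(query: str, chunks: list[str], top_k: int = 5) -> list[str]:
    """Naive pointwise re-ranking: ask the LLM to score each chunk, then sort."""
    scored = []
    for chunk in chunks:
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            temperature=0,
            messages=[{
                "role": "user",
                "content": (
                    "Rate from 0 to 10 how well this passage answers the "
                    "question. Reply with a single number.\n"
                    f"Question: {query}\nPassage: {chunk}"
                ),
            }],
        )
        try:
            score = float(resp.choices[0].message.content.strip())
        except ValueError:
            score = 0.0  # unparseable reply: rank it last
        scored.append((score, chunk))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]
```

It costs a call per chunk, so it only makes sense on the top handful of candidates from the cheaper retrievers above.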