namesty opened this issue 11 months ago
Some Twitter nuggets:
- https://twitter.com/llama_index/status/1730320635279331524?t=8wzkjR83l5-3B92BDeWj2g&s=03
Introducing RAGs v3 🌟:
Build a RAG bot that can also search the web 🌐, to find answers that aren’t immediately in its corpus. Do this all in natural language, not code! 💬
Get a better experience than ChatGPT + Bing ⚡️
We used our integration with @metaphorsystems (a search engine designed for LLMs) to pull in relevant text from the internet.
- https://twitter.com/RLanceMartin/status/1727019896394207624
Deconstructing RAG
It can be hard to follow all of the RAG strategies that have come out over the past months.
I created a few guides to organize them into major themes and show how to build multi-modal / semi-structured RAG on complex docs (w/ images, tables).
Here are a few of the major themes.
- https://twitter.com/philipvollet/status/1726577467320623465?t=u1vntWMYd-FTqJGg21T8yg&s=03
Learning to Filter Context for Retrieval-Augmented Generation
- https://twitter.com/andrejusb/status/1726249442779730367?t=c2NgS8Ig7MsU00iiE2-3Nw&s=03
Vector Database Impact on RAG Efficiency: A Simple Overview 🚀
- https://twitter.com/ZainHasan6/status/1725942162171232704?t=qWbrpib5Nt_ud5B0Hqemug&s=03
Fine-tuned larger language models and longer context lengths eliminate the need for retrieval from external knowledge/vector databases, right? ... Not quite!!
- https://twitter.com/bindureddy/status/1724965498100806140?t=RcLiGnC5CPj7Zs9ozLHGHQ&s=03
The Secret Sauce Behind Effective Retrieval (RAG) - Choosing The Right Embeddings!
- https://twitter.com/JonSaadFalcon/status/1724998954402935061?t=2A6Mm9banNiP1MsqYkfEzw&s=03
Excited to announce ARES: An Automatic Evaluation Framework for Retrieval-Augmented Generation (RAG)!
ARES allows you to evaluate your RAG approaches with minimal human annotations while providing statistical guarantees for LLM-based scoring.
- https://twitter.com/CShorten30/status/1725151347756990563?t=sq4HoHYav--vlhV3fJJiYQ&s=03
Multi-Index RAG is probably one of my top 5 favorite ideas out there -- it just makes a ton of sense intuitively to symbolically break a search into a composition of sources such as "Blogs" + "Documentation" + "Videos" + "Podcast".
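The composition idea above can be sketched in a few lines: each source gets its own index, a query fans out across all of them, and hits are tagged with their source before merging. Everything here (the toy word-overlap scorer, the source names, the function names) is illustrative, not any library's actual API.

```python
# Minimal sketch of multi-index retrieval: one index per source,
# fan-out search, merge by score. The scoring is a toy stand-in
# for real embedding similarity.

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def multi_index_search(query: str, indexes: dict, k: int = 3):
    """Search every named index, tag hits with their source, merge by score."""
    hits = [
        (score(query, doc), source, doc)
        for source, docs in indexes.items()
        for doc in docs
        if score(query, doc) > 0
    ]
    hits.sort(reverse=True)
    return [(source, doc) for _, source, doc in hits[:k]]

indexes = {
    "blogs": ["how we tuned our rag pipeline", "a post about cooking"],
    "docs": ["rag pipeline configuration reference"],
    "videos": ["intro to vector databases"],
}
results = multi_index_search("rag pipeline tuning", indexes)
```

The payoff of the decomposition is that each source can later get its own chunking, embeddings, or filters while the query interface stays unchanged.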
- https://twitter.com/yi_ding/status/1721668276894527933?t=OXxTHHqdDTuwO2MEkdh-jg&s=03
A few thoughts on how OpenAI is implementing RAG in the new Assistants Retrieval tool before I was locked out.
- https://twitter.com/LangChainAI/status/1716482177331011833?t=hdH4BxxWdsdjXuiE6lCjPw&s=03
Step-back prompting
A new prompting technique from Google Deepmind, can be used to improve RAG results
Now in LangChain!
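The core of step-back prompting fits in a few lines: before retrieving, ask the model for a more generic "step-back" question, retrieve on both the original and the broader question, then answer with the combined context. The sketch below uses deterministic stand-ins for `llm` and `retrieve`; it is not LangChain's actual implementation, and the prompt wording is an assumption.

```python
# Hedged sketch of step-back prompting with toy llm/retrieve stand-ins.

STEP_BACK_PROMPT = (
    "Given the question: {question}\n"
    "Write a more generic question about the underlying principle."
)

def step_back_rag(question, llm, retrieve):
    # 1) Derive the broader "step-back" question.
    step_back_q = llm(STEP_BACK_PROMPT.format(question=question))
    # 2) Retrieve for both the specific and the broader question.
    context = retrieve(question) + retrieve(step_back_q)
    # 3) Answer against the merged context.
    return llm(f"Context: {' | '.join(context)}\nQuestion: {question}\nAnswer:")

# Deterministic stand-ins so the sketch runs end to end.
def fake_llm(prompt):
    if prompt.startswith("Given the question"):
        return "What is retrieval-augmented generation?"
    return "answer using: " + prompt.split("Context: ")[1].split("\n")[0]

def fake_retrieve(query):
    return [f"doc<{query.split()[0]}>"]

result = step_back_rag("How does RAG handle web search?", fake_llm, fake_retrieve)
```

The design point is that the broader question retrieves background documents the narrow question would miss.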
- https://twitter.com/jerryjliu0/status/1717692841571590538?t=e_pSY-lQkdxP6rdEmoDZAw&s=03
We’ve put out literally hundreds of guides on RAG and advanced retrieval in the past few months.
But if you’re just starting out, how do you best use them to build production LLM apps?
- https://twitter.com/jerryjliu0/status/1718368662955204739?t=HP4oMHJ1POuNsbM1yn1LgQ&s=03
Demystifying Advanced RAG Pipelines ☀️
Showing how to build an advanced RAG pipeline from scratch:
- Query decomposition/planning
- Hybrid vector/summary retrieval depending on the user question
- https://twitter.com/jerryjliu0/status/1705009416851087644?t=dTgyH5VCy9V8qjIm0x2TPQ&s=03
Here’s one cool trick to avoid lost in the middle problems in your RAG pipeline
Clickbaity title aside, the broader point here is that currently, low precision retrieval is still correlated with low RAG performance
- https://twitter.com/jerryjliu0/status/1734022664598294686?t=TcMDwrE5cq-RFvbosZnoBg&s=03
A good way to make your RAG pipeline more robust to retrieval failures is to do two-stage retrieval: 1) top-k embedding lookup, and then 2) reranking.
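The two-stage pattern above can be sketched as: a cheap first stage retrieves a wide top-k by similarity, then a costlier reranker (normally a cross-encoder) rescores only those k candidates. Both scorers below are toy stand-ins, not real embedding or cross-encoder models.

```python
# Two-stage retrieval sketch: wide, cheap recall then narrow, costly rerank.

def first_stage(query, corpus, k=10):
    """Stage 1: cheap similarity -- here, shared-word count."""
    q = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def rerank(query, candidates, top_n=3):
    """Stage 2: a costlier scorer applied only to the k candidates."""
    words = query.lower().split()
    def cross_score(doc):  # stand-in for a cross-encoder forward pass
        d = doc.lower()
        return sum(d.count(w) for w in words)
    return sorted(candidates, key=cross_score, reverse=True)[:top_n]

corpus = [
    "the reranker rescored the rag rag rag results",
    "rag pipeline basics",
    "unrelated gardening tips",
    "rag evaluation notes",
]
candidates = first_stage("rag reranker", corpus, k=3)
reranked = rerank("rag reranker", candidates, top_n=2)
```

The split matters because reranking every document would be prohibitively slow; reranking only the top-k buys most of the precision gain at a fraction of the cost.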
- https://twitter.com/jerryjliu0/status/1733192045202956768?t=uZlRlKTZIGDkom2jzodZ0g&s=03
An increasing use case in retrieval is not only fetching the top-k most similar chunks to queries, but exploring entity relationships.
- https://twitter.com/amanrsanger/status/1732145826963828997?s=03
At Cursor, we've built very high-quality retrieval datasets (for training embeddings/rerankers).
To do this, we use GPT-4 grading and the Trueskill ratings system (a better version of Elo)
Here’s how.. (1/10)
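The rating idea can be sketched without the actual TrueSkill library: pairwise "which result is better?" judgments (from GPT-4, per the tweet) get folded into per-document scores. TrueSkill models rating uncertainty explicitly; this dependency-free stand-in uses the simpler Elo update, so it is an approximation of the approach, not Cursor's pipeline.

```python
# Elo-style sketch: fold pairwise grader judgments into ratings.

def elo_update(winner, loser, ratings, k=32):
    """Standard Elo update: shift both ratings by the surprise of the result."""
    expected_win = 1 / (1 + 10 ** ((ratings[loser] - ratings[winner]) / 400))
    delta = k * (1 - expected_win)
    ratings[winner] += delta
    ratings[loser] -= delta

ratings = {"doc_a": 1000.0, "doc_b": 1000.0, "doc_c": 1000.0}

# Hypothetical judgments from the pairwise grader: (winner, loser).
judgments = [("doc_a", "doc_b"), ("doc_a", "doc_c"), ("doc_b", "doc_c")]
for winner, loser in judgments:
    elo_update(winner, loser, ratings)
```

After these three judgments the ratings order the documents `doc_a > doc_b > doc_c`, which is the training signal the embeddings/rerankers would learn from.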
- https://twitter.com/Prince_Canuma/status/1732022224494760078?t=AW_urfPex-yxf5_iZTnxgQ&s=03
Our new RAG pipeline now hits 99% accuracy with <5 sec latency.
We revamped our approach with Edge functions, smart filtering, and context-aware queries.
- https://github.com/microsoft/autogen/blob/main/notebook/agentchat_graph_modelling_language_using_select_speaker.ipynb