Here is a recent survey paper that gives a thorough overview of Retrieval-Augmented Generation (RAG).
```bibtex
@article{gao2023retrieval,
  title={Retrieval-augmented generation for large language models: A survey},
  author={Gao, Yunfan and Xiong, Yun and Gao, Xinyu and Jia, Kangxiang and Pan, Jinliu and Bi, Yuxi and Dai, Yi and Sun, Jiawei and Wang, Haofen},
  journal={arXiv preprint arXiv:2312.10997},
  year={2023}
}
```
Link: https://arxiv.org/abs/2312.10997
Abstract:
Large Language Models (LLMs) showcase impressive capabilities but encounter challenges like hallucination, outdated knowledge, and non-transparent, untraceable reasoning processes. Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases. This enhances the accuracy and credibility of the generation, particularly for knowledge-intensive tasks, and allows for continuous knowledge updates and integration of domain-specific information. RAG synergistically merges LLMs' intrinsic knowledge with the vast, dynamic repositories of external databases. This comprehensive review paper offers a detailed examination of the progression of RAG paradigms, encompassing the Naive RAG, the Advanced RAG, and the Modular RAG. It meticulously scrutinizes the tripartite foundation of RAG frameworks, which includes the retrieval, the generation and the augmentation techniques. The paper highlights the state-of-the-art technologies embedded in each of these critical components, providing a profound understanding of the advancements in RAG systems. Furthermore, this paper introduces up-to-date evaluation framework and benchmark. At the end, this article delineates the challenges currently faced and points out prospective avenues for research and development.
The main points of the paper that Carolina wants to share with the other members are summarized below.
⭐️Figure 2 shows an overview of the RAG architecture and how it works.
⭐️Table 1 is a summary of RAG methods. The columns are: Method, Retrieval Source, Retrieval Data Type, Retrieval Granularity, Augmentation Stage, and Retrieval Process. From this table, I learned:
⭐️Process in RAG: Indexing, Retrieval, Generation
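To make these three steps concrete, here is a minimal sketch of a naive RAG pipeline. The `embed` stub, the toy documents, and the prompt template are my own placeholders, not the paper's method; a real system would plug in an embedding model and an LLM.

```python
# Minimal naive-RAG sketch. `embed` is a toy stand-in for a real
# embedding model, and the final prompt would go to an actual LLM.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy embedding: hash words into a fixed-size bag-of-words vector.
    vec = np.zeros(64)
    for word in text.lower().split():
        vec[hash(word) % 64] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# 1. Indexing: split the corpus into chunks and embed each chunk.
documents = ["RAG retrieves external knowledge to ground answers.",
             "LLMs can hallucinate or rely on outdated knowledge."]
index = [(chunk, embed(chunk)) for chunk in documents]

# 2. Retrieval: embed the query and take the most similar chunks.
query = "Why do LLMs need retrieval?"
q = embed(query)
top_chunks = sorted(index, key=lambda item: -float(q @ item[1]))[:2]

# 3. Generation: prepend the retrieved chunks to the prompt for the LLM.
context = "\n".join(chunk for chunk, _ in top_chunks)
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # in a real pipeline, this prompt is sent to the LLM
```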
⭐️Challenges in RAG
⭐️Improving naive RAG
- pre-retrieval
  - enhancing data granularity, optimizing index structures, adding metadata, alignment optimization, and mixed retrieval
  - query rewriting, query transformation, and query expansion
- post-retrieval
  - re-ranking chunks and compressing the context (see the sketch after this list)
  - selecting the essential information, emphasizing critical sections, and shortening the context to be processed
- Adding modules
  - Search module: looks up data efficiently
  - Memory module: stores data in a form that makes search easier
  - Predict module: removes redundancy, such as duplicated or irrelevant information
  - Task Adapter module: handles zero-shot inputs without being task-specific
- New patterns
  - updating and improving the model by rewriting the prompt based on feedback
  - a more complex search unit (keyword, semantic, and vector search)
  - a flexible architecture that swaps out parts of the pipeline to adapt to each use case
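As referenced in the post-retrieval item above, here is a rough sketch of what query expansion, re-ranking, and context compression could look like. The paraphrase list, the word-overlap scorer, and the character budget are toy assumptions of mine, not techniques taken from the survey.

```python
from collections import Counter

def expand_query(query: str) -> list[str]:
    # Pre-retrieval: search with several variants of the query.
    # A real system would ask an LLM for paraphrases or sub-questions.
    return [query, f"In other words: {query}", f"Background on: {query}"]

def rerank(query: str, chunks: list[str]) -> list[str]:
    # Post-retrieval: reorder retrieved chunks by a relevance score.
    # A real system would score (query, chunk) pairs with a cross-encoder.
    q_tokens = Counter(query.lower().split())
    score = lambda c: sum((q_tokens & Counter(c.lower().split())).values())
    return sorted(chunks, key=score, reverse=True)

def compress(chunks: list[str], max_chars: int = 80) -> str:
    # Post-retrieval: keep only the top-ranked chunks that fit the budget,
    # shortening the context the generator has to process.
    kept, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break
        kept.append(chunk)
        used += len(chunk)
    return "\n".join(kept)

chunks = ["Retrieval reduces hallucination in LLMs.",
          "RAG retrieves external knowledge to ground answers.",
          "Bananas are yellow."]
print(compress(rerank("Why does retrieval help LLMs?", chunks)))
```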
I will keep reading this paper and add more information. I will also look into other sources for important concepts for RAG.
"evaluation framework and benchmark" can be an interesting point to dig in. I was first thinking that validation can be done by our intuition, but if there is any unbiased way, that would make our model stronger and more persuasive.
Continuing from the previous comment on the survey paper, here is a summary of Section 4, Task and Evaluation.
⭐️Table 2 lists sub-tasks, datasets, and methods. The sub-tasks are Question Answering (QA), Dialog, Information Extraction, Reasoning, and others. In our project, QA and Information Extraction are important.
⭐️Evaluation Target
Evaluation objectives: retrieval quality and generation quality.
RaLLe is an evaluation tool that applies such metrics to RAG applications.
⭐️Evaluation Aspects
⭐️Evaluation Benchmarks and Tools
How about we add these notes as Markdown in the repository? Maybe we can close this issue that way.
That is a great idea! I will make a markdown file for this topic.
This is a thread for Carolina to summarize her research on RAG. The purpose is to share the information among project members.