gomyway1216 / rag


Research Notes on RAG #26

Closed: carolina-museum closed this issue 11 hours ago

carolina-museum commented 1 month ago

This is a thread for Carolina to summarize her research on RAG. The purpose is to share the information among project members.

carolina-museum commented 1 month ago

Here is a recent survey paper that thoroughly overviews Retrieval-Augmented Generation (RAG).

@article{gao2023retrieval, title={Retrieval-augmented generation for large language models: A survey}, author={Gao, Yunfan and Xiong, Yun and Gao, Xinyu and Jia, Kangxiang and Pan, Jinliu and Bi, Yuxi and Dai, Yi and Sun, Jiawei and Wang, Haofen}, journal={arXiv preprint arXiv:2312.10997}, year={2023} } Link: https://arxiv.org/abs/2312.10997

Abstract:

Large Language Models (LLMs) showcase impressive capabilities but encounter challenges like hallucination, outdated knowledge, and non-transparent, untraceable reasoning processes. Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases. This enhances the accuracy and credibility of the generation, particularly for knowledge-intensive tasks, and allows for continuous knowledge updates and integration of domain-specific information. RAG synergistically merges LLMs' intrinsic knowledge with the vast, dynamic repositories of external databases. This comprehensive review paper offers a detailed examination of the progression of RAG paradigms, encompassing the Naive RAG, the Advanced RAG, and the Modular RAG. It meticulously scrutinizes the tripartite foundation of RAG frameworks, which includes the retrieval, the generation and the augmentation techniques. The paper highlights the state-of-the-art technologies embedded in each of these critical components, providing a profound understanding of the advancements in RAG systems. Furthermore, this paper introduces up-to-date evaluation framework and benchmark. At the end, this article delineates the challenges currently faced and points out prospective avenues for research and development.

The main points of this paper that Carolina wants to share with the other members are summarized below.

⭐️Figure 2 shows the overview of the RAG architecture. Here is how it works (a minimal code sketch follows the list):

  1. Context texts are loaded into the database. They are embedded so that the most relevant information can be searched for efficiently.
  2. A user asks a question (inputs a query).
  3. The query is embedded, and the most relevant contexts are retrieved from the database. The distance between the embedded query and each embedded context determines their similarity.
  4. The k most relevant contexts (in text) are combined with the user's query (in text) to generate a better question (query in text) to feed to an LLM. A better question is one that includes the retrieved context and instructs the model to say "I don't know" if the information is not present in the context.
  5. The query generated in the previous step is fed to an LLM, which gives the user an answer based on it. The user receives this answer.
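
To make the flow concrete, here is a minimal sketch of steps 1-5. It assumes the sentence-transformers library for embeddings and an in-memory NumPy array as the "database"; the final LLM call is left as a hypothetical stub, since any chat-completion API would fit there.

```python
# Minimal RAG sketch of steps 1-5. Assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Step 1: embed the context texts (our toy "database").
contexts = [
    "RAG retrieves relevant documents before generating an answer.",
    "LLMs can hallucinate when asked about facts outside their training data.",
    "Embeddings map texts to vectors so that similar texts end up close together.",
]
context_vecs = embedder.encode(contexts, normalize_embeddings=True)

# Steps 2-3: embed the query and rank contexts by cosine similarity
# (a dot product, since the vectors are normalized).
query = "What does RAG do before generating an answer?"
query_vec = embedder.encode([query], normalize_embeddings=True)[0]
top_k = np.argsort(context_vecs @ query_vec)[::-1][:2]

# Step 4: combine the k most relevant contexts with the query into a better prompt.
prompt = (
    "Answer the question using only the context below. "
    'If the information is not in the context, say "I don\'t know".\n\n'
    "Context:\n" + "\n".join(contexts[i] for i in top_k)
    + f"\n\nQuestion: {query}"
)

# Step 5: feed the prompt to an LLM (hypothetical stub; swap in your provider's API).
# answer = llm(prompt)
print(prompt)
```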

⭐️Table 1 is a summary of RAG methods. The columns are: Method, Retrieval Source, Retrieval Data Type, Retrieval Granularity, Augmentation Stage, and Retrieval Process. From this table, I learned:

⭐️Process in RAG: Indexing, Retrieval, Generation
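
As a concrete illustration of the indexing step, here is a hypothetical fixed-size chunker that splits documents into overlapping windows before embedding; the size and overlap values are illustrative assumptions, and real pipelines often split on sentence or token boundaries instead.

```python
# Hypothetical fixed-size chunker for the indexing step; parameters are illustrative.
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows before embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```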

⭐️Challenges in RAG

⭐️Improving naive RAG

  • Post-retrieval
    • rerank chunks and compress the context (see the reranking sketch after this list)
    • select the essential information, emphasize critical sections, and shorten the context to be processed
  • Add modules
    • Search module: look for data efficiently
    • Memory module: store data in a structured way so that searching is easier
    • Predict module: remove redundancies such as duplicated or extraneous information
    • Task Adapter module: allow zero-shot inputs, without being task-specific
  • New patterns
    • update and improve the model by rewriting the prompt using feedback
    • complex search units (keyword, semantic, vector)
    • flexible architecture, replacing parts of the pipeline to adapt to each use case
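
As an example of the post-retrieval reranking mentioned in the first bullet, here is a hedged sketch using a cross-encoder from sentence-transformers. The model name and library choice are assumptions on my part; the survey does not prescribe a specific reranker.

```python
# Rerank retrieved chunks with a cross-encoder (post-retrieval step).
# Assumes: pip install sentence-transformers; the model choice is illustrative.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, chunks: list[str], top_n: int = 3) -> list[str]:
    # Score each (query, chunk) pair jointly; this is more accurate than raw
    # embedding distance but too slow to run over the whole corpus.
    scores = reranker.predict([(query, c) for c in chunks])
    ranked = sorted(zip(scores, chunks), key=lambda p: p[0], reverse=True)
    return [c for _, c in ranked[:top_n]]
```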

I will keep reading this paper and add more information. I will also look into other sources for important concepts for RAG.

gomyway1216 commented 1 month ago

"evaluation framework and benchmark" can be an interesting point to dig in. I was first thinking that validation can be done by our intuition, but if there is any unbiased way, that would make our model stronger and more persuasive.

carolina-museum commented 1 month ago

Continuing from the previous comment on the recent survey paper, here is a summary of Section 4, Task and Evaluation.

@article{gao2023retrieval, title={Retrieval-augmented generation for large language models: A survey}, author={Gao, Yunfan and Xiong, Yun and Gao, Xinyu and Jia, Kangxiang and Pan, Jinliu and Bi, Yuxi and Dai, Yi and Sun, Jiawei and Wang, Haofen}, journal={arXiv preprint arXiv:2312.10997}, year={2023} } Link: https://arxiv.org/abs/2312.10997

⭐️Table 2 lists sub-tasks, datasets, and methods. The sub-tasks are Question Answering (QA), Dialog, Information Extraction, Reasoning, and others. In our project, QA and Information Extraction are important.

(Screenshot of Table 2 from the paper.)

⭐️Evaluation Target

RaLLe is an evaluation framework for RAG that uses the above metrics to evaluate RAG applications.
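
As a concrete example of the kind of unbiased measurement gomyway1216 asked about, here is a sketch of two standard retrieval metrics, hit rate@k and mean reciprocal rank (MRR). These are common choices in RAG evaluation, not necessarily the exact metrics the paper lists, and the function signatures are assumptions for illustration.

```python
# Standard retrieval metrics, assuming one known relevant document id per query.
def hit_rate_at_k(ranked_ids: list[list[str]], relevant: list[str], k: int = 5) -> float:
    """Fraction of queries whose relevant document appears in the top k results."""
    hits = sum(rel in ranked[:k] for ranked, rel in zip(ranked_ids, relevant))
    return hits / len(relevant)

def mrr(ranked_ids: list[list[str]], relevant: list[str]) -> float:
    """Average of 1/rank of the relevant document (0 if it was not retrieved)."""
    total = sum(1.0 / (ranked.index(rel) + 1)
                for ranked, rel in zip(ranked_ids, relevant) if rel in ranked)
    return total / len(relevant)
```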

Evaluation objectives:

⭐️Evaluation Aspects

⭐️Evaluation Benchmarks and Tools

huyfififi commented 1 month ago

How about we add these notes as a Markdown file in the repository? Maybe we can close this issue that way.

carolina-museum commented 1 month ago

That is a great idea! I will make a markdown file for this topic.