ManifoldRG / Manifold-KB

This repository serves as a knowledge base with key insights, details from other research and implementations to serve as references and one place to document various possible paths to achieve something.
GNU General Public License v3.0

AF Survey - Active Retrieval Augmented Generation #9

Open pranavguru opened 10 months ago

pranavguru commented 10 months ago

Paper title: Active Retrieval Augmented Generation (link to paper)

Estimated time to complete the review: by 08/28/23

If you are new to Manifold, here are some helpful links:

pranavguru commented 10 months ago

Active Retrieval Augmented Generation

Review author: Pranav Guruprasad

Summary:

The authors propose FLARE (Forward-Looking Active Retrieval augmented generation), a retrieval-augmented generation (RAG) method that iteratively predicts the upcoming sentence and uses that prediction to retrieve relevant documents, but only when the predicted sentence contains low-confidence tokens. Large Language Models (LLMs) are increasingly applied to complex tasks that involve long-form text generation, such as long-form QA, open-domain summarization, and Chain-of-Thought (CoT) reasoning. These tasks require gathering knowledge throughout the generation process, much as humans gradually gather information while writing papers, essays, or books. FLARE aims to do this efficiently and intelligently, improving on static and fixed-interval RAG methods.

The authors propose two FLARE methods, FLARE instruct and FLARE direct.

Inspired by Toolformer, the FLARE instruct method shows the LLM exemplars so that it generates a “[Search(query)]” token when additional information is required. This brings up 2 issues:

The authors address these 2 issues using two methods:

However, the authors find that, because black-box LLMs cannot be fine-tuned, queries generated by FLARE instruct through retrieval instructions might not be reliable. This leads them to propose FLARE direct.
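To make the “[Search(query)]” mechanism concrete, here is a minimal sketch of how a wrapper around the LLM might detect the marker in generated text and pull out the query. The function name `split_at_first_search` and the exact regular expression are assumptions for illustration, not the authors' implementation.

```python
import re
from typing import Optional, Tuple

# Assumed marker format, following the "[Search(query)]" pattern described above.
SEARCH_PATTERN = re.compile(r"\[Search\((.*?)\)\]")


def split_at_first_search(generated: str) -> Tuple[str, Optional[str]]:
    """Return (text before the first search marker, query).

    If the text contains no marker, return (generated, None).
    This helper is hypothetical and only illustrates the idea.
    """
    match = SEARCH_PATTERN.search(generated)
    if match is None:
        return generated, None
    prefix = generated[:match.start()].rstrip()
    query = match.group(1).strip()
    return prefix, query


text = "Joe Biden attended [Search(Joe Biden university)] the University of"
prefix, query = split_at_first_search(text)
print(prefix)  # text generated before the marker
print(query)   # query to send to the retriever
```

In a setup like this, generation would pause at the marker, the query would go to the retriever, and generation would resume from the prefix with the retrieved documents in context.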

In the FLARE direct method, at step t, the LLM first generates a temporary next sentence without conditioning on retrieved documents. If the LLM is confident about this temporary sentence, it is accepted as is and generation continues without retrieving additional information. If not, the temporary sentence is used to retrieve documents using 2 methods:

For retrieval, the authors use off-the-shelf retrievers such as BM25.
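The FLARE direct loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `generate_sentence` and `retrieve` are hypothetical callables standing in for the LLM and a retriever such as BM25, the confidence check is a plain minimum over token probabilities with an assumed `threshold`, the draft sentence is used verbatim as the query (a simplification of the query formulation step), and the sentence is assumed to be regenerated once documents are retrieved.

```python
from typing import Callable, List, Tuple

# A sentence is a list of (token, probability) pairs reported by the model.
Sentence = List[Tuple[str, float]]


def flare_direct(
    question: str,
    generate_sentence: Callable[[str, List[str]], Sentence],  # hypothetical LLM wrapper
    retrieve: Callable[[str], List[str]],                      # hypothetical retriever
    threshold: float = 0.5,                                    # assumed confidence cutoff
    max_sentences: int = 8,                                    # assumed safety limit
) -> str:
    answer: List[str] = []
    for _ in range(max_sentences):
        context = " ".join([question] + answer)
        # Step 1: draft the next sentence without any retrieved documents.
        draft = generate_sentence(context, [])
        if not draft:  # the model has nothing more to say
            break
        # Step 2: retrieve only if some token in the draft is low-confidence.
        if min(prob for _, prob in draft) < threshold:
            query = " ".join(token for token, _ in draft)  # simplified query
            docs = retrieve(query)
            # Step 3: regenerate the sentence conditioned on the documents.
            draft = generate_sentence(context, docs)
            if not draft:
                break
        answer.append(" ".join(token for token, _ in draft))
    return " ".join(answer)
```

Passing the LLM and retriever in as callables keeps the sketch independent of any particular model API, so the control flow can be tried out with simple stub functions.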

FLARE outperforms all baselines on a range of tasks and datasets, including 2WikiMultihopQA, StrategyQA, ASQA, ASQA-hint, and WikiAsp. The authors also conduct extensive ablation studies to analyze the importance of forward-looking retrieval, the importance of active retrieval, and the effectiveness of different query formulation methods.

Motivation:

Experiments and Results:

Limitations:

Significance:

Future work:

Related work:

Paper link: Active Retrieval Augmented Generation