manisnesan / til

Collection of "Today I Learned" scripts

RAFT #88

Open manisnesan opened 8 months ago

manisnesan commented 8 months ago

Source


manisnesan commented 7 months ago

https://x.com/llama_index/status/1772662480210198809?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ

RAFT - Retrieval Augmented Fine Tuning 🔥

RAFT offers a method to fine-tune pre-trained LLMs for specific domain RAG settings.

Conventional RAG is like an open-book exam: documents are retrieved from an index to provide context for answering queries. This makes it more effective than the closed-book setting, where the LLM relies solely on its pre-training and fine-tuning to respond to prompts, but it does not let the LLM learn the domain beforehand.

RAFT proposes a fine-tuning approach that prepares LLMs for domain-specific open-book exams: the model is trained to attend to relevant documents and to ignore irrelevant ones.

RAFT creates a synthetic dataset where each data sample consists of:

1. A question
2. Two docs: an "oracle" document relevant to the question, and a "distractor" document irrelevant to the question
3. An answer generated from the documents
4. A Chain-of-Thought explanation including excerpts from the relevant document
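The sample structure above can be sketched in a few lines of Python. This is an illustrative schema, not the paper's or the LlamaPack's exact format: the `build_raft_sample` helper and its field names are assumptions made here for clarity.

```python
import random

def build_raft_sample(question, oracle_doc, distractor_doc, cot_answer):
    """Assemble one RAFT-style training sample (illustrative schema).

    The prompt interleaves the oracle document with a distractor in
    random order, so the fine-tuned model must learn to attend to the
    relevant context and ignore the irrelevant one.
    """
    docs = [oracle_doc, distractor_doc]
    random.shuffle(docs)  # hide which document is the oracle
    context = "\n\n".join(f"Document: {d}" for d in docs)
    prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
    # The completion is a chain-of-thought answer quoting the oracle doc.
    return {"prompt": prompt, "completion": cot_answer}

sample = build_raft_sample(
    question="What does RAFT stand for?",
    oracle_doc="RAFT (Retrieval Augmented Fine Tuning) adapts LLMs to domain RAG.",
    distractor_doc="The Raft consensus algorithm manages replicated logs.",
    cot_answer="The first document states RAFT means Retrieval Augmented "
               "Fine Tuning. ##Answer: Retrieval Augmented Fine Tuning",
)
```

Shuffling the two documents matters: if the oracle always appeared first, the model could learn position rather than relevance.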

This synthetic dataset is then used to fine-tune the model, improving RAG performance.
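For the fine-tuning step, such samples are commonly serialized to JSONL (one JSON object per line), the input format most instruction-tuning pipelines accept. A minimal sketch, assuming the hypothetical `{"prompt": ..., "completion": ...}` fields from above rather than any specific trainer's schema:

```python
import json
from pathlib import Path

def write_jsonl(samples, path):
    """Write a list of dicts as JSON Lines, one sample per line."""
    with open(path, "w") as f:
        for s in samples:
            f.write(json.dumps(s) + "\n")

samples = [{"prompt": "Q1...", "completion": "A1..."}]
write_jsonl(samples, "raft_train.jsonl")

# Each line round-trips back to the original sample dict.
lines = Path("raft_train.jsonl").read_text().splitlines()
```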

LlamaPack: github.com/run-llama/llam…
Video Tutorial: youtube.com/watch?v=sqPckk…
Paper: arxiv.org/abs/2403.10131