manisnesan opened 8 months ago
https://x.com/llama_index/status/1772662480210198809?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ
RAFT - Retrieval Augmented Fine Tuning 🔥
RAFT offers a method to fine-tune pre-trained LLMs for domain-specific RAG settings.
Conventional RAG is like an open-book exam: documents are retrieved from an index at query time to provide context for answering. This is more effective than the closed-book setting, where the LLM relies solely on its pre-training and fine-tuning to respond to prompts, but it gives the model no chance to learn the domain beforehand.
RAFT proposes a fine-tuning approach that prepares LLMs for domain-specific open-book exams: the model is trained to attend to relevant documents and ignore irrelevant ones.
RAFT creates a synthetic dataset where each data sample consists of:
1️⃣ A question
2️⃣ Two kinds of documents: an "oracle" document relevant to the question, and "distractor" documents irrelevant to it
3️⃣ An answer generated from the documents
4️⃣ A chain-of-thought explanation that includes excerpts from the relevant documents
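For concreteness, here is a minimal Python sketch of what one such sample could look like as a data structure. The class and field names are illustrative assumptions, not the paper's exact schema; the `##begin_quote##`/`##end_quote##` markers mimic the style RAFT uses to cite excerpts in its chain-of-thought answers.

```python
from dataclasses import dataclass, field

@dataclass
class RAFTSample:
    """One synthetic RAFT-style training sample (illustrative schema)."""
    question: str
    oracle_document: str                 # document that actually answers the question
    distractor_documents: list[str] = field(default_factory=list)  # irrelevant documents
    cot_answer: str = ""                 # chain-of-thought answer quoting the oracle document

# A toy example; the content is made up for illustration.
sample = RAFTSample(
    question="When did the Golden Gate Bridge open to traffic?",
    oracle_document="The Golden Gate Bridge opened to vehicle traffic on May 28, 1937.",
    distractor_documents=[
        "The Eiffel Tower was completed in 1889 for the Exposition Universelle.",
    ],
    cot_answer=(
        "The context says ##begin_quote## opened to vehicle traffic on "
        "May 28, 1937 ##end_quote##, so the answer is: May 28, 1937."
    ),
)
```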
The resulting synthetic dataset is then used to fine-tune the model, improving its domain-specific RAG performance.
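Building on the `RAFTSample` sketch above, a sample could be flattened into a prompt/completion pair for supervised fine-tuning roughly like this; the prompt template and function name are assumptions for illustration, not the paper's exact format.

```python
import random

def to_finetuning_example(sample: RAFTSample, seed: int = 0) -> dict:
    """Flatten a RAFT sample into a prompt/completion pair for fine-tuning."""
    # Mix the oracle in with the distractors so the model cannot learn
    # to read the answer off a fixed document position.
    docs = [sample.oracle_document] + list(sample.distractor_documents)
    random.Random(seed).shuffle(docs)
    context = "\n\n".join(f"Document {i + 1}:\n{doc}" for i, doc in enumerate(docs))
    prompt = f"{context}\n\nQuestion: {sample.question}\nAnswer:"
    return {"prompt": prompt, "completion": sample.cot_answer}

print(to_finetuning_example(sample))
```

Shuffling the oracle in among the distractors is what forces the model to identify the relevant document rather than rely on its position in the context.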
LlamaPack: github.com/run-llama/llam…
Video Tutorial: youtube.com/watch?v=sqPckk…
Paper: arxiv.org/abs/2403.10131