ur-whitelab / LLMs-in-science

New paper: Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach #25

Open maykcaldas opened 1 month ago

maykcaldas commented 1 month ago

Paper: Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach

Authors: Zhuowan Li, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky

Abstract: Retrieval Augmented Generation (RAG) has been a powerful tool for Large Language Models (LLMs) to efficiently process overly lengthy contexts. However, recent LLMs like Gemini-1.5 and GPT-4 show exceptional capabilities to understand long contexts directly. We conduct a comprehensive comparison between RAG and long-context (LC) LLMs, aiming to leverage the strengths of both. We benchmark RAG and LC across various public datasets using three latest LLMs. Results reveal that when resourced sufficiently, LC consistently outperforms RAG in terms of average performance. However, RAG's significantly lower cost remains a distinct advantage. Based on this observation, we propose Self-Route, a simple yet effective method that routes queries to RAG or LC based on model self-reflection. Self-Route significantly reduces the computation cost while maintaining a comparable performance to LC. Our findings provide a guideline for long-context applications of LLMs using RAG and LC.

Link: https://arxiv.org/abs/2407.16833
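
For a concrete picture of the routing idea described in the abstract, here is a minimal sketch of how a Self-Route-style router could be wired up. It assumes hypothetical `call_llm` and `retrieve_chunks` helpers and illustrative prompt wording; it is a sketch of the general technique, not the authors' implementation.

```python
# Minimal sketch of a Self-Route-style router (not the paper's code).
# `call_llm` and `retrieve_chunks` are hypothetical helpers supplied by the caller.

def self_route(query: str, documents: list[str], call_llm, retrieve_chunks, top_k: int = 5) -> str:
    """Try RAG first; fall back to long-context if the model declares the chunks insufficient."""
    # Step 1 (RAG): answer from the top-k retrieved chunks and let the model
    # reflect on whether those chunks are enough to answer the question.
    chunks = retrieve_chunks(query, documents, top_k=top_k)
    rag_prompt = (
        "Answer the question using only the passages below. "
        "If the passages are insufficient, reply exactly 'unanswerable'.\n\n"
        "Passages:\n" + "\n".join(chunks) + f"\n\nQuestion: {query}"
    )
    rag_answer = call_llm(rag_prompt)

    # Step 2 (long-context fallback): only queries the model judged unanswerable
    # from the retrieved chunks pay the cost of the full long-context call.
    if "unanswerable" in rag_answer.lower():
        lc_prompt = (
            "Answer the question using the full document below.\n\n"
            "Document:\n" + "\n".join(documents) + f"\n\nQuestion: {query}"
        )
        return call_llm(lc_prompt)
    return rag_answer
```

This mirrors the cost argument in the abstract: most queries are served by the cheaper RAG path, and only the self-reported failures fall back to the expensive long-context call.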

Reasoning: To produce the is_lm_paper label, we start by examining the title and abstract for key terms and concepts related to language models. The title mentions "Retrieval Augmented Generation" and "Long-Context LLMs," both of which are directly related to language models. The abstract further discusses the comparison between RAG and long-context LLMs, specifically mentioning models such as Gemini-1.5 and GPT-4, which are well-known language models. The focus on benchmarking these models and on proposing a method to optimize their use also indicates a strong emphasis on language models.
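
As an illustration only, the term-matching step described above could be approximated with a simple check like the one below; the `is_lm_paper` name and the term list are assumptions for this sketch, not the repository's actual classifier.

```python
# Illustrative approximation of the keyword check described in the reasoning;
# the term list and function name are assumptions, not the repo's classifier.

LM_TERMS = (
    "language model", "llm", "gpt", "gemini",
    "retrieval augmented generation", "long-context",
)

def is_lm_paper(title: str, abstract: str) -> bool:
    """Return True if the title or abstract mentions language-model-related terms."""
    text = f"{title} {abstract}".lower()
    return any(term in text for term in LM_TERMS)
```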