## Query Translation

First, consider the user input(s) to your RAG system. Ideally, a RAG system can handle a wide range of inputs, from poorly worded questions to complex multi-part queries. Using an LLM to review and optionally modify the input is the central idea behind query translation. This serves as a general buffer, optimizing raw user inputs for your retrieval system. This can be as simple as extracting keywords or as involved as generating multiple sub-questions for a multi-part query.
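For concreteness, here is a minimal sketch of a query-translation step placed in front of retrieval. The `llm` helper is an assumed stand-in for any prompt-in, string-out completion function, not a specific library API; swap in your own model client.

```python
# A minimal query-translation sketch. `llm` is an assumed stand-in for any
# text-completion function (prompt in, string out); plug in your own client.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your chat/completion model here")

def translate_query(raw_input: str) -> str:
    """Rewrite a raw user input into a retrieval-friendly query."""
    prompt = (
        "Rewrite the following user input as a clear, self-contained search "
        "query for a document retrieval system. Keep key entities and "
        "constraints; drop filler.\n\n"
        f"User input: {raw_input}\n"
        "Rewritten query:"
    )
    return llm(prompt).strip()
```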
| Technique | When to use | Description |
|---|---|---|
| Decomposition | When a question can be broken down into smaller subproblems. | Decompose the question into a set of sub-questions, which can be solved either sequentially (use the answer to the first, plus retrieval, to answer the second) or in parallel (consolidate each answer into a final answer). |
| Step-back prompting | When a higher-level conceptual understanding is required. | First prompt the LLM to ask a generic step-back question about higher-level concepts or principles, and retrieve relevant facts about them. Use this grounding to help answer the user question ([paper](https://arxiv.org/abs/2310.06117)). |
| HyDE | When you have trouble retrieving relevant documents using the raw user input. | Use an LLM to convert the question into a hypothetical document that answers it. Embed the hypothetical document and use it to retrieve real documents, on the premise that doc-to-doc similarity search can produce more relevant matches ([paper](https://arxiv.org/abs/2212.10496)). |

Minimal code sketches of each technique follow below.
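First, a sketch of decomposition in the sequential style, reusing the assumed `llm` helper from the sketch above. `retriever` is likewise an assumed stand-in for your retrieval step (query in, list of relevant text chunks out), not a specific library API.

```python
# Sequential decomposition sketch, reusing the assumed `llm` helper above.
# `retriever` is an assumed stand-in for your retrieval step.
def retriever(query: str) -> list[str]:
    raise NotImplementedError("plug in your vector store / search here")

def answer_by_decomposition(question: str) -> str:
    # 1. Ask the LLM to break the question into ordered sub-questions.
    subs = llm(
        "Break this question into 2-4 simpler sub-questions, one per line:\n"
        f"{question}"
    ).splitlines()
    # 2. Answer each sub-question in turn, feeding earlier Q&A pairs forward
    #    so later sub-questions can build on earlier answers.
    history = ""
    for sub in (s.strip() for s in subs if s.strip()):
        context = "\n".join(retriever(sub))
        answer = llm(
            f"Context:\n{context}\n\nPrior Q&A:\n{history}\n\n"
            f"Question: {sub}\nAnswer:"
        )
        history += f"Q: {sub}\nA: {answer}\n"
    # 3. Consolidate the sub-answers into a final answer.
    return llm(f"Using these Q&A pairs:\n{history}\nAnswer: {question}")
```

The parallel variant answers every sub-question independently (dropping `history`) and consolidates at the end; it trades the ability to build on earlier answers for lower latency.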
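Next, a step-back prompting sketch under the same assumptions (the `llm` and `retriever` stand-ins defined above):

```python
# Step-back prompting sketch, reusing the `llm` and `retriever` stand-ins.
def answer_with_step_back(question: str) -> str:
    # 1. Ask a more generic, higher-level version of the question.
    step_back = llm(
        "Write a more generic 'step-back' question about the underlying "
        f"concepts or principles behind: {question}"
    )
    # 2. Retrieve for both the original and the step-back question.
    docs = retriever(question) + retriever(step_back)
    # 3. Answer the original question grounded in both sets of facts.
    context = "\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")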
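Finally, a HyDE sketch. Here `embed` and `search_by_vector` are hypothetical stand-ins for an embedding model and a vector store's vector-search call; substitute whichever embedding model and store you use.

```python
# HyDE sketch. `embed` and `search_by_vector` are hypothetical stand-ins for
# an embedding model and a vector store; `llm` is the stand-in defined above.
def embed(text: str) -> list[float]:
    raise NotImplementedError("plug in your embedding model here")

def search_by_vector(vector: list[float], k: int = 4) -> list[str]:
    raise NotImplementedError("plug in your vector store here")

def retrieve_with_hyde(question: str, k: int = 4) -> list[str]:
    # 1. Have the LLM write a short passage that would answer the question.
    hypothetical_doc = llm(
        f"Write a short passage that answers this question:\n{question}"
    )
    # 2. Embed the hypothetical document (not the question) and search with
    #    it, betting that doc-to-doc similarity beats question-to-doc.
    return search_by_vector(embed(hypothetical_doc), k=k)
```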
> **Tip:** See our RAG from Scratch videos for a few different specific approaches.