EdgeChains.js: production-friendly Generative AI for TypeScript/JavaScript. Based on Jsonnet; works anywhere that WebAssembly does. Prompts live declaratively, "outside code in config". Kubernetes- and edge-friendly. Compatible with OpenAI GPT, Gemini, Llama 2, Anthropic, Mistral, and others.
When deciding what to retrieve, we argue that it is important to consider what LMs intend to generate in the future, as the goal of active retrieval is to benefit future generations. Therefore, we propose anticipating the future by generating a temporary next sentence, using it as a query to retrieve relevant documents, and then regenerating the next sentence conditioned on the retrieved documents. Combining the two aspects, we propose Forward-Looking Active REtrieval augmented generation (FLARE), as illustrated in Figure 1. FLARE iteratively generates a temporary next sentence, uses it as the query to retrieve relevant documents if it contains low-probability tokens, and regenerates the next sentence until it reaches the end.
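The iterative loop described above can be sketched in code. This is a minimal, illustrative sketch only: the `generate` and `retrieve` interfaces, the `CONFIDENCE_THRESHOLD` value, and all type names are hypothetical stand-ins for a real LM and retriever, not part of FLARE's actual implementation.

```typescript
// Hypothetical token with a model-assigned probability.
interface Token {
  text: string;
  prob: number;
}

interface Sentence {
  tokens: Token[];
}

// Hypothetical model and retriever interfaces: `generate` produces a
// tentative next sentence given the context and any retrieved documents
// (or null when generation is finished); `retrieve` maps a query string
// to a list of relevant documents.
type Generate = (context: string, docs: string[]) => Sentence | null;
type Retrieve = (query: string) => string[];

// Assumed threshold; in practice this would be tuned.
const CONFIDENCE_THRESHOLD = 0.5;

function sentenceText(s: Sentence): string {
  return s.tokens.map((t) => t.text).join(" ");
}

// Core FLARE-style loop: generate a tentative next sentence; if it
// contains any low-probability token, use the tentative sentence as a
// retrieval query and regenerate conditioned on the retrieved documents;
// append the result and repeat until the model signals the end.
function flare(prompt: string, generate: Generate, retrieve: Retrieve): string {
  let output = prompt;
  for (;;) {
    const tentative = generate(output, []); // first pass: no retrieval
    if (tentative === null) break; // end of generation
    let next = tentative;
    const uncertain = tentative.tokens.some((t) => t.prob < CONFIDENCE_THRESHOLD);
    if (uncertain) {
      const docs = retrieve(sentenceText(tentative)); // tentative sentence as query
      const regenerated = generate(output, docs);
      if (regenerated !== null) next = regenerated;
    }
    output += " " + sentenceText(next);
  }
  return output;
}
```

With a toy model that is uncertain about a fact until documents are supplied, the loop retrieves once and replaces the low-confidence sentence with a regenerated one.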
https://arxiv.org/pdf/2305.06983.pdf