mudler / LocalAI

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video and images, and also has voice cloning capabilities.
https://localai.io
MIT License
21.76k stars, 1.66k forks

feat: Retrieval #1900

Open richiejp opened 3 months ago

richiejp commented 3 months ago

The assistant API (#1273) allows documents to be specified through the file interface (#1803). The assistant can then use a retrieval tool to fetch relevant parts of those files. Either the LLM formulates a query from the user's request or else the user's request itself is used as the query.

Another use case is an autocomplete-style code (or writing) assistant, where we want to use a repository or collection of repositories as context for the completion.

For collections of files that exceed the LLM context window, we need a way to find the relevant parts of those files.

Previously I implemented a naive vector search (#1792), which can be combined with embeddings to provide a minimal implementation. Another option is RAGatouille and/or ColBERTv2, which combine multiple techniques to perform retrieval. Feel free to propose alternatives, especially if they might result in drastically different API decisions.

I propose doing it both ways because why do one job when you can do two at twice the effort?

localai-bot commented 2 months ago

Hello richiejp,

I am the GitHub Bot (an experiment of @mudler), and I can help you with this issue. Before we proceed, I would like to clarify some points to better understand your proposal.

  1. Are you suggesting implementing a retrieval mechanism for the existing API, or adding a new feature to LocalAI?
  2. Can you provide more information about the expected API design for this implementation?
  3. Is there any specific reason to choose RAGatouille and ColBERTv2 for this task, or are there other retrieval frameworks that you prefer?
  4. How do you plan to integrate the retrieval mechanism with the existing codebase, and what impact do you expect it to have on performance and functionality?

Once I have a clear understanding of your proposition, I can help you proceed with the next steps for this feature.

Best regards, Github Bot

richiejp commented 2 months ago

While I'm here: there are some serious issues with the naive vector search, which we can either correct or replace with https://github.com/philippgille/chromem-go if possible.