mudler / LocalAI

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.
https://localai.io
MIT License
21.69k stars 1.66k forks source link

feat: Assistant API #1273

Open mudler opened 7 months ago

mudler commented 7 months ago

OpenAI released the Assistant API. https://platform.openai.com/docs/assistants/overview

This thread is to collect thoughts on how to integrate this - it seems we have already all the needed pieces, but it's needed to stich things together to expose a similar API.

I didn't had a closer look at the API yet - not on my radar until #1126 is completed

For reference, the OpenAI API spec: https://github.com/openai/openai-openapi

localai-bot commented 7 months ago

:warning::warning::warning::warning::warning:

Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!

_but.... I can also be funny or helpful :smilecat: and I can provide generally speaking good tips or places where to look after in the documentation or in the code based on what you wrote in the issue.

Don't engage in conversation with me, I don't support (yet) replying!

:warning::warning::warning::warning::warning:

To integrate the OpenAI Assistant API with LocalAGI, we can use the local LLMs that are already loaded in memory to generate completions and edits. We can also use the existing chat interface to interact with the user and receive instructions for generating completions and edits. Here's a possible approach:

  1. Add a new endpoint to LocalAGI to generate completions and edits using the local LLMs. This endpoint should take in the instruction, input text, and temperature parameters, and return the generated completion or edit as JSON response.
  2. Modify the chat interface to include buttons for generating completions and edits, and send the appropriate request to the new endpoint when the user clicks on these buttons.
  3. Add a new command-line argument to LocalAGI to specify the OpenAI Assistant API key, if needed. This will allow users to use their own OpenAI API key to access the Assistant API directly from LocalAGI.
  4. Test and document the integration thoroughly to ensure it works as expected and is easy to use for users with different levels of technical expertise. Overall, this integration should provide a seamless experience for users who want to use LocalAGI to generate completions and edits using OpenAI's LLMs, while also allowing them to access the full Assistant API directly from LocalAGI if needed.

Sources:

muka commented 6 months ago

In the effort to contribute on this, I started collecting the requirements and functionalities offered by the Assistant API.

Assistant functionalities

Links

mudler commented 4 months ago

First PR in that direction adding File API: https://github.com/mudler/LocalAI/pull/1703

richiejp commented 4 months ago

I have a few ideas for vector search in order of my personal preference:

  1. Embedded vector store and search
  2. Host one of the many vector DBs in a backend
  3. Connect to an external store

All could exist at once. I like the first one for the use-case where someone has a limited number of documents and/or search volume. Also for being the default choice in LocalAI without incurring a lot of maintenance or bloat. It should be relatively simple because:

So I went ahead and started an experiment external to LocalAI: https://github.com/richiejp/badger-cybertron-vector/blob/main/main.go

It's probably really slow due to copying the keys, lack of parallel execution and such, but it works. I expect these things can be optimized. The question is whether to go with BadgerDB or a pure in memory implementation?

BTW Cybertron is pretty cool, that could be a new backend.

richiejp commented 3 months ago

Perhaps instead of, or in addition to what I have done with the basic vector search. We could have a higher level API which is backed by https://github.com/bclavie/ragatouille as a starting point.

The reason is that it seems colBERT v2 is far superior to basic cosine similarity search, but it is difficult to unpack it and get it to work with some arbitrary vector database. It's possibly the wrong level of abstraction for LocalAI to be working at even internally.

Opinions on implementing Ragatouille or colBERT (https://github.com/stanford-futuredata/ColBERT) as a backend?

richiejp commented 3 months ago

I'm mainly thinking of the indexing and retrieval Ragatuille APIs https://github.com/bclavie/ragatouille?tab=readme-ov-file#%EF%B8%8F-indexing

christ66 commented 3 months ago

+1 for Ragatouille