Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.
https://useanything.com
MIT License

Swapping OpenAI with local LLM? #83

Closed · kfeeeeee closed this issue 8 months ago

kfeeeeee commented 1 year ago

Hi,

Basically the title. The intro suggests that OpenAI access can be replaced with locally running models (maybe with oobabooga's OpenAI API extension?). Anyway, I can't seem to find instructions / env settings for it. Could you tell me if it has been implemented already?

Nasnl commented 1 year ago

Not yet as far as I can see...

timothycarambat commented 1 year ago

Not currently. The issue with LocalLLM or other local LLM programs is that you need an API-accessible endpoint, much like what GPT4All provides. There are some API wrappers for LocalLLM that people have built that would work in this instance.

More work needs to be done around this - open to PRs. Embeddings are another issue altogether.
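
As a rough illustration of that kind of wrapper - a minimal sketch only, using Flask, where generate_locally() and the port are placeholders rather than anything GPT4All or this repo actually ships - an OpenAI-style chat completions route in front of a local model could look like:

# Minimal sketch of an OpenAI-compatible wrapper around a local model.
# generate_locally() is a stub for whatever local inference you run
# (llama.cpp, GPT4All bindings, etc.); the port is arbitrary.
from flask import Flask, request, jsonify

app = Flask(__name__)

def generate_locally(messages):
    # Call your local model here; return plain text.
    return "stubbed local response"

@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    body = request.get_json()
    reply = generate_locally(body.get("messages", []))
    # Return only the fields an OpenAI-style client typically reads.
    return jsonify({
        "object": "chat.completion",
        "choices": [{"index": 0, "message": {"role": "assistant", "content": reply}}],
    })

if __name__ == "__main__":
    app.run(port=5001)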

kfeeeeee commented 1 year ago

Okay, thank you. Maybe a good starting point would be oobabooga's text-generation-webui, since it's capable of mimicking an OpenAI API on port 5001.
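
To sketch what that would mean on the client side (assuming the pre-1.0 openai Python package and text-generation-webui's OpenAI-compatible extension listening on port 5001; the model name below is a placeholder):

import openai

# Point the pre-1.0 openai client at the local OpenAI-compatible endpoint.
openai.api_base = "http://localhost:5001/v1"
openai.api_key = "not-needed-locally"  # local backends usually ignore the key

resp = openai.ChatCompletion.create(
    model="local-model",  # placeholder; the local backend decides what actually runs
    messages=[{"role": "user", "content": "Hello from a local backend"}],
)
print(resp["choices"][0]["message"]["content"])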

simonSlamka commented 1 year ago

> Not currently. The issue with LocalLLM or other local LLM programs is that you need an API-accessible endpoint, much like what GPT4All provides. There are some API wrappers for LocalLLM that people have built that would work in this instance.
>
> More work needs to be done around this - open to PRs. Embeddings are another issue altogether.

Gradio usually comes with a working API

AntonioCiolino commented 1 year ago

I could suggest chromadb for local embeddings - it's already set up, and I've gotten it to work with two Docker instances (localhost changes to host.docker.internal). Getting a wrapper to work with LocalLLM is something I haven't tried yet, though. If that works, this could become a fully autonomous solution for document search and chat, though possibly slow.
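
For what that container-to-container hop looks like in practice - a sketch only, assuming a recent chromadb Python client that exposes HttpClient and a Chroma server published on port 8000 - the call from inside the other container would be roughly:

import chromadb

# From inside a container, "localhost" must become "host.docker.internal".
client = chromadb.HttpClient(host="host.docker.internal", port=8000)

collection = client.get_or_create_collection("documents")
collection.add(
    ids=["doc-1"],
    documents=["some locally embedded text"],  # embedded by Chroma's default local model
)
print(collection.query(query_texts=["locally embedded"], n_results=1))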

AntonioCiolino commented 1 year ago

I got close, but the chromadb code currently has logic in it that force-calls OpenAI.

AntonioCiolino commented 1 year ago

I should be more clear - I was using LocalAI to route the OpenAI API calls through to a local LLM. After commenting out the /moderations endpoint (which LocalAI doesn't handle), I was eventually able to call LocalAI and get it to return. However, I discovered that the embeddings - which I also overrode to use BERT - don't work in AnythingLLM, as it's expecting embeddings to be OpenAI only. In the process I've managed to mess some of the indexing up; new files aren't getting properly found. I'm probably going to have to wipe this all clean to purge out the BERTs :) LocalAI can call OpenAI and get the OpenAI embeddings, so it's not a complete failure; I was also able to get GPT4All to connect and respond to the questions.

To summarize: it does work, with some tweaking, but unless this project supports other types of embeddings (which I don't think is its calling), a fully non-connected local LLM setup isn't likely.

Too bad; I really want to do the stuff completely offline.
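
For context on the routing described above, a rough sketch (pre-1.0 openai client pointed at LocalAI's default port 8080; the model names are placeholders that depend entirely on how LocalAI is configured):

import openai

openai.api_base = "http://localhost:8080/v1"  # LocalAI's OpenAI-compatible endpoint
openai.api_key = "sk-local"                   # LocalAI ignores the key

# Chat completions go through once /moderations calls are removed on the caller side.
chat = openai.ChatCompletion.create(
    model="gpt4all-j",  # placeholder: whatever model LocalAI is set up to serve
    messages=[{"role": "user", "content": "Answer from a local model"}],
)

# This is the part AnythingLLM assumed was OpenAI-only at the time.
emb = openai.Embedding.create(model="bert-embeddings", input=["some text"])
print(len(emb["data"][0]["embedding"]))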

ishaan-jaff commented 9 months ago

Hi @AntonioCiolino @simonSlamka @kfeeeeee @timothycarambat - I'm the maintainer of LiteLLM. We allow you to create a proxy server to call 100+ LLMs, and I think it can solve your problem (I'd love your feedback if it does not).

Try it here: https://docs.litellm.ai/docs/proxy_server

Using LiteLLM Proxy Server

import openai
openai.api_base = "http://0.0.0.0:8000/" # proxy url
print(openai.ChatCompletion.create(model="test", messages=[{"role":"user", "content":"Hey!"}]))

Creating a proxy server

Ollama models

$ litellm --model ollama/llama2 --api_base http://localhost:11434

Hugging Face Models

$ export HUGGINGFACE_API_KEY=my-api-key # [OPTIONAL]
$ litellm --model huggingface/bigcode/starcoder

Anthropic

$ export ANTHROPIC_API_KEY=my-api-key
$ litellm --model claude-instant-1

Palm

$ export PALM_API_KEY=my-palm-key
$ litellm --model palm/chat-bison

danielnbalasoiu commented 8 months ago

@ishaan-jaff have you managed to get anything-llm running using the proxy workaround you suggested? If so, can you describe the steps?

timothycarambat commented 8 months ago

@ishaan-jaff Would really like to see Node.js support for all the models the Python client supports.

franzbischoff commented 8 months ago

PR #335

timothycarambat commented 8 months ago

LMStudio integration is now live: f499f1ba59f2e9f8be5e44c89a951e859382e005

timothycarambat commented 8 months ago

Moving conversation to #118