Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
https://anythingllm.com
MIT License

Custom LLM #440

Closed KPCOFGS closed 11 months ago

KPCOFGS commented 11 months ago

When I tried to use my own custom Llama model that I downloaded from the internet, I see the warning "Using a locally hosted LLM is experimental. Use with caution." When I try to load a model by clicking "waiting for models," nothing happens. If I want to add a custom LLM manually, do I need to put it under storage/models/downloaded?

timothycarambat commented 11 months ago

Since the custom model selection runs on the CPU (or on the GPU if one is available, or when using Apple M-series chips), the next best option would be a purpose-built tool like LocalAI, where you can run the service with GPU acceleration and custom models (GGUF format).

You will likely get better performance this way by offloading the inferencing somewhere with a GPU.

It is also worth noting that running AnythingLLM in Docker will force CPU-based inferencing and will not use your underlying host system. Only in development will the local model use the underlying host system, hence why this is still experimental :)
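
For anyone looking for a starting point, here is a minimal sketch of running LocalAI in Docker with a local models folder mounted in. The image tag and flags are illustrative; check the LocalAI docs for the current quickstart and for GPU-enabled images.

```sh
# Illustrative only: image name/tag and flags may differ between LocalAI releases.
# Put your GGUF model file in ./models first.
docker run -p 8080:8080 \
  -v $PWD/models:/models \
  --rm quay.io/go-skynet/local-ai:latest \
  --models-path /models

# Then verify the model shows up in the OpenAI-compatible API:
curl http://localhost:8080/v1/models
```

Once the /v1/models call returns your model, point AnythingLLM's LocalAI base path at that endpoint.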

songkq commented 11 months ago

@timothycarambat Hi, I'm also confused about why the GGUF model cannot be loaded into AnythingLLM. When running AnythingLLM, Model Selection is always empty. Could you please give some advice?


My env was built locally from source following the tutorial.

My env: Mac M1. Model path: /app/server/storage/models/ggml-model-q4_0.gguf

KPCOFGS commented 11 months ago

Hi! I think you will need to put the model under the /server/storage/models/downloaded folder.
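
For example, from the repository root (the GGUF filename and download location below are just placeholders):

```sh
# Create the folder AnythingLLM scans for local models and copy the GGUF file into it.
mkdir -p server/storage/models/downloaded
cp ~/Downloads/ggml-model-q4_0.gguf server/storage/models/downloaded/
```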

songkq commented 11 months ago

@KPCOFGS Thanks. It works after changing the model_path to /app/server/storage/models/downloaded/ggml-model-q4_0.gguf.

songkq commented 11 months ago

@KPCOFGS @timothycarambat Hi, could you please help me with this issue? When I send a chat message to the local LLM (https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v0.6), it does not respond. After resetting the LLM preference, the Docker container crashes when I send a chat message.

timothycarambat commented 11 months ago

What do the docker logs say?

songkq commented 11 months ago

@timothycarambat Where can I find the logs for debugging?

Even when pointing at a LocalAI LLM server, AnythingLLM cannot establish a connection to it. It also seems that AnythingLLM gets no response back from the LocalAI server. It only works when I set the OpenAI API as the LLM provider.

LocalAI LLM server: (screenshot)

Local connection test:

```sh
curl http://localhost:8080/v1/models
{"object":"list","data":[{"id":"ggml-gpt4all-j","object":"model"}]}
```

anything-llm with LocalAI server: (screenshots)

anything-llm with OpenAI API Key: (screenshots)

timothycarambat commented 11 months ago

The Docker logs are available however you run AnythingLLM, so `docker logs <CONTAINER_ID>`, or if you use Docker Desktop you can see them by clicking on the running container.
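
For instance (the container ID comes from docker ps; the compose variant assumes the stack was started with docker compose):

```sh
# Find the running AnythingLLM container, then follow its logs.
docker ps
docker logs -f <CONTAINER_ID>

# If the stack was started with docker compose, from the compose directory:
docker compose logs -f
```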

nihs269 commented 6 months ago

(screenshot) Where can I find this model?