mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

error: llama: model does not exist gpt: model does not exist gpt2: model does not exist stableLM: model does not exist #68

Closed: limcheekin closed this issue 12 months ago

limcheekin commented 1 year ago

Hi there,

First of all, I managed to compile the binary using make build; it produced an executable file named local-ai.

I started the local-ai using the following command:

./local-ai --f16 true --debug true --threads 2 --models-path ./models --context-size 2048

The app started properly on my Ubuntu Linux machine with the following output:

 ┌───────────────────────────────────────────────────┐
 │                   Fiber v2.42.0                   │
 │               http://127.0.0.1:8080               │
 │       (bound on host 0.0.0.0 and port 8080)       │
 │                                                   │
 │ Handlers ............ 10  Processes ........... 1 │
 │ Prefork ....... Disabled  PID ............. 31788 │
 └───────────────────────────────────────────────────┘

For your information, there's only one model file in the ./models directory: ggml-gpt4all-j.bin

However, when I send the following request to the endpoint:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "ggml-gpt4all-j.bin",            
        "prompt": "What is Flutter?",
        "temperature": 0.7
}'

It sent me the following response:

{"error":"llama: model does not exist gpt: model does not exist gpt2: model does not exist stableLM: model does not exist"}

May I know what I missed? How do I find out what's wrong? Let me know if you need more information.

Please advise. Thank you.

normen commented 1 year ago

It seems none of the loaders can load your model. Also, a chat completion request doesn't have a "prompt"; it has a "messages" array of objects with role/content fields, but I don't think that's the issue here.
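
For reference, a chat completion request would look roughly like this (just a sketch reusing the model filename from your message):

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "ggml-gpt4all-j.bin",
        "messages": [{"role": "user", "content": "What is Flutter?"}],
        "temperature": 0.7
}'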

limcheekin commented 1 year ago

Thanks for the reply. I just found out there are two endpoints.

I tried the http://localhost:8080/v1/completions endpoint, but the same error persists.
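
For reference, the request against that endpoint was roughly of this shape (a sketch with the same model file, using "prompt" instead of "messages"):

curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
        "model": "ggml-gpt4all-j.bin",
        "prompt": "What is Flutter?",
        "temperature": 0.7
}'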

localai-bot commented 1 year ago

:warning::warning::warning::warning::warning:

Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!

but... I can also be funny or helpful :smile_cat: and I can provide, generally speaking, good tips or places to look in the documentation or in the code based on what you wrote in the issue.

Don't engage in conversation with me, I don't support (yet) replying!

:warning::warning::warning::warning::warning:

It seems like the issue is that the model file ggml-gpt4all-j.bin is not recognized by the local-ai server. One possible solution is to try downloading a different model file from the GPT family, such as gglm-base.bin, and see if it works. Another possibility is to check if the model file is in the correct format and has the right size (usually around 200MB). If the issue persists, it might be helpful to check the server logs for more information on the error.

Sources:

Aisuko commented 12 months ago

Hi guys. We have now migrated to support the model gallery. Here is an example showing how to check whether the model was loaded correctly: https://localai.io/basics/getting_started/
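
As a quick check (a sketch assuming the default port used in this thread), you can ask the server which models it has picked up from the models directory:

curl http://localhost:8080/v1/models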

Thanks for your feedback; I will close this issue. If the issue still occurs, please reopen it.