mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

Nothing or 404 on port 8080,.. after seemingly working start #1938

Open ai-bits opened 6 months ago

ai-bits commented 6 months ago

LocalAI version: 2.11.0

Environment, CPU architecture, OS, and Version: Windows 11 latest, Xeon(R) w5-3435X, 256GB, 2x 20GB RTX 4000 NVIDIA-SMI 550.65 Driver Version: 551.86 CUDA Version: 12.4, Docker Desktop via WSL latest

Describe the bug
docker run -p 8088:8080 --gpus all --name local-ai -ti localai/localai:latest-aio-gpu-nvidia-cuda-12
netstat -a -n -o | findstr "8080" (or 8088) shows the port unused before the start; afterwards 8080 is held by Docker & WSL. Looks a lot like https://github.com/mudler/LocalAI/issues/720. After startup the log says http://127.0.0.1:8080, but 8080, 8088 and 0.0.0.0:... all return a 404.

To Reproduce: see the description above.

Expected behavior What would one expect?

Logs: Will try --debug or DEBUG=true tomorrow; tried the binary on Ubuntu 22.04 today but didn't get very far.

Additional context

fakezeta commented 6 months ago

can you try:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
          "model": "gpt-4",
          "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]
      }'

or else curl http://127.0.0.1:8080/models to check if the AIO container set itself up correctly.

ai-bits commented 6 months ago

Thanks for the light-speed response on Easter Sunday. Was in bed seconds after bug rep. (Vienna time ;-)

Just tried the bin with your curl on Ubuntu 22.04 latest and I think I saw some piper bug report yesterday!? c u later

gy@gnu:~/local-ai$ ./local-ai-cuda12-Linux-x86_64
9:42AM INF Starting LocalAI using 4 threads, with models path: /home/gy/local-ai/models
9:42AM INF LocalAI version: v2.11.0 (1395e505cd8f1cc90ce575602c7eb21706da6067)
9:42AM INF Preloading models from /home/gy/local-ai/models
9:42AM INF core/startup process completed!

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.50.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ........... 117  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID .............. 9416 │ 
 └───────────────────────────────────────────────────┘ 

9:44AM INF Trying to load the model 'gpt-4' with all the available backends: llama-cpp, llama-ggml, gpt4all, bert-embeddings, rwkv, whisper, stablediffusion, tinydream, piper
9:44AM INF [llama-cpp] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend llama-cpp
9:44AM INF [llama-cpp] Fails: could not load model: rpc error: code = Canceled desc = 
9:44AM INF [llama-ggml] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend llama-ggml
9:44AM INF [llama-ggml] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
9:44AM INF [gpt4all] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend gpt4all
9:44AM INF [gpt4all] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
9:44AM INF [bert-embeddings] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend bert-embeddings
9:44AM INF [bert-embeddings] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
9:44AM INF [rwkv] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend rwkv
9:44AM INF [rwkv] Fails: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
9:44AM INF [whisper] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend whisper
9:44AM INF [whisper] Fails: could not load model: rpc error: code = Unknown desc = stat /home/gy/local-ai/models/gpt-4: no such file or directory
9:44AM INF [stablediffusion] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend stablediffusion
9:44AM INF [stablediffusion] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/stablediffusion. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
9:44AM INF [tinydream] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend tinydream
9:44AM INF [tinydream] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/tinydream. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
9:44AM INF [piper] Attempting to load
9:44AM INF Loading model 'gpt-4' with backend piper
9:44AM INF [piper] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/piper. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS

fakezeta commented 6 months ago

It seems it does not find the gpt-4 model definition. Can you post the content of the /home/gy/local-ai/models directory and also the output of curl http://127.0.0.1:8080/models?

Thank you

ai-bits commented 6 months ago

Please forgive me if I mix Windows & Docker with Ubuntu and the binary; I didn't get around to starting Ubuntu and fell back to Windows:

docker run -p 8088:8080 --gpus all --name local-ai -ti localai/localai:latest-aio-gpu-nvidia-cuda-12

Mind, I'm on 8088 because I was wary 8080 was taken (but no, it wasn't).

(base) c:\dl>curl http://127.0.0.1:8080/models
curl: (7) Failed to connect to 127.0.0.1 port 8080 after 2025 ms: Couldn't connect to server

(base) c:\dl>curl http://127.0.0.1:8088/models
{"object":"list","data":[{"id":"text-embedding-ada-002","object":"model"},{"id":"gpt-4","object":"model"},{"id":"whisper-1","object":"model"},{"id":"stablediffusion","object":"model"},{"id":"gpt-4-vision-preview","object":"model"},{"id":"tts-1","object":"model"},{"id":"MODEL_CARD","object":"model"},{"id":"llava-v1.6-7b-mmproj-f16.gguf","object":"model"},{"id":"voice-en-us-amy-low.tar.gz","object":"model"}]}

The models seem to be stored in the container, because every time I had to throw it away to remap ports, it re-downloaded the models.
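A bind mount would avoid the re-download when the container is recreated. A sketch, assuming the in-container models path /build/models used in the LocalAI documentation (verify it for the tag you run):

```shell
# Sketch: persist models on the host so remapping ports / recreating the
# container does not trigger a re-download. /build/models is the models path
# documented for the LocalAI images; confirm it for your image version.
docker run -p 8088:8080 --gpus all \
  -v "$PWD/models:/build/models" \
  --name local-ai -ti localai/localai:latest-aio-gpu-nvidia-cuda-12
```

With this, `docker rm local-ai` and a re-run with different ports should reuse the already-downloaded files in ./models.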

Your curl with localhost:

(base) c:\dl>curl http://localhost:8088/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt-4", "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]}'
{"error":{"code":500,"message":"failed reading parameters from request:failed parsing request body: invalid character '\'' looking for beginning of value","type":""}}
curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (3) URL rejected: Bad hostname
curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (3) bad range specification in URL position 2: [{role:
 ^

Your curl with 127.0.0.1:

(base) c:\dl>curl http://127.0.0.1:8088/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt-4", "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]}'
{"error":{"code":500,"message":"failed reading parameters from request:failed parsing request body: invalid character '\'' looking for beginning of value","type":""}}
curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (3) URL rejected: Bad hostname
curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (3) bad range specification in URL position 2: [{role:
 ^

http://127.0.0.1:8080 can't be reached (naturally); http://127.0.0.1:8088 returns
{"error":{"code":404,"message":"Cannot GET /","type":""}}

Chrome debug says: Failed to load resource: the server responded with a status of 404 (Not Found) for :8088/favicon.ico:1

Ubuntu bin side next, mess (abbrev. / pun) <helpless foolish grin>

ai-bits commented 6 months ago

Ubuntu 22.04 bin

it seems it does not find gpt-4 model definition. can you post the content of /home/gy/local-ai/models directory and also the output from curl http://127.0.0.1:8080/models?

As expected, models is empty here.

gy@gnu:~/local-ai$ curl http://127.0.0.1:8080/models
{"object":"list","data":[]}gy@gnu:~/local-ai$

But interestingly the app reacts.

So, this is the client side out:

gy@gnu:~/local-ai$ curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt-4", "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]}'

{"error":{"code":500,"message":"could not load model - all backends returned error: 9 errors occurred:\n\t* could not load model: rpc error: code = Canceled desc = \n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n\t* could not load model: rpc error: code = Unknown desc = stat /home/gy/local-ai/models/gpt-4: no such file or directory\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/stablediffusion. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/tinydream. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/piper. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\n","type":""}}

Server side:

 gy@gnu:~/local-ai$ ./local-ai-cuda12-Linux-x86_64 
12:58PM INF Starting LocalAI using 4 threads, with models path: /home/gy/local-ai/models
12:58PM INF LocalAI version: v2.11.0 (1395e505cd8f1cc90ce575602c7eb21706da6067)
12:58PM INF Preloading models from /home/gy/local-ai/models
12:58PM INF core/startup process completed!

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.50.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ........... 117  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID .............. 4348 │ 
 └───────────────────────────────────────────────────┘ 

1:08PM INF Trying to load the model 'gpt-4' with all the available backends: llama-cpp, llama-ggml, gpt4all, bert-embeddings, rwkv, whisper, stablediffusion, tinydream, piper
1:08PM INF [llama-cpp] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend llama-cpp
1:08PM INF [llama-cpp] Fails: could not load model: rpc error: code = Canceled desc = 
1:08PM INF [llama-ggml] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend llama-ggml
1:08PM INF [llama-ggml] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
1:08PM INF [gpt4all] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend gpt4all
1:08PM INF [gpt4all] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
1:08PM INF [bert-embeddings] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend bert-embeddings
1:08PM INF [bert-embeddings] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
1:08PM INF [rwkv] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend rwkv
1:08PM INF [rwkv] Fails: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
1:08PM INF [whisper] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend whisper
1:08PM INF [whisper] Fails: could not load model: rpc error: code = Unknown desc = stat /home/gy/local-ai/models/gpt-4: no such file or directory
1:08PM INF [stablediffusion] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend stablediffusion
1:08PM INF [stablediffusion] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/stablediffusion. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
1:08PM INF [tinydream] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend tinydream
1:08PM INF [tinydream] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/tinydream. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
1:08PM INF [piper] Attempting to load
1:08PM INF Loading model 'gpt-4' with backend piper
1:08PM INF [piper] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/piper. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS

Compilation error? Thx G.

fakezeta commented 6 months ago

Can you try to put this file in the models directory and restart?

https://github.com/mudler/LocalAI/blob/3c778b538aee121543ddaeb334cbb7f0e4790d98/aio/gpu-8g/text-to-text.yaml
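A sketch of that suggestion: fetch the linked definition into the models directory and restart. The raw URL below is the blob link rewritten for raw.githubusercontent.com; the target filename is an arbitrary choice.

```shell
# Drop the linked gpt-4 text-to-text definition into the models directory,
# then restart the LocalAI binary so it is preloaded.
mkdir -p ~/local-ai/models
curl -L -o ~/local-ai/models/text-to-text.yaml \
  "https://raw.githubusercontent.com/mudler/LocalAI/3c778b538aee121543ddaeb334cbb7f0e4790d98/aio/gpu-8g/text-to-text.yaml"
```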

ai-bits commented 6 months ago

BINGO!

gy@gnu:~/dl$ curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt-4", "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}] }'

{"created":1712000959,"object":"chat.completion","id":"6d6f0d5e-29ae-4d21-b822-2ee2f4458687","model":"gpt-4","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"I'm doing well, thank you for asking! How about you?"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}gy@gnu:~/dl$

So I'll put the rest of the yaml files in as well.

Thanks a ton!

So, I got this running on Ubuntu. 1 h left till bedtime <g> Does it have a shot on Windows + Docker? (my default env)

EDIT: localhost:8080 & 127.. still a no-go. Same as on Windows+Docker.

Also: I got a MacBook Pro M3 Max 36GB. Should I try to compile an Apple Silicon version and upload?

Thx again G.

ai-bits commented 6 months ago

You seem to have overlooked that only a few of the curl sample commands work, and that http://127.0.0.1:8080/ throws

{"error":{"code":404,"message":"Cannot GET /","type":""}}

on any platform (Windows Docker AiO, the AVX-512 binary on Ubuntu, and Ubuntu WSL), in Chrome and other browsers. The Docker logs don't show any anomaly! See attachment.

Interesting twist:

curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{ "input": "Your text string goes here", "model": "text-embedding-ada-002" }'

works with LocalAI Windows Docker as backend, from the WSL Ubuntu console!

As mentioned above Windows cmd throws:

{"error":{"code":500,"message":"failed reading parameters from request:failed parsing request body: invalid character '\\'' looking for beginning of value","type":""}}curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (3) URL rejected: Malformed input to a URL function
curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (6) Could not resolve host: text-embedding-ada-002
curl: (3) unmatched close brace/bracket in URL position 1:
}'
 ^

Regards G. LocalAIDockerLogs.txt

fakezeta commented 6 months ago

Hi @ai-bits sorry for the late reply,

From the quoted text it seems a curl error: are you sure that there are no spurious characters in the request? Just to be sure: is the curl request being launched from Windows cmd, PowerShell, or a Linux prompt?

ai-bits commented 5 months ago

sorry for the late reply

No need to be sorry. I appreciate support from open source in any case, albeit I had stopped looking, so it took some time from my side now.

from the quoted text it seems a curl error: are you sure that there are no spurious character in the request?

Just tried again with the localai image from two weeks ago. I'm pretty sure there are no spurious chars, especially no line-continuation chars. I keep the cleaned-up string in a localai.md in VS Code in a GitHub repo file, because the sample in the terminal output had added line breaks and indentation.

curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{ "input": "Your text string goes here", "model": "text-embedding-ada-002" }'

The clip went from there into Windows Terminal cmd, into WSL Ubuntu 22.04 bash, and just now into PS. It only worked in Ubuntu bash.

THE cmd SOLUTION IS IN THE NEXT MESSAGE

PS threw

Invoke-WebRequest : Cannot bind parameter "Headers". Cannot convert the "Content-Type: application/json" value of type "System.String" to type "System.Collections.IDictionary". At line:1 char:50

Any hint there?

I'll post this now and maybe try the latest image later. Just saw it got pushed to DockerHub hours ago.

Also, I had hoped to switch on debug for the Linux binary I installed on non-WSL / to-the-metal Ubuntu 22.04 to find out what's wrong with 127.0.0.1:8080, but didn't get around to it.

No worries re timing. Got enuf 'AI' to 'play'. Cheers G.

ai-bits commented 5 months ago

Hey man, Copilot just solved it! <rolling on the floor laughing and banging my head>

Query:

curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d "{ \"input\": \"Your text string goes here\", \"model\": \"text-embedding-ada-002\" }"

in cmd throws

{"error":{"code":500,"message":"failed reading parameters from request:failed parsing request body: invalid character '\'' looking for beginning of value","type":""}}
curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (3) URL rejected: Malformed input to a URL function
curl: (3) URL rejected: Port number was not a decimal number between 0 and 65535
curl: (6) Could not resolve host: text-embedding-ada-002
curl: (3) unmatched close brace/bracket in URL position 1:
}'
 ^

Answer:

The error message suggests that the command is not being parsed correctly. This is likely due to the use of single quotes (') in the JSON data part of the command. The Windows command prompt does not interpret single quotes the same way as Unix-like systems do.

You should use double quotes (") instead, and escape the inner double quotes with a backslash (\"). Here's how you can modify the command:

curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d "{ \"input\": \"Your text string goes here\", \"model\": \"text-embedding-ada-002\" }"

This command should work correctly in the Windows command prompt.
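Copilot's point can be sanity-checked in bash: the single-quoted payload (bash-only) and the escaped double-quoted payload (which also works in Windows cmd) are byte-for-byte identical, so the server receives the same JSON either way. A minimal check:

```shell
# The single-quoted form and the escaped double-quoted form expand to
# identical bytes, so either works wherever the shell accepts its quoting.
single='{ "model": "text-embedding-ada-002" }'
double="{ \"model\": \"text-embedding-ada-002\" }"
if [ "$single" = "$double" ]; then
  echo "payloads identical"
fi
```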

ai-bits commented 5 months ago

And here is the Copilot-adapted PS variant:

If you're trying to run the curl command in PowerShell, you should use the following syntax:

curl http://localhost:8080/embeddings -Method POST -ContentType "application/json" -Body '{ "input": "Your text string goes here", "model": "text-embedding-ada-002" }'

In PowerShell, the -X option for curl is replaced with -Method, -H is replaced with -ContentType, and -d is replaced with -Body.

Made my day.

fakezeta commented 5 months ago

Happy to hear that you finally fixed it. I thought it would be a shell quoting issue, but I'm really inexperienced on Windows.

Shall the issue be closed?

ai-bits commented 5 months ago

As you're outputting the usage / curl examples during model download, I suggest you replace the current Linux-only variant with the escaped one, which works in both bash and Windows cmd, i.e.

curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d "{ \"input\": \"Your text string goes here\", \"model\": \"text-embedding-ada-002\" }"

For the other curl example transforms / escaped ones you could employ Copilot. ;-)

Dunno if you care to quote the PS variants somewhere as well.

And then I asked for a solution for the 404 at 127.0.0.1:8080. It's a no-go on all platforms.

Also, I had hoped to switch on debug for the Linux binary I installed on non-WSL / to-the-metal Ubuntu 22.04 to find out what's wrong with 127.0.0.1:8080, but didn't get around to it.

Thanks G.

fakezeta commented 5 months ago

Good catch. Why don't you open a PR? :)

localai-bot commented 5 months ago

I'll pass along your suggestions to the AI Bits team. They will be responsible for making these changes in their code, as they are the maintainers of the project. You could also open a PR yourself if you have forked the repository, or you can open an issue to bring attention to these suggestions.

For the 127.0.0.1:8080 issue, you might want to check if there is an entry in your hosts file that could be causing a conflict or if there are any network settings that could be interfering with the localhost connection. If the problem persists, I suggest reaching out to the AI Bits team or the community for further assistance.
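The bot's hosts-file hint can be checked in one line on Linux/WSL (on Windows the equivalent file is C:\Windows\System32\drivers\etc\hosts):

```shell
# Show the active (non-comment) localhost entries; a missing or unusual
# mapping here would explain localhost vs 127.0.0.1 behaving differently.
grep -v '^#' /etc/hosts | grep -w localhost
```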

Please keep us updated on any progress or additional issues you encounter.

ai-bits commented 5 months ago

The bot is a bit off, but anyway... I'll have a PR for the curl problem ASAP.

curl embeddings on Linux worked and I'll have a fix / PR for Windows cmd and PS. To test this on the M3 Mac I just retried

docker run -p 8080:8080 --name local-ai -ti localai/localai:latest-aio-cpu

(amd64 emulation) and 'magically' http://127.0.0.1:8080/ worked there (browser GUI) w/ latest v2.12.4.

EDIT: Verified on Mac, but I guess it will need escaping in Windows cmd and different arg names in PS. Will try to include all examples in my PR.

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ "model":"tts-1", "input": "Hi, this is a test." }' --output ~/dl/local-ai.wav

EDIT: On Windows I just trashed the 40+GB LocalAI Docker image and the container, including the models and the kitchen sink, to find out that the browser GUI DOES WORK now on Windows, but it does nothing more than list the models and point to the docs.

Any chance of decoupling the model downloads from the container, so they survive a version update?

Next up I'll retry http://127.0.0.1:8080/ using the binary on Ubuntu 22.04.

Cheers G.

ai-bits commented 5 months ago

I forked the repo for editing and a future pull request, but then I thought I'd better ask what to do / edit, as there are 7 occurrences of curl for embeddings alone, and then there is at least one more in localai.io/features/embeddings/, and GitHub says the curl was fetched from the web page.

The problem with the current examples: they are bash-only, and that fact is hidden in the markdown. For Windows cmd one would need to swap the single quotes for double quotes and the inner ones for escaped ones. If you then add a variant with line-continuation carets for Windows cmd, it gets bulky and confusing.

How about putting a single-line (auto-wrapped) curl with (escaped) double quotes into the .yaml files to show in the logs? It would cover not only bash but also cmd. Leave PS behind. Cheers G.
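A tiny bash sketch of the transformation such a log line needs: escape every double quote in the JSON payload and emit the whole command on one line (the helper below is purely illustrative, not LocalAI code):

```shell
# Hypothetical helper: turn a JSON payload into the escaped, single-line
# curl form that works in both bash and Windows cmd.
payload='{ "input": "Your text string goes here", "model": "text-embedding-ada-002" }'
escaped=$(printf '%s' "$payload" | sed 's/"/\\"/g')
printf 'curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d "%s"\n' "$escaped"
```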

(screenshot attached)