Quickstart -> Running LocalAI with All-in-One-> Couldn't get working.

LocalAI version:

localai/localai:latest-aio-cpu (as on April 7th,2024)

Environment, CPU architecture, OS, and Version:

Linux 5.15.0-101-generic #111~20.04.1-Ubuntu SMP Mon Mar 11 15:44:43 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux "Ubuntu 20.04.6 LTS (Focal Fossa)"

Describe the bug

Well, I followed the quick start and ran:

docker run -p 8080:8080 --name local-ai -ti localai/localai:latest-aio-cpu

and things seemed good... downloaded and started running, I only saw this one error...

5:32AM ERR error downloading models: failed to download file "/build/models/c231ac8305a82f0293c2ba25e1549620": Get "https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin": dial tcp: lookup huggingface.co: i/o timeout

http://127.0.0.1:8080/models returns {"object":"list","data":[{"id":"gpt-4","object":"model"},{"id":"text-embedding-ada-002","object":"model"},{"id":"whisper-1","object":"model"},{"id":"stablediffusion","object":"model"},{"id":"gpt-4-vision-preview","object":"model"},{"id":"tts-1","object":"model"}]}

So I tried this as I'd seen in another issue:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
          "model": "gpt-4",
          "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]
      }'

and just lots of fails in return:

{"error":{"code":500,"message":"could not load model - all backends returned error: 23 errors occurred:\n\t* could not load model: rpc error: code = Canceled desc = \n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n\t* could not load model: rpc error: code = Unknown desc = stat /build/models/huggingface:/l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf: no such file or directory\n\t* could not load model: rpc error: code = Unknown desc = stat /build/models/huggingface:/l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf: no such file or directory\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/tinydream. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* could not load model: rpc error: code = Unknown desc = unsupported model type /build/models/huggingface:/l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf (should end with .onnx)\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\t* grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n\n","type":""}}

I look inside the container...

root@43f55d3c5491:/build/models# ls -la
total 24
drwxr-xr-x 1 root root  444 Apr  7 04:24 .
drwxr-xr-x 1 root root   12 Mar 26 17:54 ..
-rwxr-xr-x 1 root root  431 Apr  7 05:29 01ef9266c178bdcf654bb8730bf1d55f.yaml
-rwxr-xr-x 1 root root  679 Apr  7 05:31 112ee7db86acb6c32ff0ad9e1d651094.yaml
-rwxr-xr-x 1 root root 3743 Apr  7 05:30 1bea8ef81b66dd63fcf3b68e5b959925.yaml
-rwxr-xr-x 1 root root 1230 Apr  7 05:32 23edd98fb1aee8a2cc1d8b07f2a8adec.yaml
-rwxr-xr-x 1 root root  457 Apr  7 05:30 4e8bce301ebed58b98689146c493d8c4.yaml
-rwxr-xr-x 1 root root  739 Apr  7 05:31 f3cb558f560f3efdeadc6b36d9064998.yaml

But I don't really know what I am looking for, and a bit lost at this point. I saw the "If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true" message, but where is that supposed to go?

To Reproduce

Expected behavior

Logs

hip@dragon:~$ docker start -i local-ai 
===> LocalAI All-in-One (AIO) container starting...
Intel GPU detected
Intel GPU detected, but Intel GPU drivers are not installed. GPU acceleration will not be available.
GPU acceleration is not enabled or supported. Defaulting to CPU.
Starting LocalAI with the following models: /aio/cpu/embeddings.yaml,/aio/cpu/text-to-speech.yaml,/aio/cpu/image-gen.yaml,/aio/cpu/text-to-text.yaml,/aio/cpu/speech-to-text.yaml,/aio/cpu/vision.yaml
@@@@@
Skipping rebuild
@@@@@
If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true
If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed:
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF"
see the documentation at: https://localai.io/basics/build/index.html
Note: See also https://github.com/go-skynet/LocalAI/issues/288
@@@@@
CPU info:
model name  : Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp pku ospke md_clear flush_l1d arch_capabilities
CPU:    AVX    found OK
CPU:    AVX2   found OK
CPU: no AVX512 found
@@@@@
5:29AM INF Starting LocalAI using 4 threads, with models path: /build/models
5:29AM INF LocalAI version: v2.11.0 (1395e505cd8f1cc90ce575602c7eb21706da6067)
5:32AM INF Preloading models from /build/models
5:32AM INF Downloading "https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin"
5:32AM ERR error downloading models: failed to download file "/build/models/c231ac8305a82f0293c2ba25e1549620": Get "https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin": dial tcp: lookup huggingface.co: i/o timeout
5:32AM INF core/startup process completed!

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.50.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ........... 117  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID ................ 25 │ 
 └───────────────────────────────────────────────────┘ 

5:38AM INF Trying to load the model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with all the available backends: llama-cpp, llama-ggml, gpt4all, bert-embeddings, rwkv, whisper, stablediffusion, tinydream, piper, /build/backend/python/autogptq/run.sh, /build/backend/python/vall-e-x/run.sh, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/exllama/run.sh, /build/backend/python/transformers-musicgen/run.sh, /build/backend/python/petals/run.sh, /build/backend/python/vllm/run.sh, /build/backend/python/transformers/run.sh, /build/backend/python/bark/run.sh, /build/backend/python/diffusers/run.sh, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/mamba/run.sh, /build/backend/python/exllama2/run.sh, /build/backend/python/coqui/run.sh
5:38AM INF [llama-cpp] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend llama-cpp
5:38AM INF [llama-cpp] Fails: could not load model: rpc error: code = Canceled desc = 
5:38AM INF [llama-ggml] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend llama-ggml
5:38AM INF [llama-ggml] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
5:38AM INF [gpt4all] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend gpt4all
5:38AM INF [gpt4all] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
5:38AM INF [bert-embeddings] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend bert-embeddings
5:38AM INF [bert-embeddings] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
5:38AM INF [rwkv] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend rwkv
5:38AM INF [rwkv] Fails: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
5:38AM INF [whisper] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend whisper
5:38AM INF [whisper] Fails: could not load model: rpc error: code = Unknown desc = stat /build/models/huggingface:/l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf: no such file or directory
5:38AM INF [stablediffusion] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend stablediffusion
5:38AM INF [stablediffusion] Fails: could not load model: rpc error: code = Unknown desc = stat /build/models/huggingface:/l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf: no such file or directory
5:38AM INF [tinydream] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend tinydream
5:38AM INF [tinydream] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/tinydream. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [piper] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend piper
5:38AM INF [piper] Fails: could not load model: rpc error: code = Unknown desc = unsupported model type /build/models/huggingface:/l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf (should end with .onnx)
5:38AM INF [/build/backend/python/autogptq/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/autogptq/run.sh
5:38AM INF [/build/backend/python/autogptq/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/vall-e-x/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/vall-e-x/run.sh
5:38AM INF [/build/backend/python/vall-e-x/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/sentencetransformers/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/sentencetransformers/run.sh
5:38AM INF [/build/backend/python/sentencetransformers/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/exllama/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/exllama/run.sh
5:38AM INF [/build/backend/python/exllama/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/transformers-musicgen/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/transformers-musicgen/run.sh
5:38AM INF [/build/backend/python/transformers-musicgen/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/petals/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/petals/run.sh
5:38AM INF [/build/backend/python/petals/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/vllm/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/vllm/run.sh
5:38AM INF [/build/backend/python/vllm/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/transformers/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/transformers/run.sh
5:38AM INF [/build/backend/python/transformers/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/bark/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/bark/run.sh
5:38AM INF [/build/backend/python/bark/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/diffusers/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/diffusers/run.sh
5:38AM INF [/build/backend/python/diffusers/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/sentencetransformers/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/sentencetransformers/run.sh
5:38AM INF [/build/backend/python/sentencetransformers/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/mamba/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/mamba/run.sh
5:38AM INF [/build/backend/python/mamba/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/exllama2/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/exllama2/run.sh
5:38AM INF [/build/backend/python/exllama2/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
5:38AM INF [/build/backend/python/coqui/run.sh] Attempting to load
5:38AM INF Loading model 'huggingface://l3utterfly/phi-2-layla-v1-chatml-gguf/phi-2-layla-v1-chatml-Q8_0.gguf' with backend /build/backend/python/coqui/run.sh
5:38AM INF [/build/backend/python/coqui/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS

Additional context

mudler / LocalAI

Quickstart -> Running LocalAI with All-in-One-> Couldn't get working. #1965