LocalAI version:
Docker, latest image: quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-12
Environment, CPU architecture, OS, and Version:
CPU: Intel i9-11900, RAM: 64 GB DDR4, GPU: NVIDIA RTX 3090
Describe the bug
When I start a text generation request against this Docker image, the model fails to load. The log shows: [llama-cpp] Fails: backend not found: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp
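For reference, this is the kind of request that triggers the failure (a minimal sketch; LocalAI exposes the OpenAI-compatible /v1/chat/completions endpoint, port 8080 is the default, and the prompt is illustrative):

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-3.2-3b-instruct-q8_0.gguf",
        "messages": [{"role": "user", "content": "Hello"}]
      }'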
To Reproduce
Pull the image and run the container, then send a text generation request (see the commands sketched below).
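A minimal sketch of the reproduction steps, following the LocalAI docs (the host port mapping and container name are illustrative):

# pull the AIO CUDA 12 image named in "LocalAI version" above
docker pull quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-12
# run it with GPU access and the default port exposed
docker run -p 8080:8080 --gpus all --name local-ai quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-12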
Expected behavior
The llama-cpp backend should be found at /tmp/localai/backend_data/backend-assets/grpc/llama-cpp and the model should load successfully.
Logs
api-1 | 11:52PM INF Trying to load the model 'llama-3.2-3b-instruct-q8_0.gguf' with the backend '[llama-cpp llama-ggml llama-cpp-fallback piper rwkv stablediffusion whisper huggingface bert-embeddings /build/backend/python/exllama2/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/vllm/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/mamba/run.sh /build/backend/python/coqui/run.sh /build/backend/python/transformers/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/bark/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/transformers-musicgen/run.sh]'
api-1 | 11:52PM INF [llama-cpp] Attempting to load
api-1 | 11:52PM INF Loading model 'llama-3.2-3b-instruct-q8_0.gguf' with backend llama-cpp
api-1 | 11:52PM INF [llama-cpp] Fails: backend not found: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp
api-1 | 11:52PM INF [llama-cpp] Autodetection failed, trying the fallback
api-1 | 11:52PM INF Loading model 'llama-3.2-3b-instruct-q8_0.gguf' with backend
api-1 | 11:52PM INF [llama-cpp] Fails: fork/exec grpc: permission denied
Additional context
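The last log line ('fork/exec grpc: permission denied') makes me suspect the backend binaries extracted under /tmp cannot be executed, for example because /tmp is mounted noexec. A hedged diagnostic sketch (the container name local-ai is illustrative):

# check whether /tmp is mounted with the noexec option inside the container
docker exec local-ai grep /tmp /proc/mounts
# check whether the extracted backend binaries exist and have execute permission
docker exec local-ai ls -l /tmp/localai/backend_data/backend-assets/grpc/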