mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

[llama-cpp] Fails: backend not found: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp #3727

Closed: tosintech-web closed this issue 2 weeks ago

tosintech-web commented 3 weeks ago

LocalAI version:

Docker, latest image: quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-12

Environment, CPU architecture, OS, and Version:

CPU: Intel i9-11900, 64 GB DDR4, GPU: NVIDIA RTX 3090

Describe the bug

When I start a text-generation request with this Docker image, it cannot load the model. The log shows: [llama-cpp] Fails: backend not found: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp

To Reproduce

Pull the image with docker pull, start it with docker run, then request a text generation.
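The steps above can be sketched as commands (the port mapping, --gpus flag, and the curl request are assumptions based on the image being the NVIDIA CUDA 12 AIO variant and the model name appearing in the logs below; they are not given in the report):

```shell
# Pull and run the image named in the report (GPU flags assumed)
docker pull quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-12
docker run -p 8080:8080 --gpus all \
  quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-12

# Trigger text generation via the OpenAI-compatible endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.2-3b-instruct-q8_0.gguf",
       "messages": [{"role": "user", "content": "hello"}]}'
```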

Expected behavior

The llama-cpp backend should be found and the model should load.

Logs

api-1 | 11:52PM INF Trying to load the model 'llama-3.2-3b-instruct-q8_0.gguf' with the backend '[llama-cpp llama-ggml llama-cpp-fallback piper rwkv stablediffusion whisper huggingface bert-embeddings /build/backend/python/exllama2/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/vllm/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/mamba/run.sh /build/backend/python/coqui/run.sh /build/backend/python/transformers/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/bark/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/transformers-musicgen/run.sh]'
api-1 | 11:52PM INF [llama-cpp] Attempting to load
api-1 | 11:52PM INF Loading model 'llama-3.2-3b-instruct-q8_0.gguf' with backend llama-cpp
api-1 | 11:52PM INF [llama-cpp] Fails: backend not found: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp
api-1 | 11:52PM INF [llama-cpp] Autodetection failed, trying the fallback
api-1 | 11:52PM INF Loading model 'llama-3.2-3b-instruct-q8_0.gguf' with backend
api-1 | 11:52PM INF [llama-cpp] Fails: fork/exec grpc: permission denied
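One observation on the last log line: "fork/exec grpc: permission denied" suggests the backend binary extracted under /tmp/localai/backend_data exists but cannot be executed. A common cause is /tmp being mounted noexec inside the container. A hedged diagnostic (the container name is a placeholder; the noexec theory is an assumption, not confirmed in this thread):

```shell
# If /tmp is mounted noexec, binaries extracted to
# /tmp/localai/backend_data cannot be executed, which would
# produce exactly this "permission denied" on fork/exec.
docker exec <container-name> sh -c 'mount | grep " /tmp "'
```

If noexec shows up in the output, the backend-asset extraction directory would need to be relocated to an executable filesystem (consult the LocalAI documentation for the relevant option).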

Additional context

tosintech-web commented 3 weeks ago

Why is it looking under /tmp? Shouldn't it be /build?

CiraciNicolo commented 3 weeks ago

The llama-cpp backend has been renamed, or it cannot be found. Use llama-cpp-grpc.
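If the backend name change is the cause, it could in principle be pinned in the model's YAML definition. A minimal sketch, assuming the usual LocalAI model-config layout (the name and model values here are taken from the logs in this issue; verify the exact backend identifier against the LocalAI docs for your version):

```yaml
# model definition sketch, e.g. models/llama-3.2-3b-instruct.yaml
name: llama-3.2-3b-instruct
backend: llama-cpp-grpc   # suggested replacement for the renamed llama-cpp backend
parameters:
  model: llama-3.2-3b-instruct-q8_0.gguf
```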

mudler commented 2 weeks ago

This should be fixed by https://github.com/mudler/LocalAI/pull/3789.