mudler / LocalAI

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also has voice cloning capabilities.
MIT License

localai:v2.10.1-cublas-cuda12-core "Fails: grpc process not found" #1871

Open testKKP opened 3 months ago

testKKP commented 3 months ago

LocalAI version:

localai:v2.10.1-cublas-cuda12-core

Environment, CPU architecture, OS, and Version:

Windows 11, RTX 4090, CUDA 12.3

    2024-03-21 19:23:56 11:23AM INF [/build/backend/python/sentencetransformers/] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/ some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
    2024-03-21 19:23:56 11:23AM INF [/build/backend/python/coqui/] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/ some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS

Describe the bug

I ran the command "docker run -ti -p 8080:8080 --gpus all localai/localai:v2.10.1-cublas-cuda12-core mixtral-instruct" and sent a test prompt to port 8080. It then returned the error above to me.

To Reproduce

The command I sent was:

    curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d "{ \"model\": \"mixtral-instruct\", \"prompt\": \"How are you doing?\" }"

Expected behavior


Additional context

fakezeta commented 3 months ago

Hi @testKKP, can you post your model definition? The core images do not include Python, so SentenceTransformers and Coqui are not available. For your reference, backends that require Python are:

ThaDrone commented 3 months ago

Same issue here, using the new all in one image

 docker run -p 8080:8080 --gpus all --name local-ai -ti


{ "model": "gpt-4", "messages": [{"role": "user",     
  "content": "How are you doing?", "temperature": 0.1}]}


    "error": {
        "code": 500,
        "message": "rpc error: code = Unknown desc = unimplemented",
        "type": ""

Running Fedora Linux + Nvidia RTX3060
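For comparison, in the OpenAI-style chat completions schema, temperature is a top-level field of the request, while the request quoted above nests it inside the message object. Whether that has anything to do with the "unimplemented" error is not established in this thread. The following is only a minimal sketch of the conventional request shape; the /v1/chat/completions URL and the helper names build_chat_request and send_chat_request are assumptions for illustration and do not appear in the original report.

```python
import json
import urllib.request

# Assumed endpoint: the OpenAI-compatible chat route on the port published
# by the docker command above. The URL is not stated in this thread.
CHAT_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_request(model, prompt, temperature=0.1):
    """Return an OpenAI-style chat payload with temperature at the top level."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def send_chat_request(payload, url=CHAT_URL, timeout=60):
    """POST the payload as JSON and return the decoded response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_chat_request("gpt-4", "How are you doing?")
print(json.dumps(payload, indent=2))
# Uncomment when a LocalAI container is listening on port 8080:
# print(send_chat_request(payload))
```

Building the payload separately from sending it makes it easy to inspect the exact JSON before it reaches the server.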

fakezeta commented 3 months ago

Hi @ThaDrone this is a different issue since gpt-4 from AIO images uses llama.cpp backend while it seems that @testKKP is using a python (transformer?) backend. Can you open a new issue with the output from docker logs local-ai?
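When reading the docker logs output, it can help to list which backends reported "grpc process not found". Below is a rough sketch that assumes the log lines look like the two quoted in the original report, with the backend path in square brackets; the function name missing_backends and the path-splitting rule are illustrative assumptions, not part of LocalAI.

```python
import re

# Matches failure lines shaped like the ones quoted in the original report:
# "... INF [/build/backend/python/coqui/] Fails: grpc process not found: ..."
FAIL_PATTERN = re.compile(r"\[(?P<path>[^\]]+)\]\s+Fails: grpc process not found")


def missing_backends(log_text):
    """Return (kind, name) pairs for backends whose gRPC process was not found."""
    found = []
    for match in FAIL_PATTERN.finditer(log_text):
        parts = [p for p in match.group("path").split("/") if p]
        # Assumption: the last two path components are "<kind>/<name>",
        # as in /build/backend/python/coqui/.
        if len(parts) >= 2:
            found.append((parts[-2], parts[-1]))
    return found


sample = (
    "11:23AM INF [/build/backend/python/sentencetransformers/] "
    "Fails: grpc process not found: "
    "/tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/\n"
    "11:23AM INF [/build/backend/python/coqui/] "
    "Fails: grpc process not found: "
    "/tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/\n"
)
print(missing_backends(sample))
```

If every entry has the kind "python", the failures are consistent with an image that ships without Python, as described for the core images earlier in the thread.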

testKKP commented 3 months ago

Thanks for your reply. I don't know my model definition. I just ran the command provided on the official website: "docker run -ti -p 8080:8080 --gpus all localai/localai:v2.10.1-cublas-cuda12-core mixtral-instruct". Everything else was done by Docker, so I don't know how to solve it.

fakezeta commented 3 months ago

There is a Mixtral configuration in the example directory. Download the files into your models directory along with the GGUF file. The example uses mixtral-8x7b-instruct-v0.1.Q2_K.gguf; you can choose whichever file you prefer, just remember to update the config accordingly.

localai-bot commented 2 months ago

As a bot (experiment of @mudler), I would like to provide some guidelines to resolve the issue:

  1. Make sure Docker is properly installed and running on your system.
  2. Download the mixtral-instruct configuration files from the relevant GGUF link. You can choose any available GGUF file for mixtral-instruct.
  3. Save the downloaded files in the models directory of your LocalAI installation.
  4. Stop the currently running Docker container if it's still running. You can use the command: docker stop local-ai
  5. Run the following command, replacing [gguf_file] with the name of the file you downloaded in step 2:
    docker run -p 8080:8080 --gpus all localai/localai:v2.10.1-cublas-cuda12-core mixtral-instruct-[gguf_file]

Please replace [gguf_file] in the command with the actual filename you downloaded in step 2. If you followed the instructions in the example, the command should look like this:

docker run -p 8080:8080 --gpus all localai/localai:v2.10.1-cublas-cuda12-core mixtral-8x7b-instruct-v0.1.Q2_K.gguf

If you need further assistance, please provide the output of the command and any additional logs while running LocalAI in debug mode (--debug or DEBUG=true).

qingfenghcy commented 2 months ago

Hello, I also had a similar problem, did you solve it?