xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

v0.13.3 Docker image fails to start #1949

Closed. eric1932 closed this issue 1 month ago.

eric1932 commented 1 month ago

System Info

docker run --gpus all xprobe/xinference xinference-local

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 75, in _load_shared_library
    return ctypes.CDLL(str(_lib_path), **cdll_args)  # type: ignore
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/xinference-local", line 5, in <module>
    from xinference.deploy.cmdline import local
  File "/usr/local/lib/python3.10/dist-packages/xinference/__init__.py", line 37, in <module>
    _install()
  File "/usr/local/lib/python3.10/dist-packages/xinference/__init__.py", line 34, in _install
    install_model()
  File "/usr/local/lib/python3.10/dist-packages/xinference/model/__init__.py", line 17, in _install
    from .llm import _install as llm_install
  File "/usr/local/lib/python3.10/dist-packages/xinference/model/llm/__init__.py", line 20, in <module>
    from .core import (
  File "/usr/local/lib/python3.10/dist-packages/xinference/model/llm/core.py", line 26, in <module>
    from ...types import PeftModelConfig
  File "/usr/local/lib/python3.10/dist-packages/xinference/types.py", line 399, in <module>
    from llama_cpp import Llama
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/__init__.py", line 1, in <module>
    from .llama_cpp import *
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 88, in <module>
    _lib = _load_shared_library(_lib_base_name)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 77, in _load_shared_library
    raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
RuntimeError: Failed to load shared library '/usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory
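
The failing import is the CUDA build of llama-cpp-python, whose libllama.so needs libcuda.so.1 from the host's NVIDIA driver at load time. A quick way to check whether that library is visible inside the container (a diagnostic sketch, assuming the NVIDIA Container Toolkit is set up on the host):

docker run --rm --gpus all xprobe/xinference sh -c "ldconfig -p | grep libcuda"

If this prints nothing, the driver library is not being mounted into the container, and any CUDA-linked shared object will fail to load exactly as in the traceback above.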

Running Xinference with Docker?

Version info

v0.13.3

The command used to start Xinference

docker run --gpus all xprobe/xinference xinference-local

Reproduction

No additional steps required.

Expected behavior

The container should not exit.

eric1932 commented 1 month ago

I also tested docker run --gpus all xprobe/xinference:v0.13.2 xinference-local --log-level debug, and it runs normally.

zhanghx0905 commented 1 month ago

As a temporary workaround, you can change the startup command to sh -c "pip uninstall -y llama-cpp-python && xinference-local --host 0.0.0.0 --port 8080"
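
Combined with the original invocation, the full command would look like the sketch below (the -p port mapping is an assumption, added so the chosen port is reachable from the host):

docker run --gpus all -p 8080:8080 xprobe/xinference sh -c "pip uninstall -y llama-cpp-python && xinference-local --host 0.0.0.0 --port 8080"

The idea is that with llama-cpp-python removed, the rest of Xinference can start without ever trying to load libllama.so; only the llama.cpp backend is lost.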

ConleyKong commented 1 month ago

Following the official suggestion, changing the bundled llama-cpp-python to 0.2.28 makes the container start normally.
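
To bake that pin into the image rather than reinstalling on every start, a minimal derived Dockerfile could look like this (a sketch, assuming the v0.13.3 tag as the base; pip may build the pinned version from source if no matching wheel is available):

FROM xprobe/xinference:v0.13.3
RUN pip install --no-cache-dir llama-cpp-python==0.2.28

Build and run it in place of the original image (the tag xinference-pinned is just an illustrative name):

docker build -t xinference-pinned .
docker run --gpus all xinference-pinned xinference-local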