xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

GPU Docker image fails to start #2340

Closed. SDAIer closed this 1 month ago.

SDAIer commented 1 month ago

System Info

Driver Version: 535.183.06
CUDA Version: 12.2
Python 3.12.4

Host OS: CentOS 7 (kernel 3.10.0-1160.el7.x86_64)

Running Xinference with Docker? Yes.

Version info

latest

The command used to start Xinference

docker run -v /root/fastgpt/xinference_image:/root/xinference_image -e XINFERENCE_MODEL_SRC=modelscope -e XINFERENCE_HOME=/root/xinference_image -d -p 9998:9997 --name xinference_gpu registry.cn-hangzhou.aliyuncs.com/xprobe_xinference/xinference:latest xinference-local -H 0.0.0.0 --log-level debug
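
For reference, the command above starts the container without any GPU flag, so the host NVIDIA driver libraries (including libcuda.so.1) are not mapped into it. A minimal sketch of the same command with GPU access requested, assuming Docker 19.03+ and the NVIDIA Container Toolkit on the host (paths and image unchanged from above):

docker run --gpus all \
  -v /root/fastgpt/xinference_image:/root/xinference_image \
  -e XINFERENCE_MODEL_SRC=modelscope \
  -e XINFERENCE_HOME=/root/xinference_image \
  -d -p 9998:9997 --name xinference_gpu \
  registry.cn-hangzhou.aliyuncs.com/xprobe_xinference/xinference:latest \
  xinference-local -H 0.0.0.0 --log-level debug

On Docker versions that predate --gpus, the --runtime=nvidia flag provided by the toolkit serves the same purpose.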

Reproduction

Output of docker logs xinference_gpu:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 75, in _load_shared_library
    return ctypes.CDLL(str(_lib_path), **cdll_args)  # type: ignore
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/xinference-local", line 5, in <module>
    from xinference.deploy.cmdline import local
  File "/usr/local/lib/python3.10/dist-packages/xinference/__init__.py", line 37, in <module>
    _install()
  File "/usr/local/lib/python3.10/dist-packages/xinference/__init__.py", line 34, in _install
    install_model()
  File "/usr/local/lib/python3.10/dist-packages/xinference/model/__init__.py", line 17, in _install
    from .audio import _install as audio_install
  File "/usr/local/lib/python3.10/dist-packages/xinference/model/audio/__init__.py", line 22, in <module>
    from .core import (
  File "/usr/local/lib/python3.10/dist-packages/xinference/model/audio/core.py", line 20, in <module>
    from ..core import CacheableModelSpec, ModelDescription
  File "/usr/local/lib/python3.10/dist-packages/xinference/model/core.py", line 19, in <module>
    from ..types import PeftModelConfig
  File "/usr/local/lib/python3.10/dist-packages/xinference/types.py", line 386, in <module>
    from llama_cpp import Llama
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/__init__.py", line 1, in <module>
    from .llama_cpp import *
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 88, in <module>
    _lib = _load_shared_library(_lib_base_name)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 77, in _load_shared_library
    raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
RuntimeError: Failed to load shared library '/usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory

Expected behavior

The container should start and run.

qinxuye commented 1 month ago

Is the NVIDIA Container Toolkit installed?
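
A quick way to verify the toolkit end to end is to run a plain CUDA image with GPU access and call nvidia-smi (the image tag below is just one example matching the reported CUDA 12.2 driver):

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

If this prints the usual GPU table, the toolkit is installed and wired into Docker, and adding --gpus all to the Xinference command should suffice; if it fails, the toolkit or the Docker runtime configuration needs attention.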

SDAIer commented 1 month ago

Driver Version: 535.183.06 CUDA Version: 12.2

(base) [root@gpu ~]# rpm -qa | grep nvidia-container-toolkit
nvidia-container-toolkit-base-1.13.5-1.x86_64
nvidia-container-toolkit-1.13.5-1.x86_64

Does this version not meet the requirement?
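
For what it's worth, nvidia-container-toolkit 1.13.5 works with Docker's --gpus flag, so the libcuda.so.1 error more likely means the container was started without GPU access than that the version is too old. A minimal sketch of making sure Docker knows about the NVIDIA runtime, using the nvidia-ctk helper that ships with the toolkit (run as root):

nvidia-ctk runtime configure --runtime=docker
systemctl restart docker

After that, re-create the container with --gpus all as shown earlier.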