xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Failed to launch model, detail: [address=0.0.0.0:58184, pid=1222936] No available slot found for the model #2508

Open · chaoStart opened this issue 2 hours ago

chaoStart commented 2 hours ago

System Info

CUDA: 12.04

Running Xinference with Docker?

Version info

0.15.4

The command used to start Xinference

```bash
xinference launch --model_path "/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat" \
    --model-engine Transformers -n glm4-chat
```

Reproduction

```
(xinference) (base) bqc@sciyon-LEGION-REN9000K-34IRZ:~/project/ragflow$ xinference launch --model_path "/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat" --model-engine Transformers -n glm4-chat
Launch model name: glm4-chat with kwargs: {'model_path': '/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat'}
Traceback (most recent call last):
  File "/home/sciyonadmin/miniforge3/envs/xinference/bin/xinference", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 901, in model_launch
    model_uid = client.launch_model(
                ^^^^^^^^^^^^^^^^^^^^
  File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 940, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:58184, pid=1222936] No available slot found for the model
```
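For reference, this error generally means the worker's GPU slot is already occupied by a previously launched model. A minimal sketch of how to inspect and free the slot from the CLI, assuming the default endpoint and assuming the running model's UID turns out to be glm4-chat (substitute what `xinference list` actually reports):

```bash
# List running models to find which one is holding the GPU slot
# (the default endpoint is assumed; pass -e <endpoint> if yours differs).
xinference list

# Terminate the model occupying the slot; "glm4-chat" is an assumed UID.
xinference terminate --model-uid glm4-chat

# Relaunch with the original command once the slot is free.
xinference launch --model_path "/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat" \
    --model-engine Transformers -n glm4-chat
```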

Expected behavior

The LLM launches successfully.

chaoStart commented 1 hour ago

The cause is that an LLM was already running, so a new one could not be launched; the previously launched LLM has to be shut down first. This led me to wonder: can Xinference launch multiple LLMs at once, so that a specific LLM can be selected at call time?
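Xinference can keep several models loaded at the same time, provided each one can get its own slot (a separate GPU, or explicit placement). A rough sketch, where the second model name, its size, and the endpoint are all illustrative assumptions:

```bash
# Launch two LLMs side by side; this assumes enough GPU capacity for both
# (or explicit GPU placement via --gpu-idx).
xinference launch -n glm4-chat --model-engine Transformers
xinference launch -n qwen2-instruct --model-engine Transformers --size-in-billions 7

# Select a model per request through the OpenAI-compatible API: the "model"
# field picks which running model serves the call (the model UID defaults
# to the launch name).
curl http://127.0.0.1:9997/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "glm4-chat", "messages": [{"role": "user", "content": "hello"}]}'
```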