(xinference) (base) bqc@sciyon-LEGION-REN9000K-34IRZ:~/project/ragflow$ xinference launch --model_path "/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat" --model-engine Transformers -n glm4-chat
Launch model name: glm4-chat with kwargs: {'model_path': '/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat'}
Traceback (most recent call last):
File "/home/sciyonadmin/miniforge3/envs/xinference/bin/xinference", line 8, in
sys.exit(cli())
^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1157, in call
return self.main(args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 901, in model_launch
model_uid = client.launch_model(
^^^^^^^^^^^^^^^^^^^^
File "/home/sciyonadmin/miniforge3/envs/xinference/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 940, in launch_model
raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:58184, pid=1222936] No available slot found for the model
System Info
CUDA: 12.04
Running Xinference with Docker?
No; the traceback paths (miniforge3/envs/xinference) indicate a local conda environment.
Version info
0.15.4
The command used to start Xinference
xinference launch --model_path "/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat" --model-engine Transformers -n glm4-chat
Reproduction
Run the launch command shown above; it fails with the same RuntimeError traceback reproduced at the top of this report.
Expected behavior
The model launches successfully.
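For anyone hitting the same error: "No available slot found for the model" generally means the worker has no free GPU slot, for example because other models are already occupying the available GPUs. The commands below are a sketch of one way to investigate, not a confirmed fix; the model UID is a placeholder, and the availability of `--gpu-idx` should be verified with `xinference launch --help` on your version.

```shell
# List models currently running on this Xinference cluster;
# each running model occupies one or more GPU slots.
xinference list

# If a previously launched model is still holding a GPU, terminate it
# to free its slot (replace <model-uid> with a UID from the list above).
xinference terminate --model-uid <model-uid>

# Retry the launch, optionally pinning it to a specific GPU.
xinference launch \
  --model_path "/home/sciyonadmin/xinference/modelscope/hub/glm-4-9b-chat" \
  --model-engine Transformers -n glm4-chat --gpu-idx 0
```

If no slot can be freed, restarting the Xinference supervisor/worker also clears stale slot allocations.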