Running Xinference with Docker?
[ ] docker
[X] pip install
[ ] installation from source
Version info
xinference=0.15.4
The command used to start Xinference
xinference-local --host 0.0.0.0 --port 9997
Reproduction
The code is as follows:
from xinference.client import Client
client = Client("http://192.0.0.181:9997")
list_models_run = client.list_models()
model_uid = list_models_run['bge-m3']['id']
embedding_client = client.get_model(model_uid)
text_list = ...  # list of text chunks; each chunk is under 5K characters
text_list_len = len(text_list)
step = 100
for index in range(0, text_list_len, step):
text_embeddings = embedding_client.create_embedding(text_list[index:index + step])
The error is as follows:
File "/home/netted/img_process_ml/nlp/net/embed.py", line 34, in text_embed
text_embeddings = embedding_client.create_embedding(text_list[index:index + step])
File "/home/netted/anaconda3/envs/nlp/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 122, in create_embedding
raise RuntimeError(
RuntimeError: Failed to create the embeddings, detail: Remote server 192.0.0.181:40919 closed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/netted/img_process_ml/nlp/net/embed.py", line 68, in <module>
text_embed(text_units, embedding_client)
File "/home/netted/img_process_ml/nlp/net/embed.py", line 38, in text_embed
text_embeddings = embedding_client.create_embedding(text_list[index:index + step])
File "/home/netted/anaconda3/envs/nlp/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 122, in create_embedding
raise RuntimeError(
RuntimeError: Failed to create the embeddings, detail: Remote server 192.0.0.181:44667 closed
74%|███████▍ | 3271500/4425878 [00:12<00:04, 263091.43it/s]
Process finished with exit code 1
System Info
Linux vllm=0.5.2 python=3.10.0 CUDA Version: 12.0
Expected behavior
Fix this issue.
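Until the root cause is confirmed, one possible workaround (a sketch only, not a verified fix) is to shrink the batch and retry when the remote server drops the connection, since the failure occurs on batched `create_embedding` calls of 100 chunks. The helper below is hypothetical; `embed_fn` stands in for `embedding_client.create_embedding`:

```python
def embed_in_batches(texts, embed_fn, step=100, min_step=1):
    """Embed `texts` in batches of up to `step` items, halving the
    batch size whenever the call fails (e.g. the remote server
    closes the connection), down to `min_step` items per call."""
    results = []
    index = 0
    while index < len(texts):
        size = step
        while True:
            batch = texts[index:index + size]
            try:
                results.append(embed_fn(batch))
                break
            except RuntimeError:
                # Server closed the connection; retry with a smaller batch.
                if size <= min_step:
                    raise  # even the smallest batch fails, give up
                size = max(min_step, size // 2)
        index += size
    return results
```

Usage in the repro above would be `embed_in_batches(text_list, embedding_client.create_embedding)`. This only works around the symptom; if even single-item batches fail, the server side (e.g. worker memory) still needs investigation.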