Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
Hi all, quick question. I'm using the official Xinference Docker image. After starting the container with the command from the docs, I registered and launched a custom embedding model following this method, but it only prints "Launch model name" and then errors out. The Docker start command (based on the official docs) is:

docker run -v /app/modelFile:/modelDir -e XINFERENCE_HOME=/modelDir -p 8081:9997 --gpus '"device=1"' xinference:123 xinference-local -H 0.0.0.0
The contents of the model.json file are:

{
  "model_name": "custom-gte-large-en-v1-5",
  "dimensions": 1024,
  "max_tokens": 512,
  "language": ["en"],
  "model_id": "Alibaba-NLP/gte-large-en-v1.5",
  "model_uri": "/modelDir/huggingface/gte-large"
}
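Before registering, it may help to sanity-check the spec file, since Xinference expects underscore-separated keys (e.g. `model_name`, not `model name`). The key set below is an assumption based on the embedding spec shown above, and `missing_spec_keys` is a hypothetical helper, not part of Xinference:

```python
import json

# Assumed required keys for a custom embedding spec, taken from the
# model.json above; adjust to your Xinference version's documentation.
REQUIRED_KEYS = {"model_name", "dimensions", "max_tokens",
                 "language", "model_id", "model_uri"}

def missing_spec_keys(spec_json: str) -> set:
    """Return the required keys absent from an embedding-model spec."""
    spec = json.loads(spec_json)
    return REQUIRED_KEYS - spec.keys()

spec = """{
  "model_name": "custom-gte-large-en-v1-5",
  "dimensions": 1024,
  "max_tokens": 512,
  "language": ["en"],
  "model_id": "Alibaba-NLP/gte-large-en-v1.5",
  "model_uri": "/modelDir/huggingface/gte-large"
}"""
print(missing_spec_keys(spec))  # set() when the spec is well-formed
```

A spec written with spaces in the key names (as in a copy-paste from a rendered page) would report every key as missing and fail registration.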
The custom model comes from: https://hf-mirror.com/Alibaba-NLP/gte-large-en-v1.5/tree/main
The model directory (pwd) is /modelDir/huggingface/gte-large, a host directory mounted into the container via the -v flag at Docker startup.
The error output is as follows:

Launch model name: custom-gte-large-en-v1-5 with kwargs: {}
Traceback (most recent call last):
  File "/opt/conda/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 898, in model_launch
    model_uid = client.launch_model(
  File "/opt/conda/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 911, in launch_model
    raise RuntimeError(
... containing a file named configuration.py.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
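The final message is the Transformers error raised when a model that loads custom code (gte-large-en-v1.5 uses trust_remote_code) cannot find its remote-code files, so a local copy must include those .py files as well as the weights. A minimal sketch of a local check, assuming the file names listed in the hf-mirror repo (the helper and file list are illustrative, not part of Xinference):

```python
import os
import tempfile

# Assumed remote-code files for gte-large-en-v1.5; the error above
# specifically names configuration.py, which Transformers looks for
# when resolving the model's custom configuration class.
REMOTE_CODE_FILES = ["config.json", "configuration.py", "modeling.py"]

def missing_remote_code_files(model_dir: str) -> list:
    """List the remote-code files absent from a local model directory."""
    return [f for f in REMOTE_CODE_FILES
            if not os.path.isfile(os.path.join(model_dir, f))]

# Example: a directory holding only config.json and weights still
# lacks the custom-code files and would trigger the same error.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "config.json"), "w").close()
    print(missing_remote_code_files(d))  # ['configuration.py', 'modeling.py']
```

Running a check like this against /modelDir/huggingface/gte-large inside the container would distinguish an incomplete download from a mount-path problem.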