System Info
Ubuntu 22.04, CUDA 12.4
Running Xinference with Docker?
Version info
v0.15.2
The command used to start Xinference
docker run -d -e XINFERENCE_MODEL_SRC=modelscope -e XINFERENCE_HOME=/data/models -v /path:/data/models -p 9997:9997 --gpus all --name xinference_v0.15.2 crpi-aeuahu5q8412u6ti.cn-hangzhou.personal.cr.aliyuncs.com/xinferencetest/huancun:v0.15.2 xinference-local -H 0.0.0.0
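Reproduction
1. Open the model registration page.
2. Select GPTQ as the Model Format.
3. Switch the Model Format back to pytorch.
4. Fill in the remaining fields. The page then shows the message "Please fill in valid value for all fields".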
Expected behavior
Switching between Model Formats should not prevent the model from being registered.