-
As Xinference becomes popular, please add it as a ModelWrapper.
-
**Describe the bug**
When cluster.spec.componentSpecs.instances is set, the value of the KB_0_HOSTNAME env var does not match the pod's real hostname.
![Rjb3UBdv5e](https://github.com/user-attachments/assets/758e5…
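To confirm the mismatch, a minimal check run inside the affected pod could compare the injected variable against the kernel hostname. This is only a sketch; `hostname_mismatch` is a hypothetical helper, not part of KubeBlocks.

```python
import os
import socket

def hostname_mismatch(env_value, actual):
    """True when an injected hostname env var disagrees with the real hostname."""
    return env_value is not None and env_value != actual

if __name__ == "__main__":
    # Inside the affected pod this prints True when the bug reproduces.
    print(hostname_mismatch(os.environ.get("KB_0_HOSTNAME"),
                            socket.gethostname()))
```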
-
20220407PWEA:/tmp$ slim build --target xprobe/xinference:v0.14.0 --tag xprobe/xinference:v0.14.0-slim
cmd=build info=param.http.probe message='using default probe'
cmd=build state=started
cmd=build…
-
docker.io/apecloud/xinference:v0.11.0-cpu
docker.io/apecloud/xinference:v0.11.0
-
xprobe/xinference:v0.15.2
-
### Your current environment
On an A800 (80 GB VRAM) machine with 2 GPUs, I start two qwen-14B models, one model per GPU. The first model starts normally, but the problem occurs when starting the second model. The vLLM version is 0.3.3.
### 🐛 Describe the bug
WARNING 03-29 18:28:18 tokenizer.py:64] Using a slow tokeni…
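One common way to keep two model server processes from colliding on the same device is to pin each launch with `CUDA_VISIBLE_DEVICES`. The sketch below shows that idea under the assumption of one process per model; the launch command is a placeholder, and this is not a confirmed fix for this report.

```python
import os
import subprocess  # used in the hypothetical launch example below

def gpu_env(gpu_index):
    """Copy of the current environment restricted to a single visible GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env

# Hypothetical usage: launch one server process per GPU.
# subprocess.Popen(["<model-launch-command>"], env=gpu_env(0))
# subprocess.Popen(["<model-launch-command>"], env=gpu_env(1))
```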
-
### Describe your problem
I deployed models locally with Xinference, both LLM and embedding models, with API key authorization enabled.
When I configure RAGFlow for Xinference models, there's a 401 …
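For reference, OpenAI-compatible endpoints such as Xinference's expect the API key as a Bearer token, and a 401 typically means that header is missing or wrong on the client side. A minimal sketch of building such a request follows; the base URL, port, and model name are assumptions, not values from this report.

```python
def auth_headers(api_key):
    """Authorization header for an OpenAI-compatible endpoint (Bearer scheme)."""
    return {"Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"}

# Hypothetical usage with the standard library:
# import json, urllib.request
# req = urllib.request.Request(
#     "http://localhost:9997/v1/chat/completions",  # assumed Xinference base URL
#     data=json.dumps({"model": "my-llm", "messages": []}).encode(),
#     headers=auth_headers("sk-..."),
# )
```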
-
### Describe your problem
Adding a rerank model via Xinference 0.11.1.
![image](https://github.com/user-attachments/assets/fc6d2c00-5183-4f8d-8a3a-f279e2086724)
ragflow logs: [INFO] [2024-09-06 16:17:18…
-
### System Info / 系統信息
CUDA == 12.1
transformers == 4.44.2
llama_cpp_python == 0.2.90
vllm == 0.6.1.post2
vllm-flash-attn == 2.6.1
Python == 3.10.14
Ubuntu == 24.04
### Running Xinference wit…
-
### System Info / 系統信息
CUDA: 12.2
llama-cpp-python: llama_cpp_python-0.2.88-cp311-cp311-linux_x86_64.whl
Python: 3.11
Ubuntu: 22.04.4
### Running Xinference with Docker? / 是否使用 Docker 运行 Xinfer…