Is your feature request related to a problem? Please describe
No.
Describe the solution you'd like
In Kubernetes, services can be reached via a headless service name rather than only an IP address. However, neither xinference-supervisor nor xinference-worker currently accepts a service name for the `--host` option. Please support using a service name as the host; this would greatly simplify deployment on Kubernetes.
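One possible approach (a minimal sketch, not Xinference's actual code; the helper name is hypothetical): resolve the supplied host to an IP address before binding, so that `--host` accepts either an IP or a DNS name such as a headless service name.

```python
import socket

def resolve_host(host: str) -> str:
    """Resolve a host name (e.g. a K8s headless service name) to an IPv4 address.

    Hypothetical helper for illustration only. If `host` is already an
    IPv4 address it is returned unchanged; otherwise a DNS lookup is
    performed, which is what makes service names like
    "xinference-supervisor" usable.
    """
    try:
        # Already a valid IPv4 literal? Return it as-is.
        socket.inet_aton(host)
        return host
    except OSError:
        # A DNS name; resolve it (in K8s this hits cluster DNS).
        return socket.gethostbyname(host)
```

With such a step in place, the supervisor and worker could bind to the resolved address while workers still advertise the stable service name to each other.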
Describe alternatives you've considered
n/a
Additional context
FastChat supports headless-mode access; its implementation could serve as a reference.