xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Deploying on a K8s cluster as described in the official docs: kbcli addon enable xinference reports "not found" #1852

Open Fun-Fox opened 4 months ago

Fun-Fox commented 4 months ago

System Info

image

Running Xinference with Docker?

Version info

Latest version

The command used to start Xinference

kbcli addon enable xinference

Reproduction

kbcli addon enable xinference

Expected behavior

Runs normally.

lynnleelhl commented 4 months ago

The addon installation may have failed. Try kbcli addon install xinference

Fun-Fox commented 4 months ago

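@lynnleelhl Thanks, that worked.

My K8s cluster runs on workstation GPUs across 4 nodes: one 1080 Ti (master), two 3080s (workers), and one 4060 (worker).

That setup raises a few use cases I'd like to evaluate. Can all of them be supported?

image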

Fun-Fox commented 4 months ago

I also ran into some new problems.

Error 1:

root@node4:~# kbcli cluster create xinference 123a
error: execution error at (xinference-cluster/templates/cluster.yaml:4:11): Release name "123a" is invalid. It must match the regex "^[a-z]([-a-z0-9]*[a-z0-9])?$".

Error 2:

root@node4:~# kbcli cluster create xinference a123
Info: --version is not specified, xinference-0.11.0 is applied by default.
The Cluster "a123" is invalid: spec.componentSpecs[0].volumeClaimTemplates[0].spec.resources.requests.storage: Invalid value: "<nil>Gi": spec.componentSpecs[0].volumeClaimTemplates[0].spec.resources.requests.storage iE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$'

Starting the cluster name with a digit avoids error 2 but triggers error 1. Starting the cluster name with a letter triggers error 2.
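Error 1 comes purely from the name check, so it can be reproduced locally before calling kbcli. The following is a minimal sketch that tests a candidate name against the regex quoted in the error message; the function name is_valid_release_name is hypothetical and not part of kbcli.

```shell
#!/bin/sh
# is_valid_release_name NAME: succeed if NAME matches the regex from error 1
# (lowercase letter first, then letters, digits or hyphens, no trailing hyphen).
is_valid_release_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z]([-a-z0-9]*[a-z0-9])?$'
}

# Check the two names tried in this thread.
for name in 123a a123; do
  if is_valid_release_name "$name"; then
    echo "$name: valid"
  else
    echo "$name: invalid"
  fi
done
```

This only covers the naming rule. A name that passes, such as a123, can still hit error 2, which concerns the storage size value rather than the name.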

image

Also, how do I specify --version? The message says the default is 0.11.0. How can I change it to 0.13.0?

lynnleelhl commented 4 months ago

Try the command kbcli cluster create xxx --cluster-definition xinference instead. It can also create the cluster.

lynnleelhl commented 4 months ago

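To change the version, use the --cluster-version parameter.

kbcli cluster create -h shows descriptions of the other parameters.

Note that there is currently no 0.13 version, only 0.11. We can add it if you need it.

To list the available versions, run kbcli cv list --cluster-definition=xinference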
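Put together, the two commands mentioned above could be wrapped as follows. This is a sketch under assumptions: the wrapper function names are hypothetical, the cluster name a123 is only an example, and the version string xinference-0.11.0 is taken from the kbcli Info message earlier in this thread, so check the output of the list command for the exact names in your environment.

```shell
#!/bin/sh
# list_versions: show the cluster versions registered for the xinference definition.
list_versions() {
  kbcli cv list --cluster-definition=xinference
}

# create_cluster NAME VERSION: create a cluster pinned to VERSION.
# NAME must start with a lowercase letter (see error 1 in this thread).
create_cluster() {
  kbcli cluster create "$1" --cluster-definition xinference --cluster-version "$2"
}

# Run only when kbcli is installed, so the file can also be sourced safely.
if command -v kbcli >/dev/null 2>&1; then
  list_versions
  create_cluster a123 xinference-0.11.0
fi
```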

Fun-Fox commented 4 months ago

image

error: failed to find the default storageClass, use '--set storageClass=NAME' to set it

I get this message.

lynnleelhl commented 4 months ago

Your K8s environment has no StorageClass. In a local environment you need to create a default StorageClass yourself. See the official Kubernetes documentation: https://kubernetes.io/docs/concepts/storage/storage-classes/
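As a rough illustration of what "a default StorageClass" means, the sketch below writes a manifest carrying the storageclass.kubernetes.io/is-default-class annotation and applies it if kubectl is available. The class name local-storage and the file name are hypothetical. The kubernetes.io/no-provisioner provisioner is an assumption chosen for a bare local cluster: it does no dynamic provisioning, so matching PersistentVolumes would still have to be created by hand, and a cluster with a real provisioner should use that instead.

```shell
#!/bin/sh
# Write a StorageClass manifest that is marked as the cluster default.
cat > local-storageclass.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage   # hypothetical name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/no-provisioner   # static local volumes only
volumeBindingMode: WaitForFirstConsumer
EOF

# Apply it and list the classes; the new one should be shown as "(default)".
if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f local-storageclass.yaml
  kubectl get storageclass
fi
```

Alternatively, the error message itself suggests passing --set storageClass=NAME to kbcli to pick an existing class explicitly instead of relying on a default.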

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 7 days with no activity.
