Project-HAMi / HAMi

Heterogeneous AI Computing Virtualization Middleware
http://project-hami.io/
Apache License 2.0

HAMi installs successfully and its services run normally, but GPU compute is not split #606

Open ak47947 opened 1 week ago

ak47947 commented 1 week ago

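What happened: I set up a Kubernetes GPU environment with GPU Operator and then installed the HAMi plugin. The services installed normally, but the GPU count still shows 1, and the GPU is not split inside the container either.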

What you expected to happen: The GPU count shows 10 shares (the default), and resources are limited inside the container.

How to reproduce it (as minimally and precisely as possible): Set up a Kubernetes GPU environment using GPU Operator.

Anything else we need to know?:

  1. After installation, the services run normally

    1
  2. The GPU is not split

    2

Environment:

archlitchi commented 6 days ago

have you uninstalled nvidia-k8s-device-plugin before installing HAMi?

ak47947 commented 6 days ago

have you uninstalled nvidia-k8s-device-plugin before installing HAMi?

image

It is already installed. Do I need to uninstall it?

ak47947 commented 6 days ago

I solved it by uninstalling HAMi with helm uninstall hami -n kube-system and then reinstalling it. The GPU information is now visible.

image

Inside the container, the isolation information is now visible as well.

image
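
For anyone who wants to check the same thing without screenshots, here is a minimal sketch that reads the output of `kubectl get nodes -o json` and reports how many GPU shares each node advertises. It assumes the resource is exposed under the name `nvidia.com/gpu` and that the split count is the default of 10 mentioned above; both may differ in your install. The function name `gpu_allocatable` and the sample node names are made up for illustration.

```python
import json

# Assumption: HAMi exposes GPU shares under this resource name.
# Change it if your install uses a different resource name.
GPU_RESOURCE = "nvidia.com/gpu"


def gpu_allocatable(nodes_json: str, resource: str = GPU_RESOURCE) -> dict:
    """Map node name -> allocatable count of `resource`.

    `nodes_json` is the text printed by `kubectl get nodes -o json`.
    Nodes that do not advertise the resource are reported as 0.
    """
    result = {}
    for node in json.loads(nodes_json)["items"]:
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes reports quantities as strings, e.g. "10".
        result[name] = int(allocatable.get(resource, "0"))
    return result


# Hypothetical sample: one node with a split GPU, one node without GPUs.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "gpu-node"},
         "status": {"allocatable": {"cpu": "8", "nvidia.com/gpu": "10"}}},
        {"metadata": {"name": "cpu-node"},
         "status": {"allocatable": {"cpu": "4"}}},
    ]
})

for node_name, count in gpu_allocatable(sample).items():
    print(f"{node_name}: {count}")
```

With one physical GPU and the default split, a count of 1 would match the original symptom (no splitting), while 10 would match the expected result. To run it against a real cluster, pass in the text from `kubectl get nodes -o json` instead of the sample.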
ak47947 commented 6 days ago

I found a new problem. It may be caused by adding and removing GPUs on the host across power cycles: after a GPU is added or removed, HAMi stops working.