AliyunContainerService / gpushare-scheduler-extender

GPU Sharing Scheduler for Kubernetes Cluster
Apache License 2.0
1.39k stars 308 forks

aliyun.com/gpu-mem is 0 #141

Open alanyanglong opened 3 years ago

alanyanglong commented 3 years ago

After installing according to the documentation, I ran kubectl describe node and found aliyun.com/gpu-mem: 0.
Could someone tell me what is wrong here?

alanyanglong commented 3 years ago

Capacity:
  aliyun.com/gpu-count: 1
  aliyun.com/gpu-mem: 0
  cpu: 8
  ephemeral-storage: 109691332Ki
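
The symptom above (a GPU is counted but no shareable memory is advertised) can be checked programmatically. The following is a minimal sketch, not part of the project: the helper names `gpu_mem_missing` and `node_capacity` are hypothetical, and it assumes `kubectl` is on the PATH with access to the cluster.

```python
import json
import subprocess

# Resource names as they appear in the node capacity reported in this issue.
GPU_MEM_KEY = "aliyun.com/gpu-mem"
GPU_COUNT_KEY = "aliyun.com/gpu-count"


def gpu_mem_missing(capacity):
    """Return True when the node reports GPUs but zero shareable GPU memory.

    `capacity` is the node's status.capacity mapping; Kubernetes serializes
    its values as strings, so they are converted to int here.
    """
    count = int(capacity.get(GPU_COUNT_KEY, "0"))
    mem = int(capacity.get(GPU_MEM_KEY, "0"))
    return count > 0 and mem == 0


def node_capacity(node_name):
    """Fetch status.capacity for one node via kubectl (hypothetical helper)."""
    out = subprocess.run(
        ["kubectl", "get", "node", node_name, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["status"]["capacity"]
```

For the capacity shown in this comment, `gpu_mem_missing({"aliyun.com/gpu-count": "1", "aliyun.com/gpu-mem": "0"})` is true, which matches the reported problem.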

2811299 commented 3 years ago

Rebooting the system solved the same issue for me. I'm not sure what the root cause is or whether there is a better solution.

Toxictoma commented 3 weeks ago

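For multi-GPU resource allocation, set the unit to GiB. On a multi-GPU node, using MiB as the unit may exceed a k8s limit. Setting the resource allocation unit to GiB and then restarting the related pods resolves the issue.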
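
To illustrate why the unit matters on multi-GPU nodes, here is a rough arithmetic sketch. The function name `advertised_units` and the hardware figures (8 GPUs with 16 GiB each) are hypothetical, and the comment above does not say which Kubernetes limit is exceeded, so this only shows how much larger the advertised quantity becomes in MiB.

```python
MIB_PER_GIB = 1024


def advertised_units(gpu_count, mem_per_gpu_gib, unit):
    """Total aliyun.com/gpu-mem quantity a node would report for a given unit.

    Assumes every GPU on the node has the same amount of memory.
    """
    total_gib = gpu_count * mem_per_gpu_gib
    if unit == "GiB":
        return total_gib
    if unit == "MiB":
        return total_gib * MIB_PER_GIB
    raise ValueError(f"unsupported unit: {unit}")


# Hypothetical node: 8 GPUs with 16 GiB each.
print(advertised_units(8, 16, "GiB"))  # 128
print(advertised_units(8, 16, "MiB"))  # 131072
```

With these assumed figures the quantity is 1024 times larger in MiB than in GiB, which is consistent with the suggestion to switch the unit to GiB.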