Open alanyanglong opened 3 years ago
Capacity:
  aliyun.com/gpu-count: 1
  aliyun.com/gpu-mem: 0
  cpu: 8
  ephemeral-storage: 109691332Ki
Rebooting the system solved the same issue here. I'm not sure what the root cause is or whether there is a better solution.
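With multiple GPUs and the memory unit set to MiB, the value may exceed a k8s limit. Setting the resource allocation unit to GiB and then restarting the related pods resolves it.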
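
To give a rough sense of why the unit matters, here is a minimal arithmetic sketch. It assumes the node's `aliyun.com/gpu-mem` total is simply GPU count times per-GPU memory expressed in the chosen unit. The function name and the 8 x 16 GiB node are hypothetical, and the exact k8s limit mentioned above is not stated in this thread, so none is encoded here.

```python
MIB_PER_GIB = 1024


def advertised_units(gpu_count: int, mem_per_gpu_mib: int, unit: str) -> int:
    """Total gpu-mem units a node would report under the given unit.

    Assumption: total = gpu_count * per-GPU memory in that unit.
    """
    if unit == "MiB":
        per_gpu = mem_per_gpu_mib
    elif unit == "GiB":
        # Integer division: partial GiB are dropped in this sketch.
        per_gpu = mem_per_gpu_mib // MIB_PER_GIB
    else:
        raise ValueError(f"unsupported unit: {unit!r}")
    return gpu_count * per_gpu


if __name__ == "__main__":
    # Hypothetical node: 8 GPUs with 16 GiB (16384 MiB) each.
    for unit in ("MiB", "GiB"):
        print(unit, advertised_units(8, 16384, unit))
```

For this hypothetical node the MiB total is 1024 times larger than the GiB total, which is consistent with the observation that switching to GiB avoids the problem on multi-GPU machines.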
After installing according to the documentation, `kubectl describe node` shows aliyun.com/gpu-mem: 0.
Could anyone tell me what is going wrong here?
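
For anyone checking several nodes for the same symptom, here is a small sketch that scans the output of `kubectl get nodes -o json` for nodes reporting `aliyun.com/gpu-mem` as 0. The helper name is hypothetical, and it assumes the capacity value is the plain string "0", as in the describe output above.

```python
import json
import sys

GPU_MEM_KEY = "aliyun.com/gpu-mem"


def nodes_with_zero_gpu_mem(node_list: dict) -> list:
    """Return names of nodes whose capacity lists gpu-mem as 0.

    `node_list` is the parsed JSON from `kubectl get nodes -o json`.
    Nodes that do not expose the resource at all are skipped.
    """
    names = []
    for node in node_list.get("items", []):
        capacity = node.get("status", {}).get("capacity", {})
        if capacity.get(GPU_MEM_KEY) == "0":
            names.append(node["metadata"]["name"])
    return names


if __name__ == "__main__":
    # Usage: kubectl get nodes -o json | python check_gpu_mem.py
    for name in nodes_with_zero_gpu_mem(json.load(sys.stdin)):
        print(name)
```

This only reports the symptom. It does not tell you whether the cause is the memory unit, the device plugin pod, or something else.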