tkestack / tke

Native Kubernetes container management platform supporting multi-tenant and multi-cluster
Other
1.47k stars 335 forks

Error when adding a GPU node to TKE #1628

Open cloudcache opened 3 years ago

cloudcache commented 3 years ago

Adding a GPU node fails with the following error:

    Machine.platform.tkestack.io "mc-2nd9f98g" is invalid: spec.labels: Invalid value: map[string]string{"nvidia-device-enable":"enable"}: must have GPU card if set GPU label

However, the host does have GPU cards and the driver is installed correctly. nvidia-smi works as expected:

    nvidia-smi
    Tue Oct 19 09:07:39 2021
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:21:01.0 Off |                    0 |
    | N/A   38C    P0    25W /  70W |      0MiB / 15109MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   1  Tesla T4            Off  | 00000000:21:02.0 Off |                    0 |
    | N/A   37C    P0    25W /  70W |      0MiB / 15109MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+

Environment:

TianFengshou commented 2 years ago

I'm running into the same problem now. Was this issue ever resolved?