Updating the node label for config-manager makes k8s-device-plugin keep printing an error message without stopping

NVIDIA / k8s-device-plugin

NVIDIA device plugin for Kubernetes

Apache License 2.0

If I use the nvidia.com/device-plugin.config node label to select a config, setting it to config0 and then, a few minutes later, to config1, k8s-device-plugin keeps printing the following message and does not stop:

health.go:142] Error waiting for event: ERROR_UNKNOWN; Marking all devices as unhealthy
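
The label switch described above can be sketched as follows. This is only a minimal reproduction sketch, assuming `kubectl` is on the PATH and has access to the cluster; the node name `gpu-node-1`, the helper names `label_cmd` and `switch_configs`, and the 180-second wait are hypothetical and not taken from this report.

```python
import subprocess
import time

# Node label named in this report; its value selects the device-plugin config.
LABEL_KEY = "nvidia.com/device-plugin.config"


def label_cmd(node: str, config: str) -> list[str]:
    """Build the kubectl argv that points `node` at the named config."""
    return ["kubectl", "label", "node", node, f"{LABEL_KEY}={config}", "--overwrite"]


def switch_configs(node, first="config0", second="config1",
                   wait_seconds=180, run=subprocess.run):
    """Apply `first`, wait, then apply `second`; return the commands issued.

    `run` is injectable so the sequence can be dry-run without a cluster.
    """
    issued = []
    for index, config in enumerate((first, second)):
        if index > 0:
            time.sleep(wait_seconds)  # "after minutes" in the report; exact delay is assumed
        cmd = label_cmd(node, config)
        run(cmd, check=True)
        issued.append(cmd)
    return issued
```

Calling `switch_configs("gpu-node-1")` against a real cluster should apply config0, wait, and then apply config1, after which the device-plugin pod logs can be checked for the health.go message above.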

k8s-device-plugin version: v0.15.0-rc.2
GPU driver: 535.129.03
GPU: Tesla P100-PCIE-16GB

I also found that GPU driver 470.129.06 does not have the set_default_device_pinned_mem_limit command parameter. Is there a minimum GPU driver version required for the GPU memory limit? Also, is it possible to monitor the GPU utilization for each MPS client independently?

Updating the node label for config-manager makes k8s-device-plugin keep printing an error message without stopping #669