Open zh168654 opened 6 years ago
@zh168654 have you found any workaround or clues? I am facing a similar error:
Failed to collect metrics: nvml: Not Supported
My driver version is 390.59 and the GPU is a Tesla K80.
The error does NOT occur in another environment whose GPU is a GTX 1080.
Hi, I have the same problem. I think this error is the reason the exporter cannot get metrics. My driver version is 390.48, with two GTX 980 cards. The server OS is Ubuntu 16.04.
I'm running into the same problem. I suspect it's because the Docker image is built with Alpine (and hence musl libc) while Nvidia's NVML library (libnvidia-ml.so) depends on glibc.
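One way to test the libc suspicion is to ask the dynamic loader directly and read its error message. A rough Python sketch, meant to be run inside the same image and assuming a Python interpreter is available there (the thread does not say so). `try_load` is a name made up for this example; `ctypes.CDLL` is the standard-library wrapper around dlopen.

```python
import ctypes


def try_load(lib_name="libnvidia-ml.so.1"):
    """Attempt to dlopen lib_name and return (loaded, detail)."""
    try:
        ctypes.CDLL(lib_name)
    except OSError as exc:
        # The loader's message usually says whether the file was not found
        # or whether it was found but could not be linked.
        return False, str(exc)
    return True, "loaded"


if __name__ == "__main__":
    ok, detail = try_load()
    print("libnvidia-ml.so.1:", "OK" if ok else "FAILED", "-", detail)
```

If the message is about the file not being found, the search path is the more likely culprit; if it is about unresolved symbols or a missing dependency, that would point toward the libc mismatch.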
This is my deployment:
When I exec into nvidia-exporter and run
ls /usr/local/nvidia/lib64
libnvidia-ml.so.1 is there, but the container logs always show
Failed to collect metrics: could not load NVML library
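The file being present does not by itself mean the loader can find it, because /usr/local/nvidia/lib64 is not a default library search directory. A small Python sketch that checks both conditions, assuming the directory is expected to be listed in LD_LIBRARY_PATH (it could also be configured through the loader's own path files, which this does not check). `diagnose` is a hypothetical helper name.

```python
import os


def diagnose(lib_dir="/usr/local/nvidia/lib64",
             lib_name="libnvidia-ml.so.1",
             env=None):
    """Report whether the library file exists and whether its directory
    is listed in LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a colon-separated list; ignore empty entries.
    raw = env.get("LD_LIBRARY_PATH", "")
    search_dirs = [os.path.normpath(d) for d in raw.split(":") if d]
    return {
        "file_exists": os.path.exists(os.path.join(lib_dir, lib_name)),
        "dir_in_ld_library_path": os.path.normpath(lib_dir) in search_dirs,
    }


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

If both come back true and the load still fails, the search path is probably not the issue and the loader error itself is worth reading.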