NVIDIA / gpu-monitoring-tools

Tools for monitoring NVIDIA GPUs on Linux
Apache License 2.0

DCGM_FI_DEV_GPU_UTIL Abnormal Output #152

Open Jea-Eok-Kim opened 3 years ago

Jea-Eok-Kim commented 3 years ago

Hello. I want to build a GPU monitoring system in Grafana using dcgm-exporter and Prometheus, but the GPU utilization value is displayed incorrectly.

The Grafana output is as follows.

Screenshot 2021-01-25 9:52:46 AM

As you can see on the screen, GPU utilization is 9223372036854776000%.
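
For reference, the displayed number appears to be what a value near the signed 64-bit maximum (2^63 - 1) looks like after conversion to a double, so I suspect it is a placeholder rather than a real measurement. The placeholder interpretation is my assumption, not something confirmed by the DCGM output, and the helper name `is_plausible_utilization` is hypothetical. A minimal Python sketch of the arithmetic:

```python
# Illustrative sketch only; nothing here calls DCGM or Prometheus.
INT64_MAX = 2**63 - 1  # largest signed 64-bit integer

# Value shown on the Grafana panel, copied from the dashboard.
reported = 9223372036854776000


def is_plausible_utilization(value: float) -> bool:
    """Hypothetical helper: True if value lies in the valid 0-100 % range."""
    return 0.0 <= value <= 100.0


# A double has a 53-bit mantissa, so INT64_MAX rounds up to 2**63 when
# converted, and the dashboard value rounds to the same double.
same_double = float(INT64_MAX) == float(reported)

print(same_double)                         # True
print(is_plausible_utilization(reported))  # False
```

If this reading is right, a filter such as the one above could at least keep the value out of a dashboard, but it would not provide the real utilization.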

The output of the dcgmi dmon -e 203 (DCGM_FI_DEV_GPU_UTIL) command is as follows:

Screenshot 2021-01-25 10:03:10 AM

MIG is enabled, as shown below.

Screenshot 2021-01-25 9:49:55 AM

Please let me know if there is a way to obtain GPU utilization even when MIG is enabled.

Please help me.