Closed 287400117 closed 6 months ago
This is expected behavior. DCGM diagnostics were removed as they increase the size of the container and are not used by DCGM-Exporter. If DCGM diagnostics are needed, the standalone DCGM container has that functionality.
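To confirm that a given image has had the diagnostics removed, one option is to check for the diagnostics directory from inside the container. The following is a minimal sketch, not part of DCGM or DCGM-Exporter: the function name check_diag_dir is hypothetical, and the default path is the directory the issue reports as being deleted in the Dockerfile, which may differ between versions.

```shell
#!/bin/sh
# Hypothetical helper: report whether the DCGM diagnostics directory exists.
# Assumption: the default path is the one named in the issue report;
# pass a different path as the first argument if your image differs.
check_diag_dir() {
    dir="${1:-/usr/share/nvidia-validation-suite}"
    if [ -d "$dir" ]; then
        echo "present: $dir"
        return 0
    else
        echo "missing: $dir"
        return 1
    fi
}
```

Calling check_diag_dir with no arguments inside the dcgm-exporter container and getting a "missing" result would be consistent with the behavior described above; in that case the standalone DCGM container is the place to run diagnostics.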
What is the version?
3.3.5-3.4.1
What happened?
Running dcgmi diag -r 3 in dcgm-exporter fails with an error at the prompt. Troubleshooting revealed that a section of the Dockerfile deletes the /usr/share/nvidia-validation-suite directory after installing datacenter-gpu-manager.
What did you expect to happen?
The command dcgmi diag -r 3 executes normally.
What is the GPU model?
No response
What is the environment?
No response
How did you deploy the dcgm-exporter and what is the configuration?
No response
How to reproduce the issue?
No response
Anything else we need to know?
No response