Open mintchocohoco opened 2 years ago
Hello, I am using a P100 GPU, and the fields with IDs above 1000 (1002, 1003, 1004, 1005, ...) do not work. Setting watches on them fails with this error:
Error setting watches. Result: -33: This request is serviced by a module of DCGM that is not currently loaded
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update && sudo apt-get install -y datacenter-gpu-manager
sudo systemctl --now enable nvidia-dcgm
@mintchocohoco,
The DCP metrics (1001...) are supported starting from the Turing architecture. Pascal is not supported, so these fields are unavailable on a P100. Some metrics (1013, 1014) require at least an Ampere GA100 chip.
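For anyone scripting around this, the rule above can be sketched as a small pre-check that runs before any fields are requested. This is only an illustration of what the reply says, not part of DCGM: the names `ARCH_RANK`, `GA100_ONLY_FIELDS`, and `field_supported` are hypothetical, only the architectures named in this thread are covered, and "Ampere" stands in for the GA100 requirement.

```python
# Sketch only: encodes the rules stated in the reply, not an official DCGM API.
# Only the architectures named in this thread are listed (oldest to newest).
ARCH_RANK = {"pascal": 0, "turing": 1, "ampere": 2}

# Fields the reply says need at least an Ampere GA100 chip.
# Assumption: "ampere" is used here as a stand-in for GA100.
GA100_ONLY_FIELDS = {1013, 1014}


def is_profiling_field(field_id: int) -> bool:
    """Field IDs above 1000 are the DCP metrics discussed in this thread."""
    return field_id > 1000


def field_supported(field_id: int, arch: str) -> bool:
    """Return True if the reply's rule allows this field on this architecture."""
    # Raises KeyError for architectures this thread does not cover.
    rank = ARCH_RANK[arch.lower()]
    if not is_profiling_field(field_id):
        # Assumption: the restriction in the reply applies only to DCP fields.
        return True
    if field_id in GA100_ONLY_FIELDS:
        return rank >= ARCH_RANK["ampere"]
    return rank >= ARCH_RANK["turing"]


if __name__ == "__main__":
    # The fields from the original report, checked against a Pascal GPU.
    for fid in (1002, 1003, 1004, 1005):
        print(fid, field_supported(fid, "pascal"))
```

Filtering the requested field list through a check like this avoids hitting the -33 error on a Pascal card in the first place.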