ROCm / rocm_smi_lib

ROCm SMI LIB
https://rocm.docs.amd.com/projects/rocm_smi_lib/en/latest/
MIT License

`rocm-smi -a` fails with error code 2 #74

Closed: misos1 closed this issue 8 months ago

misos1 commented 3 years ago

This never happened with previous versions. Would it be better to ignore information that cannot be queried, and report errors only when the user explicitly requests that information with an option such as --showpagesinfo on the command line?

The error code is probably returned because of this output:

================================== Pages Info ==================================
ERROR: 2 GPU[0]: ras: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
============================ Show Valid sclk Range =============================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[0]      : Unable to display sclk range
ERROR: 2 GPU[1]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[1]      : Unable to display sclk range
================================================================================
============================ Show Valid mclk Range =============================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[0]      : Unable to display mclk range
ERROR: 2 GPU[1]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[1]      : Unable to display mclk range
================================================================================
=========================== Show Valid voltage Range ===========================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[0]      : Unable to display voltage range
ERROR: 2 GPU[1]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[1]      : Unable to display voltage range
================================================================================
============================= Voltage Curve Points =============================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[0]      : Voltage Curve is not supported
ERROR: 2 GPU[1]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment. 
GPU[1]      : Voltage Curve is not supported
================================================================================
WARNING:         One or more commands failed
============================= End of ROCm SMI Log ==============================
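The behavior proposed above (skip unsupported queries in an aggregate run such as `-a`, but still fail when the user explicitly asked for that query) could be sketched roughly as follows. This is a minimal illustration, not the actual rocm-smi implementation: `Status`, `run_queries`, and the query names are hypothetical, and only the numeric value 2 for the not-supported status is taken from the log above.

```python
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical stand-in for the library's status codes.
    SUCCESS = 0
    NOT_SUPPORTED = 2  # matches "ERROR: 2 ... RSMI_STATUS_NOT_SUPPORTED" in the log


def run_queries(queries, explicit):
    """Run every query and decide the process exit code.

    queries:  dict mapping a query name to a callable returning (Status, value)
    explicit: set of query names the user requested by a specific flag
              (as opposed to an aggregate flag such as -a)
    """
    exit_code = 0
    lines = []
    for name, query in queries.items():
        status, value = query()
        if status == Status.SUCCESS:
            lines.append(f"{name}: {value}")
        elif status == Status.NOT_SUPPORTED and name not in explicit:
            # Aggregate run: report the gap but do not treat it as a failure.
            lines.append(f"{name}: N/A (not supported)")
        else:
            # Explicitly requested, or a different error: keep it as a failure.
            lines.append(f"ERROR: {int(status)} {name}")
            exit_code = int(status)
    return exit_code, lines


if __name__ == "__main__":
    # Hypothetical queries: one that works, one the hardware does not support.
    demo = {
        "temperature": lambda: (Status.SUCCESS, "45.0c"),
        "pagesinfo": lambda: (Status.NOT_SUPPORTED, None),
    }
    print(run_queries(demo, explicit=set()))          # aggregate run
    print(run_queries(demo, explicit={"pagesinfo"}))  # explicit request
```

With this split, an aggregate run still exits with 0 when the only problem is an unsupported feature, while asking for that feature directly keeps the non-zero exit code so scripts can detect it.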

https://github.com/RadeonOpenCompute/ROC-smi/issues/95

dmitrii-galantsev commented 8 months ago

Seems fixed in the latest version.