ROCm / ROC-smi

ROC System Management Interface
https://github.com/RadeonOpenCompute/ROC-smi/blob/master/README.md

rocm-smi crashing #12

Closed: syifan closed this issue 6 years ago

syifan commented 7 years ago

I keep getting this error when I run rocm-smi. Can anyone see what the problem is?

Traceback (most recent call last):
  File "/opt/rocm/bin/rocm-smi", line 1009, in <module>
    showCurrentFans(deviceList)
  File "/opt/rocm/bin/rocm-smi", line 545, in showCurrentFans
    fanspeed = getFanSpeed(device, 'speed')
  File "/opt/rocm/bin/rocm-smi", line 311, in getFanSpeed
    currFan = int(fanfile.read().rstrip('\n'))
OSError: [Errno 22] Invalid argument
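
The traceback shows the `read()` on the fan file itself failing with `OSError`, before `int()` is ever applied. Below is a minimal sketch of how such a read could be guarded. It is not the actual rocm-smi code: the helper name `read_fan_value` and its `path` parameter are hypothetical, and returning `None` on failure is an assumption made for illustration.

```python
import errno


def read_fan_value(path):
    """Return the integer stored in a sysfs-style file, or None if unreadable.

    Hypothetical helper for illustration; not part of rocm-smi.
    """
    try:
        with open(path) as fanfile:
            return int(fanfile.read().strip())
    except OSError as err:
        # The open or read can be rejected by the kernel (for example
        # EINVAL, as in the traceback above), so report it and carry on.
        name = errno.errorcode.get(err.errno, str(err.errno))
        print(f"cannot read {path}: {name}")
        return None
    except ValueError:
        # The file was readable but did not contain an integer.
        return None
```

A caller would then need to check for `None` before formatting the value, rather than letting the exception end the whole report.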

antonl1911 commented 7 years ago

I also experience this when trying to reset the compute profile on an MI25.

kentrussell commented 6 years ago

Can you guys give it a shot with 1.7? I did a pretty big rework on it, so hopefully this should be handled now. Thanks!

kentrussell commented 6 years ago

If neither of you has an issue with the 1.7 version, I'll close this issue. Thanks!

kentrussell commented 6 years ago

I have a fix for this being tested internally, and it should get pushed for 1.7.1. Thanks again for the feedback.

syifan commented 6 years ago

I just updated to ROCm 1.7, and now the error looks like this. It seems closely related to the previous one. I am waiting for ROCm 1.7.1 to solve the problem.

====================    ROCm System Management Interface    ====================
================================================================================
 GPU  Temp    AvgPwr   SCLK     MCLK     Fan      Perf    SCLK OD
Traceback (most recent call last):
  File "/opt/rocm/bin/rocm-smi", line 1058, in <module>
    showAllConcise(deviceList)
  File "/opt/rocm/bin/rocm-smi", line 728, in showAllConcise
    fan = str(getFanSpeed(device))
  File "/opt/rocm/bin/rocm-smi", line 358, in getFanSpeed
    fanLevel = int(getSysfsValue(device, 'fan'))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
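
The `TypeError` indicates that `getSysfsValue(device, 'fan')` returned `None` and that the result went straight into `int()`. Below is a minimal sketch of a guard for that case. It is not the fix that shipped: the function name `format_fan_level` and the `"N/A"` placeholder are assumptions made for illustration.

```python
def format_fan_level(raw):
    """Turn a raw sysfs string (or None) into a display string.

    Hypothetical helper for illustration; not part of rocm-smi.
    """
    if raw is None:
        # The value could not be read, so show a placeholder
        # instead of passing None to int().
        return "N/A"
    try:
        return str(int(raw))
    except ValueError:
        # The value was read but is not a valid integer.
        return "N/A"
```

With a guard like this, one unreadable fan file would produce a placeholder in the table row rather than aborting the whole concise view.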

gstoner commented 6 years ago

@syifan we released beta4 this weekend; 1.7.1 will most likely go out mid next week.

kentrussell commented 6 years ago

Can you try it out with the 1.7.1 release that just came out?

kentrussell commented 6 years ago

Assuming that it's fine, since I haven't heard back in 2 months.