ROCm / ROC-smi

ROC System Management Interface
https://github.com/RadeonOpenCompute/ROC-smi/blob/master/README.md

NameError: name 'y' is not defined #39

Closed calvintam236 closed 6 years ago

calvintam236 commented 6 years ago

ROCm 1.8.199 Ubuntu 18.04.1 RX470 x2 Ryzen 1700X

$  /opt/rocm/bin/rocm-smi

====================    ROCm System Management Interface    ====================
================================================================================
 GPU  Temp    AvgPwr   SCLK     MCLK     Fan      Perf    SCLK OD    MCLK OD
  1   67c     97.121W  1236Mhz  1750Mhz  100.0%   manual    0%         0%       
  0   73c     89.208W  1236Mhz  1750Mhz  100.0%   manual    0%         0%       
================================================================================
====================           End of ROCm SMI Log          ====================

$ /opt/rocm/bin/rocm-smi --setoverdrive 10 -d 0

====================    ROCm System Management Interface    ====================

          ******WARNING******

          Operating your AMD GPU outside of official AMD specifications or outside of
          factory settings, including but not limited to the conducting of overclocking
          (including use of this overclocking software, even if such software has been
          directly or indirectly provided by AMD or otherwise affiliated in any way
          with AMD), may cause damage to your AMD GPU, system components and/or result
          in system failure, as well as cause other problems. DAMAGES CAUSED BY USE OF
          YOUR AMD GPU OUTSIDE OF OFFICIAL AMD SPECIFICATIONS OR OUTSIDE OF FACTORY
          SETTINGS ARE NOT COVERED UNDER ANY AMD PRODUCT WARRANTY AND MAY NOT BE COVERED
          BY YOUR BOARD OR SYSTEM MANUFACTURER'S WARRANTY. Please use this utility with caution.

Do you accept these terms? [y/N] y
Traceback (most recent call last):
  File "/opt/rocm/bin/rocm-smi", line 1160, in <module>
    setClockOverDrive(deviceList, 'gpu', args.setoverdrive, args.autorespond)
  File "/opt/rocm/bin/rocm-smi", line 795, in setClockOverDrive
    confirmOverDrive(autoRespond)
  File "/opt/rocm/bin/rocm-smi", line 179, in confirmOverDrive
    user_input = input('Do you accept these terms? [y/N] ')
  File "<string>", line 1, in <module>
NameError: name 'y' is not defined

Because of the error, I cannot overdrive the GPU. echo 10 > /sys/class/drm/card0/device/pp_sclk_od doesn't work either.

jlgreathouse commented 6 years ago

This appears to be an incompatibility between Python 2 and Python 3: the script calls input(), which behaves differently in the two versions.

Could you modify your version of rocm-smi to change the top line from: #!/usr/bin/env python to #!/usr/bin/env python3 ?

In addition, to get this working in ROCm 1.8.x, you will likely need to change line 805 from if clktype == 'mem': to elif clktype == 'mem':

I believe the latter problem will be fixed in ROCm 1.9.x. I'll report the python 2/3 thing internally if my recommendation fixes this for you.
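For anyone reading along, here is a minimal sketch of why the prompt fails and one way a script could read the answer under either interpreter. On Python 2, input() evaluates the typed text as an expression, so answering y becomes a lookup of a variable named y, which matches the File "<string>" frame in the traceback above. The confirm function and its reader parameter are hypothetical names used for illustration; this is not the actual confirmOverDrive code from rocm-smi.

```python
import sys

# On Python 2, the built-in input() evaluates what the user types as an
# expression, so typing y raises NameError unless a variable y exists.
# raw_input() returns the typed text unchanged, as input() does on Python 3.
if sys.version_info[0] < 3:
    read_line = raw_input  # noqa: F821 (this name only exists on Python 2)
else:
    read_line = input


def confirm(prompt='Do you accept these terms? [y/N] ', reader=read_line):
    """Return True only if the answer is y or yes (case-insensitive).

    Hypothetical helper for illustration; reader is injectable so the
    logic can be exercised without a terminal.
    """
    answer = reader(prompt).strip().lower()
    return answer in ('y', 'yes')
```

Changing the shebang to python3, as suggested above, avoids the problem without touching the prompt code; a selection like read_line is only needed if the script must keep running under both interpreters.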

kentrussell commented 6 years ago

Thanks, I hadn't tested it on python2, so I'll see if we can add that to our test coverage. I'm working through some ideas to make sure it's completely covered so we don't have situations like this again.

kentrussell commented 6 years ago

I did just test python2 and python3 and both worked, so this should be covered in the 1.9 release (I made a patch to make the SMI python2 compatible, but it hasn't been released yet). Can you try out the krussell/fixes branch? It's not an official release, but it includes the fixes, so if that works for you, we can stick with that until 1.9 officially drops.

calvintam236 commented 6 years ago

@kentrussell The master branch version does not have this error. The one I have at /opt/rocm/bin/ is still broken. (Per dpkg - rocm-smi is v1.0.0-46-g81ef66f from Radeon repo)

Any chance of adding an option to rocm-smi that shows its release version natively? It looks like there isn't one you can call from the command line.

kentrussell commented 6 years ago

We never really introduced versioning to the script, since it really is just a massive Python script and I never intended to worry about versioning when I made it. I'll see if it's something worth looking into going forward. I should at least be able to change the version reported by dpkg, but I never thought about versioning the SMI itself, since I never thought that the versions would mean anything.
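As a rough sketch of what the request above could look like, Python's argparse has a built-in version action that prints a string and exits. Everything here is an assumption for illustration: the thread does not show how rocm-smi builds its argument parser, and SMI_VERSION, build_parser and the rocm-smi-example program name are hypothetical.

```python
import argparse

# Placeholder value: the thread does not say what version string the
# script should report.
SMI_VERSION = '0.0.0-example'


def build_parser():
    """Build a small parser with a --version flag.

    Hypothetical example; this is not rocm-smi's real argument parser.
    """
    parser = argparse.ArgumentParser(prog='rocm-smi-example')
    # action='version' prints the version string and exits with status 0.
    parser.add_argument('--version', action='version',
                        version='%(prog)s ' + SMI_VERSION)
    return parser
```

Calling build_parser().parse_args(['--version']) prints the program name followed by the placeholder version and then exits.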

kentrussell commented 6 years ago

Closing this since the issue is resolved in the krussell/fixes branch, and the fix will be included in the 1.9 release being distributed tomorrow (you can use the krussell/fixes or roc-1.9.x branch if you want to just copy it in yourself to save a day of waiting).