Closed calvintam236 closed 6 years ago
This appears to be an incompatibility between python 2 and python 3.
Could you modify your version of rocm-smi and change the top line? It currently reads:
#!/usr/bin/env python
and should become:
#!/usr/bin/env python3
In addition, to get this working in ROCm 1.8.x, you will likely need to change line 805 from:
if clktype == 'mem':
to:
elif clktype == 'mem':
I believe the latter problem will be fixed in ROCm 1.9.x. I'll report the python 2/3 thing internally if my recommendation fixes this for you.
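For anyone wondering why a single if/elif swap can matter, here is a minimal sketch of one way that kind of bug behaves. The function names, branch names, and return strings are hypothetical and are not taken from rocm-smi; I'm only assuming the check sits in a chain that ends with an else.

```python
# Hypothetical illustration only; this is not the rocm-smi source.

def set_clock_broken(clktype):
    """The second 'if' starts a new chain, so its 'else' also catches 'gpu'."""
    result = None
    if clktype == 'gpu':
        result = 'set gpu clock'
    if clktype == 'mem':  # bug: should be 'elif'
        result = 'set mem clock'
    else:
        # Runs for every clktype other than 'mem', including 'gpu',
        # and overwrites the result set by the first branch.
        result = 'error: invalid clock type'
    return result


def set_clock_fixed(clktype):
    """With 'elif', the 'else' only runs when no earlier branch matched."""
    if clktype == 'gpu':
        result = 'set gpu clock'
    elif clktype == 'mem':
        result = 'set mem clock'
    else:
        result = 'error: invalid clock type'
    return result
```

In the broken version a valid 'gpu' request falls through to the error branch, while 'mem' still works, which is the sort of symptom a one-word change to elif would clear up.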
Thanks, I hadn't tested it on python2, so I'll see if we can add that to our test coverage. I'm working through some ideas to make sure it's completely covered so we don't have situations like this again.
I did just test python2 and python3 and both worked, so this should be covered in the 1.9 release (I made a patch to make the SMI python2 compatible but it hasn't been released yet). Can you try out the krussell/fixes branch? It's not an official release but the fixes would be included inside, so if that fixes it, we can stick with that until 1.9 officially drops
@kentrussell The master branch version does not have this error. The one I have at /opt/rocm/bin/ is still broken. (Per dpkg, rocm-smi is v1.0.0-46-g81ef66f from the Radeon repo.)
Any chance of adding an option to rocm-smi to show its release version natively? It looks like there isn't one you can call from the command line.
We never really introduced versioning to the script, since it really is just a massive Python script and I never intended to worry about versioning when I made it. I'll see if it's something worth looking into going forward. I should at least be able to change the version reported by dpkg, but I never thought about versioning the SMI itself, since I never thought that the versions would mean anything.
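As a rough idea of what a version option could look like in a Python script, here is a minimal sketch using the standard argparse 'version' action. The program name, the SMI_VERSION constant, and its value are assumptions made up for illustration; rocm-smi does not define them as far as this thread shows.

```python
import argparse

# Hypothetical version string, for illustration only.
SMI_VERSION = '0.0.0-example'


def build_parser():
    """Build a parser whose --version flag prints the version and exits."""
    parser = argparse.ArgumentParser(prog='rocm-smi-example')
    parser.add_argument(
        '--version',
        action='version',
        version='%(prog)s ' + SMI_VERSION,
    )
    return parser


def main(argv=None):
    # argparse handles --version itself: it prints the string and exits with 0.
    build_parser().parse_args(argv)
```

Calling main(['--version']) prints the program name followed by the version string and then exits, so the version can be checked without going through dpkg.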
Closing this since the issue is resolved in the krussell/fixes branch, which will include the fix in the 1.9 release being distributed tomorrow (and you can use the krussell/fixes or roc-1.9.x branch if you want to just copy it in yourself to save a day of waiting)
ROCm 1.8.199, Ubuntu 18.04.1, RX470 x2, Ryzen 1700X
Because of the error, I cannot overdrive the GPU. Writing to sysfs directly doesn't work either:
echo 10 > /sys/class/drm/card0/device/pp_sclk_od