Closed PorcelainMouse closed 10 months ago
Oops. 'b', not 'd'.
I would like to better understand the error so that I can improve error checking. Are you saying you incorrectly entered the feature mask with a b instead of a d?
I was able to duplicate this issue and am working on improved handling of the issue. Thanks for reporting your observation!
I have this issue (or very similar) as well, not sure what to do.
qwerty@qwerty-asus-g14:~$ gpu-ls --debug
Ubuntu: Validated
Detected GPUs: NVIDIA: 1, AMD: 1
AMD: amdgpu/rocm version: UNKNOWN
AMD: Wattman features not enabled: 0xfff7bfff, See README file.
### read_time_val: 13-Nov-2023 06:01:19
model_display: True: GA106M GeForce RTX
loading: True: None
mem_loading: True: None
mem_vram_usage: True: None
mem_gtt_usage: True: None
power: True: None
power_cap: True: None
energy: True: 0.0
temp_val: True: None
vddgfx_val: True: nan
fan_pwm: True: None
sclk_f_val: True: None
sclk_ps_val: True:
mclk_f_val: True: None
mclk_ps_val: True:
ppm: True:
### read_time_val: 13-Nov-2023 06:01:19
model_display: True: Cezanne Vega Series
loading: True: None
mem_loading: True: None
mem_vram_usage: True: None
mem_gtt_usage: True: None
power: True: 21.0
power_cap: True: None
energy: True: 1e-06
temp_val: True: 53.0
vddgfx_val: True: 1400
fan_pwm: True: None
sclk_f_val: True: 400Mhz
sclk_ps_val: True: 1
mclk_f_val: True: 1600Mhz
mclk_ps_val: True: 3
ppm: True:
Total of 2 GPUs: 0 are rw, 1 is r-only, and 0 are w-only
Traceback (most recent call last):
  File "/usr/bin/gpu-ls", line 174, in <module>
    main()
  File "/usr/bin/gpu-ls", line 149, in main
    gpu_list.read_gpu_pstates()
  File "/usr/lib/python3/dist-packages/GPUmodules/GPUmodule.py", line 2503, in read_gpu_pstates
    gpu.read_gpu_pstates()
  File "/usr/lib/python3/dist-packages/GPUmodules/GPUmodule.py", line 1236, in read_gpu_pstates
    lineitems[0] = int(re.sub(':', '', lineitems[0]))
                                       ~~~~~~~~~^^^
IndexError: list index out of range
@qwertychouskie
Your issue is slightly different from the original report. Are you running the latest release on PyPI? I fixed a similar problem in that release. Can you uninstall the version you are using and install the latest from PyPI? Also, a copy of the debug logfile would be useful. Thanks!
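For anyone hitting the same IndexError: the traceback points at `lineitems[0]` being read from an empty list, which is what a blank or whitespace-only line produces when split. Below is a minimal sketch of a more defensive parse. It is not the fix that went into the repository; the function name `parse_pstate_line` and the sample line format `1: 400Mhz *` are assumptions made for illustration.

```python
def parse_pstate_line(line):
    """Parse one p-state line into (index, remaining_fields).

    Hypothetical helper: returns None instead of raising when the line
    is blank or does not start with an integer index.
    """
    lineitems = line.split()
    if not lineitems:
        # Blank line: indexing lineitems[0] here would raise IndexError.
        return None
    try:
        # Same idea as the original code: drop the colon, then convert.
        index = int(lineitems[0].replace(':', ''))
    except ValueError:
        # First field is not a p-state index (e.g. a header line).
        return None
    return index, lineitems[1:]
```

A caller could then skip any line for which the helper returns None rather than aborting the whole listing.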
OK, I saw that the update should fix it, so I updated all the appropriate .py files, but got a weirder error:
qwerty@qwerty-asus-g14:~$ gpu-ls --debug
Ubuntu: Validated
Detected GPUs: NVIDIA: 1, AMD: 1
AMD: amdgpu/rocm version: UNKNOWN
AMD: Wattman features not enabled: 0xfff7bfff, See README file.
### read_time_val: 13-Nov-2023 06:20:02
model_display: True: GA106M GeForce RTX
loading: True: None
mem_loading: True: None
mem_vram_usage: True: None
mem_gtt_usage: True: None
power: True: None
power_cap: True: None
energy: True: 0.0
temp_val: True: None
vddgfx_val: True: nan
fan_pwm: True: None
sclk_f_val: True: None
sclk_ps_val: True:
mclk_f_val: True: None
mclk_ps_val: True:
ppm: True:
### read_time_val: 13-Nov-2023 06:20:03
model_display: True: Cezanne Vega Series
loading: True: None
mem_loading: True: None
mem_vram_usage: True: None
mem_gtt_usage: True: None
power: True: 22.0
power_cap: True: None
energy: True: 1e-06
temp_val: True: 61.0
vddgfx_val: True: 1387
fan_pwm: True: None
sclk_f_val: True: 400Mhz
sclk_ps_val: True: 1
mclk_f_val: True: 1600Mhz
mclk_ps_val: True: 3
ppm: True:
Total of 2 GPUs: 0 are rw, 1 is r-only, and 0 are w-only
Traceback (most recent call last):
  File "/usr/bin/gpu-ls", line 179, in <module>
    main()
  File "/usr/bin/gpu-ls", line 173, in main
    gpu_list.print()
  File "/usr/lib/python3/dist-packages/GPUmodules/GPUmodule.py", line 2590, in print
    else: gpu.print()
          ^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/GPUmodules/GPUmodule.py", line 1887, in print
    if param_name in GpuItem.amd_type_skip_lists[self.prm.gpu_type]:
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
KeyError: Supported
qwerty@qwerty-asus-g14:~$
Commenting out lines 1887 and 1888 seems to let the program run properly.
@qwertychouskie Thanks for your detailed feedback. I have made a change to the code in the repository. Let me know if this works. If no issue, I will push to PyPI.
Seems good on my end.
Might want to tag a new release on GitHub as well, so the fixes get picked up by distros.
I want to double-check my implementation of Enum objects as dict keys before I release for distro update. I will upload to PyPI for additional testing when that work is complete.
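As a rough illustration of the Enum-as-dict-key pattern mentioned here, and of how a missing key leads to an error like `KeyError: Supported`, here is a minimal sketch. The enum `GpuType`, its members, the table `SKIP_LISTS`, and the function `should_skip` are all hypothetical stand-ins and are not taken from GPUmodule.py.

```python
from enum import Enum


class GpuType(Enum):
    # Hypothetical members, for illustration only.
    Supported = 1
    Legacy = 2
    APU = 3


# Hypothetical skip table keyed by Enum members. Note that there is
# deliberately no entry for GpuType.Supported.
SKIP_LISTS = {
    GpuType.Legacy: ('power_cap',),
    GpuType.APU: ('fan_pwm', 'power_cap'),
}


def should_skip(param_name, gpu_type):
    """Return True if param_name is in the skip list for gpu_type.

    SKIP_LISTS[gpu_type] would raise KeyError for a type with no entry;
    .get() with an empty default treats that as "skip nothing".
    """
    return param_name in SKIP_LISTS.get(gpu_type, ())
```

Enum members are hashable, so they work as dict keys as long as the lookup uses the member itself and not its name string or value. The `.get()` default is one way to tolerate types that have no skip list.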
Confirmed fixed in v3.8.4.
Sorry, but I'm not quite sure what I'm seeing here. Previously, gpu-ls worked, but after enabling writing to the card (with the kernel parameter amdgpu.ppfeaturemask=... as directed), I get this error when running gpu-ls and gpu-pac.
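Since a single mistyped hex digit in the feature mask changes which bits are set, a quick check of the value can help. The sketch below assumes that the overdrive feature corresponds to bit 0x4000 of the amdgpu feature mask; that bit value and the helper name `overdrive_enabled` are assumptions for illustration, and this is not how gpu-ls itself performs the check.

```python
# Assumption: the overdrive (write-enable) feature is bit 0x4000 of the mask.
OVERDRIVE_BIT = 0x4000


def overdrive_enabled(mask_text):
    """Return True if the overdrive bit is set in a mask string.

    Accepts hex ("0xfff7bfff") or decimal text, with optional surrounding
    whitespace; base 0 lets int() pick the base from the prefix.
    """
    mask = int(mask_text.strip(), 0)
    return bool(mask & OVERDRIVE_BIT)


if __name__ == "__main__":
    # The value reported in the debug output in this thread.
    print(overdrive_enabled("0xfff7bfff"))
```

On a running system, the current value can usually be read from /sys/module/amdgpu/parameters/ppfeaturemask and passed to the helper as text.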