Ricks-Lab / gpu-utils

A set of utilities for monitoring and customizing GPU performance
GNU General Public License v3.0

Can not read parameter and list index out of range #124

Closed RobisLV closed 2 years ago

RobisLV commented 2 years ago

I am trying to run gpu-ls but I get this error:

Detected GPUs: AMD: 1
AMD: Wattman features enabled: 0xfffd7fff
Warning: Can not read parameter: loading, disabling for this GPU: 0
Warning: Can not read parameter: mem_loading, disabling for this GPU: 0
Traceback (most recent call last):
  File "./gpu-ls", line 154, in <module>
    main()
  File "./gpu-ls", line 125, in main
    gpu_list.read_gpu_sensor_set(data_type=Gpu.GpuItem.SensorSet.All)
  File "/home/stratos/.local/lib/python3.8/site-packages/GPUmodules/GPUmodule.py", line 2164, in read_gpu_sensor_set
    gpu.read_gpu_sensor_set(data_type)
  File "/home/stratos/.local/lib/python3.8/site-packages/GPUmodules/GPUmodule.py", line 1277, in read_gpu_sensor_set
    return self.read_gpu_sensor_set_amd(data_type)
  File "/home/stratos/.local/lib/python3.8/site-packages/GPUmodules/GPUmodule.py", line 1416, in read_gpu_sensor_set_amd
    self.set_params_value(param, rdata)
  File "/home/stratos/.local/lib/python3.8/site-packages/GPUmodules/GPUmodule.py", line 592, in set_params_value
    self.sclk_dpm_state.update({ps_key: sclk_ps[1]})
IndexError: list index out of range

My grub file looks like this:

# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.cik_support=1 amdgpu.si_support=1 quiet splash amdgpu.ppfeaturemask=0xfffd7fff"
GRUB_CMDLINE_LINUX=""

Ricks-Lab commented 2 years ago

What version of gpu-utils are you using? It looks like you may be running the default version from the Debian apt installation. That version is very old, and I am working with the package developer to update it to the latest. I suggest removing the version you have and installing from PyPI or rickslab.com as described in the project readme. I will announce when the Debian package is back in sync with the project.

RobisLV commented 2 years ago

Thanks for the quick reply! I installed it via pip3 with Python 3.8. The gpu-ls --about command says Version: 3.6.1.

Ricks-Lab commented 2 years ago

I have pushed a commit that handles the case where an invalid p-state is found, and I modified logging to collect more information on the problem. Please remove your current version and do a repository install to run this test version. See the readme for details on running the repository version. Then run ./gpu-ls --debug and post the log file here. Thanks!
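
For anyone hitting the same traceback: the IndexError comes from indexing into a split p-state line that has fewer fields than expected. The sketch below shows one way such a line could be parsed defensively. It is not the code from the commit; the function name `parse_dpm_states` and the assumed line format (`0: 300Mhz *`, with `*` marking the active state) are illustrative assumptions.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed format of one p-state line: "<index>: <frequency> [*]".
# The trailing "*" is taken to mark the currently active state.
_PSTATE_RE = re.compile(r"^\s*(\d+):\s*(\S+)\s*(\*)?\s*$")


def parse_dpm_states(text: str) -> Tuple[Dict[int, str], Optional[int]]:
    """Hypothetical helper: parse p-state lines, skipping malformed ones.

    Returns a dict of {state index: frequency string} and the index of
    the active state (None if no line is marked active).
    """
    states: Dict[int, str] = {}
    active: Optional[int] = None
    for line in text.splitlines():
        match = _PSTATE_RE.match(line)
        if match is None:
            # Empty or truncated line: ignore it instead of indexing
            # into a short list and raising IndexError.
            continue
        index = int(match.group(1))
        states[index] = match.group(2)
        if match.group(3):
            active = index
    return states, active
```

With this approach, a card that reports an empty or partial p-state table yields an empty result rather than a crash, which matches the empty `['', '']` p-state fields shown in the later output.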

RobisLV commented 2 years ago

Thanks! Looks like that did the trick.

Detected GPUs: AMD: 1
AMD: Wattman features enabled: 0xfffd7fff
Warning: Can not read parameter: loading, disabling for this GPU: 0
Warning: Can not read parameter: mem_loading, disabling for this GPU: 0
Warning: Can not read parameter: power_cap_range, disabling for this GPU: 0
Warning: Can not read parameter: power, disabling for this GPU: 0
Warning: Can not read parameter: power_cap, disabling for this GPU: 0
Warning: Can not read parameter: fan_speed_range, disabling for this GPU: 0
1 total GPUs, 0 rw, 1 r-only, 0 w-only

Card Number: 0
   Vendor: AMD
   Readable: True
   Writable: False
   Compute: False
   GPU UID: None
   Device ID: {'device': '0x6819', 'subsystem_device': '0x2553', 'subsystem_vendor': '0x1458', 'vendor': '0x1002'}
   Decoded Device ID: Pitcairn PRO [Radeon HD 7850 / R7 265 / R9 270 1024SP]
   Card Model: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn PRO [Radeon HD 7850 / R7 265 / R9 270 1024SP]
   Display Card Model:  Pitcairn PRO HD
   PCIe ID: 01:00.0
      Link Speed: 8 GT/s
      Link Width: 16
   ##################################################
   Driver: radeon, amdgpu
   vBIOS Version: 113-xxx-xxx
   Compute Platform: None
   GPU Type: Modern
   HWmon: /sys/class/drm/card0/device/hwmon/hwmon1
   Card Path: /sys/class/drm/card0/device
   System Card Path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
   ##################################################
   Current Power (W): None
   Power Cap (W): None
      Power Cap Range (W): [None, None]
   Fan Enable: 1
   Fan PWM Mode: [0, 'None']
   Fan Target Speed (rpm): 0
   Current Fan Speed (rpm): 0
   Current Fan PWM (%): 40
      Fan Speed Range (rpm): [None, None]
      Fan PWM Range (%): [0, 100]
   ##################################################
   Current GPU Loading (%): None
   Current Memory Loading (%): None
   Current GTT Memory Usage (%): 2.462
      Current GTT Memory Used (GB): 0.074
      Total GTT Memory (GB): 3.000
   Current VRAM Usage (%): 19.386
      Current VRAM Used (GB): 0.388
      Total VRAM (GB): 2.000
   Current  Temps (C): {'edge': 36.0}
   Critical Temps (C): {'edge': 120.0}
   Current Voltages (V): {}
   Current Clk Frequencies (MHz): {'mclk': 1200.0, 'sclk': 300.0}
   Current SCLK P-State: ['', '']
   Current MCLK P-State: ['', '']
   Power Profile Mode: None
   Power DPM Force Performance Level: auto

Edit: The card does not appear to be writable, though. Am I still missing something?

Ricks-Lab commented 2 years ago

Older GPUs like Pitcairn have limited Linux driver support. I think the earliest AMD GPUs that support writing are Fiji-based.
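
A quick way to check a given card yourself is to see whether the amdgpu overdrive file `pp_od_clk_voltage` exists under the card path. This is a minimal sketch: the helper name `has_overdrive_interface` is hypothetical, and whether gpu-utils uses exactly this test to decide "Writable" is an assumption. The path is the Card Path from the output above.

```python
import os


def has_overdrive_interface(card_path: str) -> bool:
    """Hypothetical helper: True if the overdrive sysfs file is present.

    The amdgpu driver exposes pp_od_clk_voltage only on GPUs where
    clock/voltage overdrive is supported and enabled.
    """
    return os.path.isfile(os.path.join(card_path, "pp_od_clk_voltage"))


if __name__ == "__main__":
    # Card Path as reported by gpu-ls in this thread.
    path = "/sys/class/drm/card0/device"
    print(path, "exposes pp_od_clk_voltage:", has_overdrive_interface(path))
```

If the file is missing, the driver is not offering a write interface for that card, and no kernel parameter setting on the gpu-utils side will change that.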