Ricks-Lab / gpu-utils

A set of utilities for monitoring and customizing GPU performance
GNU General Public License v3.0

Feature Request: Support values from sysfs gpu_metrics #122

Open leuc opened 2 years ago

leuc commented 2 years ago

The amdgpu kernel driver exports a range of GPU metrics via sysfs as a binary file at /sys/devices/pci***/gpu_metrics

I wrote a Python script, amdgpu_metrics.py, to decode the binary data. The code should be fairly easy to integrate with gpu-utils. I am not going to write and maintain a proper Python library, but the script should help with initial testing and implementation.
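For anyone who wants to poke at the file before trying the full script, here is a minimal sketch of reading just the table header. It is not taken from amdgpu_metrics.py. The field names match the MetricsTableHeader output below, but the field widths (uint16, uint8, uint8, little-endian) are my assumption and are not stated in this issue, and the function names are hypothetical.

```python
import struct

# Assumed layout (not confirmed in this issue):
#   structure_size   uint16
#   format_revision  uint8
#   content_revision uint8
HEADER_FORMAT = "<HBB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 bytes under this assumption


def parse_metrics_header(blob: bytes) -> dict:
    """Decode the leading header fields from a gpu_metrics blob."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("blob is shorter than the assumed header size")
    size, format_rev, content_rev = struct.unpack_from(HEADER_FORMAT, blob, 0)
    return {
        "structure_size": size,
        "format_revision": format_rev,
        "content_revision": content_rev,
    }


def read_gpu_metrics(path: str) -> bytes:
    """Read the raw binary contents of a gpu_metrics sysfs file."""
    with open(path, "rb") as f:
        return f.read()
```

Calling parse_metrics_header(read_gpu_metrics(path)) with the full path of a card's gpu_metrics file gives the two revision numbers, which is what you need to pick the right table layout (GpuMetrics_v1_3 in the example below).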

It has only been tested against 2 cards so far, so please give it a try and comment on the gist.

Example output with Navi 10 on kernel 5.15-rc7:

/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/0000:02:00.0/0000:03:00.0/gpu_metrics
> MetricsTableHeader
structure_size: 120
format_revision: 1
content_revision: 3
> GpuMetrics_v1_3
temperature_edge: 53
temperature_hotspot: 53
temperature_mem: 58
temperature_vrgfx: 0
temperature_vrsoc: 0
temperature_vrmem: 0
average_gfx_activity: 0
average_umc_activity: 0
average_mm_activity: 65535
average_socket_power: 12
energy_accumulator: 6294297477348589567
system_clock_counter: 142426363985407644
average_gfxclk_frequency: 100
average_socclk_frequency: 65535
average_uclk_frequency: 65535
average_vclk0_frequency: 65535
average_dclk0_frequency: 65535
average_vclk1_frequency: 300
average_dclk1_frequency: 506
current_gfxclk: 100
current_socclk: 1266
current_uclk: 1085
current_vclk0: 65535
current_dclk0: 65535
current_vclk1: 0
current_dclk1: 0
throttle_status: 589823
current_fan_speed: 50
pcie_link_width: 65535
pcie_link_speed: 65535
padding: 65535
gfx_activity_acc: 4294967295
mem_activity_acc: 4294967295
temperature_hbm: 65535
firmware_timestamp: 18446744073709551615
voltage_soc: 65535
voltage_gfx: 65535
voltage_mem: 0
padding1: 0
ThrottleStatus Bitmask
524288|32768|16384|8192|4096|
EDC_GFX|EDC_CPU|PROCHOT_GFX|PROCHOT_CPU|TDC_SOC|TDC_VDD|THM_SOC|THM_GFX|THM_CORE|SPPT_APU|SPPT|FPPT
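To make the ThrottleStatus lines above easier to read, here is a small sketch of how throttle_status can be split into its individual set bits. It is not the code from amdgpu_metrics.py. The cut-off at bit 12 between named and unnamed flags is an assumption read off the output above (12 names are printed), and set_bits is a hypothetical helper.

```python
from typing import List


def set_bits(value: int) -> List[int]:
    """Return the value of every set bit in `value`, highest first."""
    bits = []
    bit = 1
    while bit <= value:
        if value & bit:
            bits.append(bit)
        bit <<= 1
    return bits[::-1]


status = 589823  # throttle_status from the example output
bits = set_bits(status)

# Assumption: only the low 12 bits have names in the example output.
NAMED_BIT_COUNT = 12
unnamed = [b for b in bits if b >= (1 << NAMED_BIT_COUNT)]
named = [b for b in bits if b < (1 << NAMED_BIT_COUNT)]

print("|".join(str(b) for b in unnamed))  # 524288|32768|16384|8192|4096
print(len(named))                         # 12
```

With this value, all 12 low bits are set, which lines up with the 12 flag names printed above, and the remaining five bits are printed as raw numbers. Which name belongs to which bit is not spelled out here, so the sketch deliberately leaves that mapping out.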