Umio-Yasuno / libdrm-amdgpu-sys-rs

libdrm_amdgpu bindings for Rust, and some methods ported from Mesa3D
MIT License

Throttle status almost always shows TEMP_HOTSPOT [Navi31] #7

Open zenofile opened 4 months ago

zenofile commented 4 months ago

I'm not sure if this belongs here or if this is an issue with the kernel (6.8.0-0.rc6) or firmware, but gpu_metrics (and LACT) nearly always include TEMP_HOTSPOT in the throttle status, so the GPU is reported as throttled regardless of the displayed hotspot temperature (which is fine).

It doesn't seem to matter whether the GPU is idle or under full load; the only difference is that TEMP_HOTSPOT sometimes disappears for a split second when idling.

V1_3(
    gpu_metrics_v1_3 {
        common_header: metrics_table_header {
            structure_size: 120,
            format_revision: 1,
            content_revision: 3,
        },
        temperature_edge: 38,
        temperature_hotspot: 41,
        temperature_mem: 57,
        temperature_vrgfx: 38,
        temperature_vrsoc: 39,
        temperature_vrmem: 39,
        average_gfx_activity: 0,
        average_umc_activity: 1,
        average_mm_activity: 0,
        average_socket_power: 10,
        energy_accumulator: 0,
        system_clock_counter: 2748118101619,
        average_gfxclk_frequency: 5,
        average_socclk_frequency: 65535,
        average_uclk_frequency: 29,
        average_vclk0_frequency: 25,
        average_dclk0_frequency: 25,
        average_vclk1_frequency: 25,
        average_dclk1_frequency: 25,
        current_gfxclk: 5,
        current_socclk: 750,
        current_uclk: 96,
        current_vclk0: 25,
        current_dclk0: 25,
        current_vclk1: 25,
        current_dclk1: 25,
        throttle_status: 2,
        current_fan_speed: 0,
        pcie_link_width: 16,
        pcie_link_speed: 25,
        padding: 65535,
        gfx_activity_acc: 4294967295,
        mem_activity_acc: 4294967295,
        temperature_hbm: [
            65535,
            65535,
            65535,
            65535,
        ],
        firmware_timestamp: 18446744073709551615,
        voltage_soc: 707,
        voltage_gfx: 176,
        voltage_mem: 658,
        padding1: 65535,
        indep_throttle_status: 68719476736,
    },
)
Average Socket Power: 10 W
Throttle Status: [TEMP_HOTSPOT]
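
The raw masks in the dump can be checked by hand. Below is a minimal Rust sketch that lists which bits are set in `throttle_status` and `indep_throttle_status`. The helper `set_bits` is hypothetical and not part of libdrm-amdgpu-sys-rs. Reading bit 36 of `indep_throttle_status` as TEMP_HOTSPOT is an assumption, inferred from the decoded output above showing only that flag.

```rust
/// Hypothetical helper: return the indices of all set bits in a 64-bit mask.
fn set_bits(mask: u64) -> Vec<u32> {
    (0..64).filter(|&bit| (mask >> bit) & 1 == 1).collect()
}

fn main() {
    // Values copied from the gpu_metrics_v1_3 dump above.
    let throttle_status: u32 = 2;
    let indep_throttle_status: u64 = 68_719_476_736;

    // ASIC-dependent mask: only bit 1 is set.
    println!("throttle_status bits: {:?}", set_bits(u64::from(throttle_status)));

    // ASIC-independent mask: only bit 36 is set (68719476736 == 1 << 36).
    // Assumption: bit 36 is the TEMP_HOTSPOT flag, as the decoded output suggests.
    println!("indep_throttle_status bits: {:?}", set_bits(indep_throttle_status));
}
```

Each mask has exactly one bit set, which matches the single `[TEMP_HOTSPOT]` entry reported even though the hotspot temperature reads 41.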

AMD Radeon RX 7900 XTX
Family:     GC 11.0.0
ASIC Name:  GFX1100/Navi31
Chip class: GFX11
GFX ID:     gfx1100
GPU Type:   dGPU
gfx_target_version: gfx1100

Umio-Yasuno commented 4 months ago

I think that is an issue on the part of the AMDGPU driver or firmware.
Please report to drm/amd.

https://gitlab.freedesktop.org/drm/amd/-/issues

Related: https://github.com/flightlessmango/MangoHud/issues/1243
Related: https://gitlab.freedesktop.org/drm/amd/-/issues/2720

zenofile commented 4 months ago

Thank you. I reported it on the drm/amd issue tracker: https://gitlab.freedesktop.org/drm/amd/-/issues/3251