erpalma / throttled

Workaround for Intel throttling issues in Linux.
MIT License

Fix for Alder Lake and Tiger Lake compatibility #308

Closed: lakotamm closed this 2 years ago

lakotamm commented 2 years ago

This labels Alder Lake as supported in the code and makes it pass the initial check, so that we can test the support.

From my initial testing it is possible to decrease the TDP level, but possibly not increase the TDP.

#282

lakotamm commented 2 years ago

I found that MCHBAR PACKAGE_POWER_LIMIT is being written to address 0xFED159A0, i.e. 0xFED10000 (the MCHBAR base address) plus the offset 59A0h. That is correct for my 8th gen CPU. On my 12th gen CPU, however, sudo setpci -s 0:0.0 48.l returns 0xFEDC0001, so the MCHBAR base address is actually 0xFEDC0000 (bit 0 of that register is the MCHBAR enable flag, not part of the address).

So the correct address should actually be 0xFEDC59A0.
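The address arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the code from the commit: the function and constant names are hypothetical, and it assumes that bit 0 of the value read with setpci is the enable flag and that the remaining low bits are zero, as in the values quoted here.

```python
# Hypothetical helper, not part of throttled: derive the MMIO address of
# MCHBAR PACKAGE_POWER_LIMIT from the raw value of PCI config register 0x48.

MCHBAR_ENABLE_BIT = 0x1              # assumption: bit 0 is the enable flag
PACKAGE_POWER_LIMIT_OFFSET = 0x59A0  # offset quoted in this thread


def mchbar_register_address(raw_bar: int,
                            offset: int = PACKAGE_POWER_LIMIT_OFFSET) -> int:
    """Strip the enable bit from the raw BAR value and add the register offset."""
    base = raw_bar & ~MCHBAR_ENABLE_BIT
    return base + offset


# 12th gen value reported above by `setpci -s 0:0.0 48.l`
print(hex(mchbar_register_address(0xFEDC0001)))  # 0xfedc59a0
# Same calculation for a base of 0xFED10000 with the enable bit set
print(hex(mchbar_register_address(0xFED10001)))  # 0xfed159a0
```

Reading the base at runtime instead of hard-coding it would avoid maintaining a per-generation constant, at the cost of depending on setpci or an equivalent PCI config read.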

On top of that, it would be good to check whether we are writing the correct data into bit 16 of the register, since its function changed in 11th and 12th gen CPUs compared to 10th gen (but maybe you have already done that).

Changing the address fixes the issue on my i7-12700H. I created a commit https://github.com/erpalma/throttled/commit/2629c786a6f4b17a26dfb6d17b8bf3d69e541937 where I temporarily implemented the fix so that we can properly test it. Since the address is currently just a fixed number, for safety I disabled support for all CPUs other than Alder Lake.

Since my knowledge of Python is very limited, I would need someone's help to properly implement selecting the address per CPU family.

Datasheets: 10th gen, 11th gen, 12th gen

lakotamm commented 2 years ago

The MCHBAR address for Tiger Lake should be the same as for Alder Lake according to https://github.com/erpalma/throttled/issues/255#issuecomment-1204060567

Therefore I enabled testing for it in the last commit.
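One possible shape for the per-family address selection asked for above is a small lookup table. The sketch below is an assumption, not what was merged: the dictionary, the family-name keys, and the fallback to 0xFED10000 for unlisted families are all hypothetical; only the two base addresses and the 59A0h offset come from this thread.

```python
# Hypothetical per-family MCHBAR base lookup (names are illustrative only).

LEGACY_MCHBAR_BASE = 0xFED10000  # reported correct for an 8th gen CPU
NEWER_MCHBAR_BASE = 0xFEDC0000   # reported for Alder Lake and Tiger Lake
PACKAGE_POWER_LIMIT_OFFSET = 0x59A0

# Assumption: families are identified by a plain string key.
MCHBAR_BASE_BY_FAMILY = {
    "Tiger Lake": NEWER_MCHBAR_BASE,
    "Alder Lake": NEWER_MCHBAR_BASE,
}


def package_power_limit_address(family: str) -> int:
    """Return the MMIO address of PACKAGE_POWER_LIMIT for a CPU family."""
    # Assumption: families not listed keep the previously hard-coded base.
    base = MCHBAR_BASE_BY_FAMILY.get(family, LEGACY_MCHBAR_BASE)
    return base + PACKAGE_POWER_LIMIT_OFFSET


print(hex(package_power_limit_address("Alder Lake")))  # 0xfedc59a0
print(hex(package_power_limit_address("Kaby Lake")))   # 0xfed159a0
```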

phlb commented 2 years ago

I've tested this patch on Tiger Lake and it seems to work. The power limit no longer kicks in after a few seconds of stress testing.

drvenabili commented 2 years ago

@phlb Any chance you could share your etc/throttled.conf? I tried the patch with temperatures I consider reasonable (see below), and a few seconds into an s-tui stress test I still get power limited. This is on AC (through a dock). The trip temperature is set to 75, but the CPU never goes above 70, somehow.

Thanks.

## Settings to apply while connected to AC power
[AC]
# Update the registers every this many seconds
Update_Rate_s: 5
# Max package power for time window #1
PL1_Tdp_W: 12
# Time window #1 duration
PL1_Duration_s: 28
# Max package power for time window #2
PL2_Tdp_W: 22
# Time window #2 duration
PL2_Duration_S: 0.002
# Max allowed temperature before throttling
Trip_Temp_C: 75

phlb commented 2 years ago

@faustusdotbe I'm using the default config etc/throttled.conf from 532026d8994ffc70352ecc9b056e9f89c97ca0fd. The CPU frequency still drops because of temperature, but power is no longer capped at 15 W.

It seems to resolve #255

Debug log:

[D] TEMPERATURE_TARGET - write 0x5 - read 0x0 - match ERR
[D] CONFIG_TDP_CONTROL - write 0x0 - read 0x0 - match OK
[D] MSR PACKAGE_POWER_LIMIT - write 0x42816000dd8160 - read 0x42816000dd8160 - match OK
[D] MCHBAR PACKAGE_POWER_LIMIT - write 0x42816000dd8160 - read 0x42816000dd8160 - match OK
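
To make the value in the log easier to read, here is a rough decoder for the 64-bit PACKAGE_POWER_LIMIT word. It is a sketch under stated assumptions: the field layout follows the usual RAPL layout (power in bits 14:0, enable in bit 15, time window in bits 23:17, PL2 in the upper 32 bits), and the power unit of 1/8 W and time unit of 1/1024 s are assumed rather than read from the hardware. Bit 16 is reported without interpretation, since its meaning is the open question raised earlier in this thread. All names are hypothetical.

```python
# Hypothetical decoder for a PACKAGE_POWER_LIMIT value (not throttled code).

POWER_UNIT_W = 0.125    # assumption: 1/8 W per count
TIME_UNIT_S = 1 / 1024  # assumption: roughly 976 us per count


def decode_limit(field: int) -> dict:
    """Decode one 32-bit half (PL1 or PL2) of the register."""
    y = (field >> 17) & 0x1F  # time window exponent, bits 21:17
    z = (field >> 22) & 0x3   # time window fraction, bits 23:22
    return {
        "power_w": (field & 0x7FFF) * POWER_UNIT_W,
        "enabled": bool(field & (1 << 15)),
        "bit16": bool(field & (1 << 16)),  # meaning varies by generation
        "window_s": (2 ** y) * (1 + z / 4) * TIME_UNIT_S,
    }


def decode_package_power_limit(value: int):
    """Split the 64-bit value into its PL1 (low) and PL2 (high) halves."""
    return decode_limit(value & 0xFFFFFFFF), decode_limit(value >> 32)


pl1, pl2 = decode_package_power_limit(0x42816000DD8160)
print(pl1)  # power_w 44.0, enabled True, bit16 True, window_s 28.0
print(pl2)  # power_w 44.0, enabled True, bit16 False, window_s ~0.00244
```

Under these assumptions the decoded limits (44 W for both PL1 and PL2, with windows of about 28 s and 2.44 ms) agree with the intel-rapl constraints reported further down in this comment.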

Dell BIOS version:

          description: BIOS
          vendor: Dell Inc.
          physical id: 1
          version: 3.8.0
          date: 06/08/2022

Kernel:

Linux version 5.18.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-3) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.50.20220615) #1 SMP PREEMPT_DYNAMIC Debian 5.18.5-1 (2022-06-16)
Command line: BOOT_IMAGE=/vmlinuz-5.18.0-2-amd64 root=<...> ro quiet

Secure boot:

# mokutil --sb-state
SecureBoot disabled

intel-rapl:

# powercap-info -p intel-rapl
enabled: 1
Zone 0
  name: package-0
  enabled: 1
  max_energy_range_uj: 262143328850
  energy_uj: 77859605854
  Constraint 0
    name: long_term
    power_limit_uw: 44000000
    time_window_us: 27983872
    max_power_uw: 28000000
  Constraint 1
    name: short_term
    power_limit_uw: 44000000
    time_window_us: 2440
    max_power_uw: 0
  Constraint 2
    name: peak_power
    power_limit_uw: 105000000
    time_window_us: 0
    max_power_uw: 0
  Zone 0:0
    name: core
    enabled: 0
    max_energy_range_uj: 262143328850
    energy_uj: 26401738319
    Constraint 0
      name: long_term
      power_limit_uw: 0
      time_window_us: 976
  Zone 0:1
    name: uncore
    enabled: 0
    max_energy_range_uj: 262143328850
    energy_uj: 771791847
    Constraint 0
      name: long_term
      power_limit_uw: 0
      time_window_us: 976
Zone 1
  name: psys
  enabled: 0
  max_energy_range_uj: 262143328850
  energy_uj: 44731534962
  Constraint 0
    name: long_term
    power_limit_uw: 0
    time_window_us: 27983872
  Constraint 1
    name: short_term
    power_limit_uw: 97000000
    time_window_us: 976

drvenabili commented 2 years ago

Thanks @phlb

I can confirm this now works here on Tiger Lake (11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz). Launching a stress test with s-tui no longer makes the CPU throttle.

FYI @whoenig, you might be interested.

lakotamm commented 2 years ago

I noticed that Alder Lake S has a different model and stepping number (#297), so I added it. This makes me wonder whether the U series also has different identifiers.
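
For anyone who wants to report the identifiers of their own CPU variant, the model and stepping can be read from /proc/cpuinfo. The snippet below is a minimal sketch with a hypothetical function name; the sample text and its numbers are made up for illustration and do not describe any particular CPU.

```python
# Hypothetical helper: extract model and stepping from /proc/cpuinfo text.
from typing import Tuple


def parse_model_and_stepping(cpuinfo_text: str) -> Tuple[int, int]:
    """Return (model, stepping) of the first CPU listed in the text."""
    model = stepping = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        # "model name" is a different key and is skipped by the exact match.
        if key == "model" and model is None:
            model = int(value)
        elif key == "stepping" and stepping is None:
            stepping = int(value)
    if model is None or stepping is None:
        raise ValueError("model or stepping not found in cpuinfo text")
    return model, stepping


# Made-up sample in the /proc/cpuinfo layout (values are illustrative only).
SAMPLE = "cpu family\t: 6\nmodel\t\t: 154\nmodel name\t: Example CPU\nstepping\t: 3\n"
print(parse_model_and_stepping(SAMPLE))  # (154, 3)
```

On a real system the text would come from reading /proc/cpuinfo, for example with open("/proc/cpuinfo").read().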

lakotamm commented 2 years ago

@erpalma included all the suggestions in the latest version, so I am closing this pull request.

I will add a few more new Alder Lake CPUs in #310.