erpalma / throttled

Workaround for Intel throttling issues in Linux.
MIT License

Limited Power on XPS 9370 #77

Open marcolaux opened 5 years ago

marcolaux commented 5 years ago

Hello and good day,

I'm testing a little more at the moment with the XPS 9370 and the new --monitor feature.

It seems that power is limited most of the time when the charger is plugged in. I've tried both the official 45W charger and a more powerful XPS 15 charger with a USB-C adapter.

s-tui says it just consumes 11W of 18W max.

On battery the power value is OK because of the lower TDP (I guess).

Perhaps someone has a clue where I could look further.


erpalma commented 5 years ago

Can you please check with --debug if all the values are correctly written?

marcolaux commented 5 years ago

Thank you for the quick response.

It seems so - this is my output:

[D] core 0 thermal status: thermal limit status = 0
[D] core 0 thermal status: thermal limit log = 0
[D] core 0 thermal status: prochot or forcepr status = 0
[D] core 0 thermal status: prochot or forcepr log = 0
[D] core 0 thermal status: crit temp status = 0
[D] core 0 thermal status: crit temp log = 0
[D] core 0 thermal status: thermal threshold1 status = 0
[D] core 0 thermal status: thermal threshold1 log = 0
[D] core 0 thermal status: thermal threshold2 status = 0
[D] core 0 thermal status: thermal threshold2 log = 0
[D] core 0 thermal status: power limit status = 0
[D] core 0 thermal status: power limit log = 0
[D] core 0 thermal status: current limit status = 0
[D] core 0 thermal status: current limit log = 0
[D] core 0 thermal status: cross domain limit status = 0
[D] core 0 thermal status: cross domain limit log = 0
[D] core 0 thermal status: cpu temp = 59
[D] core 0 thermal status: temp resolution = 1
[D] core 0 thermal status: reading valid = 1
[D] core 1 thermal status: thermal limit status = 0
[D] core 1 thermal status: thermal limit log = 0
[D] core 1 thermal status: prochot or forcepr status = 0
[D] core 1 thermal status: prochot or forcepr log = 0
[D] core 1 thermal status: crit temp status = 0
[D] core 1 thermal status: crit temp log = 0
[D] core 1 thermal status: thermal threshold1 status = 0
[D] core 1 thermal status: thermal threshold1 log = 0
[D] core 1 thermal status: thermal threshold2 status = 0
[D] core 1 thermal status: thermal threshold2 log = 0
[D] core 1 thermal status: power limit status = 0
[D] core 1 thermal status: power limit log = 0
[D] core 1 thermal status: current limit status = 0
[D] core 1 thermal status: current limit log = 0
[D] core 1 thermal status: cross domain limit status = 0
[D] core 1 thermal status: cross domain limit log = 0
[D] core 1 thermal status: cpu temp = 57
[D] core 1 thermal status: temp resolution = 1
[D] core 1 thermal status: reading valid = 1
[D] core 2 thermal status: thermal limit status = 0
[D] core 2 thermal status: thermal limit log = 0
[D] core 2 thermal status: prochot or forcepr status = 0
[D] core 2 thermal status: prochot or forcepr log = 0
[D] core 2 thermal status: crit temp status = 0
[D] core 2 thermal status: crit temp log = 0
[D] core 2 thermal status: thermal threshold1 status = 0
[D] core 2 thermal status: thermal threshold1 log = 0
[D] core 2 thermal status: thermal threshold2 status = 0
[D] core 2 thermal status: thermal threshold2 log = 0
[D] core 2 thermal status: power limit status = 0
[D] core 2 thermal status: power limit log = 0
[D] core 2 thermal status: current limit status = 0
[D] core 2 thermal status: current limit log = 0
[D] core 2 thermal status: cross domain limit status = 0
[D] core 2 thermal status: cross domain limit log = 0
[D] core 2 thermal status: cpu temp = 59
[D] core 2 thermal status: temp resolution = 1
[D] core 2 thermal status: reading valid = 1
[D] core 3 thermal status: thermal limit status = 0
[D] core 3 thermal status: thermal limit log = 0
[D] core 3 thermal status: prochot or forcepr status = 0
[D] core 3 thermal status: prochot or forcepr log = 0
[D] core 3 thermal status: crit temp status = 0
[D] core 3 thermal status: crit temp log = 0
[D] core 3 thermal status: thermal threshold1 status = 0
[D] core 3 thermal status: thermal threshold1 log = 0
[D] core 3 thermal status: thermal threshold2 status = 0
[D] core 3 thermal status: thermal threshold2 log = 0
[D] core 3 thermal status: power limit status = 0
[D] core 3 thermal status: power limit log = 0
[D] core 3 thermal status: current limit status = 0
[D] core 3 thermal status: current limit log = 0
[D] core 3 thermal status: cross domain limit status = 0
[D] core 3 thermal status: cross domain limit log = 0
[D] core 3 thermal status: cpu temp = 59
[D] core 3 thermal status: temp resolution = 1
[D] core 3 thermal status: reading valid = 1
[D] core 4 thermal status: thermal limit status = 0
[D] core 4 thermal status: thermal limit log = 0
[D] core 4 thermal status: prochot or forcepr status = 0
[D] core 4 thermal status: prochot or forcepr log = 0
[D] core 4 thermal status: crit temp status = 0
[D] core 4 thermal status: crit temp log = 0
[D] core 4 thermal status: thermal threshold1 status = 0
[D] core 4 thermal status: thermal threshold1 log = 0
[D] core 4 thermal status: thermal threshold2 status = 0
[D] core 4 thermal status: thermal threshold2 log = 0
[D] core 4 thermal status: power limit status = 0
[D] core 4 thermal status: power limit log = 0
[D] core 4 thermal status: current limit status = 0
[D] core 4 thermal status: current limit log = 0
[D] core 4 thermal status: cross domain limit status = 0
[D] core 4 thermal status: cross domain limit log = 0
[D] core 4 thermal status: cpu temp = 59
[D] core 4 thermal status: temp resolution = 1
[D] core 4 thermal status: reading valid = 1
[D] core 5 thermal status: thermal limit status = 0
[D] core 5 thermal status: thermal limit log = 0
[D] core 5 thermal status: prochot or forcepr status = 0
[D] core 5 thermal status: prochot or forcepr log = 0
[D] core 5 thermal status: crit temp status = 0
[D] core 5 thermal status: crit temp log = 0
[D] core 5 thermal status: thermal threshold1 status = 0
[D] core 5 thermal status: thermal threshold1 log = 0
[D] core 5 thermal status: thermal threshold2 status = 0
[D] core 5 thermal status: thermal threshold2 log = 0
[D] core 5 thermal status: power limit status = 0
[D] core 5 thermal status: power limit log = 0
[D] core 5 thermal status: current limit status = 0
[D] core 5 thermal status: current limit log = 0
[D] core 5 thermal status: cross domain limit status = 0
[D] core 5 thermal status: cross domain limit log = 0
[D] core 5 thermal status: cpu temp = 57
[D] core 5 thermal status: temp resolution = 1
[D] core 5 thermal status: reading valid = 1
[D] core 6 thermal status: thermal limit status = 0
[D] core 6 thermal status: thermal limit log = 0
[D] core 6 thermal status: prochot or forcepr status = 0
[D] core 6 thermal status: prochot or forcepr log = 0
[D] core 6 thermal status: crit temp status = 0
[D] core 6 thermal status: crit temp log = 0
[D] core 6 thermal status: thermal threshold1 status = 0
[D] core 6 thermal status: thermal threshold1 log = 0
[D] core 6 thermal status: thermal threshold2 status = 0
[D] core 6 thermal status: thermal threshold2 log = 0
[D] core 6 thermal status: power limit status = 0
[D] core 6 thermal status: power limit log = 0
[D] core 6 thermal status: current limit status = 0
[D] core 6 thermal status: current limit log = 0
[D] core 6 thermal status: cross domain limit status = 0
[D] core 6 thermal status: cross domain limit log = 0
[D] core 6 thermal status: cpu temp = 59
[D] core 6 thermal status: temp resolution = 1
[D] core 6 thermal status: reading valid = 1
[D] core 7 thermal status: thermal limit status = 0
[D] core 7 thermal status: thermal limit log = 0
[D] core 7 thermal status: prochot or forcepr status = 0
[D] core 7 thermal status: prochot or forcepr log = 0
[D] core 7 thermal status: crit temp status = 0
[D] core 7 thermal status: crit temp log = 0
[D] core 7 thermal status: thermal threshold1 status = 0
[D] core 7 thermal status: thermal threshold1 log = 0
[D] core 7 thermal status: thermal threshold2 status = 0
[D] core 7 thermal status: thermal threshold2 log = 0
[D] core 7 thermal status: power limit status = 0
[D] core 7 thermal status: power limit log = 0
[D] core 7 thermal status: current limit status = 0
[D] core 7 thermal status: current limit log = 0
[D] core 7 thermal status: cross domain limit status = 0
[D] core 7 thermal status: cross domain limit log = 0
[D] core 7 thermal status: cpu temp = 59
[D] core 7 thermal status: temp resolution = 1
[D] core 7 thermal status: reading valid = 1
[D] TEMPERATURE_TARGET - write 0x5 - read 0x5 - match OK
[D] CONFIG_TDP_CONTROL - write 0x0 - read 0x0 - match OK
[D] MSR PACKAGE_POWER_LIMIT - write 0xcc816000dc8160 - read 0xcc816000dc8160 - match OK
[D] MCHBAR PACKAGE_POWER_LIMIT - write 0xcc816000dc8160 - read 0xcc816000dc8160 - match OK
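
For readers who want to check what the PACKAGE_POWER_LIMIT value in this log actually requests, here is a minimal Python sketch that decodes it. The bit positions follow Intel's documented MSR_PKG_POWER_LIMIT layout (PL1 in the low 32 bits, PL2 in the high 32 bits). The 0.125 W power unit is an assumption: the real unit is reported by MSR_RAPL_POWER_UNIT and was not posted in this thread. The names `PowerLimit` and `decode_package_power_limit` are made up for illustration and are not part of throttled.

```python
from dataclasses import dataclass


@dataclass
class PowerLimit:
    watts: float
    enabled: bool
    clamp: bool


def decode_package_power_limit(raw: int, power_unit_watts: float = 0.125):
    """Split a 64-bit PACKAGE_POWER_LIMIT value into (PL1, PL2).

    power_unit_watts is an ASSUMPTION (1/8 W); read MSR_RAPL_POWER_UNIT
    on the actual machine to confirm it.
    """
    def field(shift: int) -> PowerLimit:
        return PowerLimit(
            # bits 14:0 of each half: power limit in power units
            watts=((raw >> shift) & 0x7FFF) * power_unit_watts,
            # bit 15 of each half: limit enabled
            enabled=bool((raw >> (shift + 15)) & 1),
            # bit 16 of each half: clamping allowed
            clamp=bool((raw >> (shift + 16)) & 1),
        )

    return field(0), field(32)


# Value taken from the debug output above.
pl1, pl2 = decode_package_power_limit(0xCC816000DC8160)
print(pl1)
print(pl2)
```

Under the 0.125 W assumption both limits decode to 44 W with the enable bit set, which is well above the 18 W and 11 W figures reported by s-tui. That would suggest the limit being hit comes from somewhere other than the values written here.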

galfwender commented 5 years ago

I see the same issue on my 9370 after upgrading to a recent version (I had not done so for several months). If I boot with AC power connected, it limits power as per the battery section of the configuration file. I have to disconnect and reconnect AC power for it to detect it and remove the power limit. Despite this, "cat /sys/class/power_supply/AC*/online" returns the correct value at all times. I am not sure if it makes any difference, but power is connected via a Thunderbolt / USB-C hub.

marcolaux commented 5 years ago

Reconnecting the power cord has no effect for me.

When it's connected it's limited. When I disconnect the AC the settings of the battery section are working fine. When I connect the AC it's limited again.

Thunderbolt ports on the left and the USB-C on the right don't make a difference.

marcolaux commented 5 years ago

I just flashed the older BIOS 1.5.1 again (had 1.6.3 before) and the issue persists.

It's not going beyond 10W on AC (according to s-tui).

erpalma commented 5 years ago

I see the same issue on my 9370 after upgrading to recent version (I had not done so for several months). If I boot with AC power connected, it limits power as per the battery section of the configuration file. I have to disconnect and reconnect AC power for it to detect it and remove the power limit. Despite this, "cat /sys/class/power_supply/AC*/online" returns the correct value at all times. I am not sure if it makes any difference, but power is connected via a Thunderbolt / USB-C hub.

This sounds like a bug to me. It might be related to the event-based code, which replaced the previous approach of constantly polling the value from sysfs.
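
To make the polling approach concrete, here is a rough Python sketch of reading the AC state from sysfs on a timer. It is not throttled's actual code. The function names, the `base` and `max_polls` parameters, and the one-second interval are all made up for illustration. The only thing taken from the thread is the `/sys/class/power_supply/AC*/online` path.

```python
import glob
import os
import time


def is_on_ac(base="/sys/class/power_supply"):
    """Return True/False from AC*/online, or None if no AC supply is found."""
    paths = glob.glob(os.path.join(base, "AC*", "online"))
    if not paths:
        return None
    for path in paths:
        with open(path) as f:
            if f.read().strip() == "1":
                return True
    return False


def watch_power_source(on_change, interval=1.0,
                       base="/sys/class/power_supply", max_polls=None):
    """Poll the AC state and call on_change(state) whenever it differs.

    The sentinel start value means the state found at startup is always
    reported once, so booting with AC already connected is not missed.
    max_polls is a hypothetical knob so the loop can be bounded.
    """
    last = object()
    polls = 0
    while max_polls is None or polls < max_polls:
        state = is_on_ac(base)
        if state != last:
            on_change(state)
            last = state
        polls += 1
        time.sleep(interval)


print(is_on_ac())
```

A purely event-driven design only reacts to plug/unplug events, so it would need an explicit initial read like the one above to get the boot-with-AC case right. Whether that is what goes wrong here is only a guess.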

marcolaux commented 5 years ago

@erpalma Thanks for the hint. I'm using v0.3 for now and it works fine.

EDIT: not quite. Today I only have a max power of 18W, and under stress it only clocks up to 2.1GHz, after both warm and cold reboots. Yesterday it worked with a max power of 39W and stayed at 3.7GHz for quite a while. Weird machine.

erpalma commented 5 years ago

I've just updated the monitor mode to print the power source. This way you can check if there is a mismatch and thus a bug in that code.

marcolaux commented 5 years ago

Battery / AC is detected correctly when I unplug and plug the power cord.

galfwender commented 5 years ago

Still some issues here. It correctly detects the power source but does not behave correctly until I unplug and reconnect power.

1) Laptop (XPS 9370) rebooted with AC power connected.
2) Ran s-tui stress test. Power usage peaked at 18W and --monitor displayed power limiting, pushing it down to ~11W.
3) Stopped stress test.
4) Disconnected and reconnected AC.
5) Started another stress test. Initially power limited, but this quickly went away and reached 35W before thermal throttling (intended behaviour).

Edit: I don't have 18W or 11W specified anywhere in the config file. The OP also observed these values.


erpalma commented 5 years ago

I also observed a similar behavior a couple of times, but it was related to TLP. Restarting the daemon fixed the issue.

marcolaux commented 5 years ago

I didn't have TLP installed myself. I ran powertop --autotune on boot.

galfwender commented 5 years ago

I repeated the test and, rather than disconnecting/reconnecting AC, tried starting/stopping the TLP service as well as manually running "tlp true" and "tlp false". This did not do anything; the problem still existed.

I then disabled TLP and repeated the test by disconnecting/reconnecting AC and got the same result as yesterday:

If you look closely you can see when it kicks into life, momentarily applying the 15W limit whilst on battery immediately before switching to AC power.

[Attachments: lenovo2.log, lenovo_fix.conf.txt; screenshot: lenovo]

galfwender commented 5 years ago

After a lot of testing I believe my issue is due to Dell's Thermal Profiles. Setting the profile to "Quiet" enforces the 18/11W power limits. Changing the thermal profile does not always reliably alter the power limit. Provided I boot with either "Balanced" or "Performance" set, this issue is resolved (for me; keen to see if it is for @hyphone too). https://wiki.archlinux.org/index.php/Dell_XPS_13_(9370)#Thermal_Modes_/_Fan_profiles
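
One way to check which package limits the kernel currently sees is to read the powercap sysfs files. This is a small Python sketch assuming the standard `intel-rapl:0` package zone exists on the machine. The helper name `read_rapl_limits` is made up for illustration. Whether a limit imposed by a Dell thermal profile shows up in these files is also an assumption, since the firmware may enforce it through another path.

```python
import os


def read_rapl_limits(zone="/sys/class/powercap/intel-rapl:0"):
    """Return {constraint_name: watts} for every constraint in a powercap zone.

    The default zone path is an assumption; adjust it if the package zone
    has a different index on your system.
    """
    limits = {}
    index = 0
    while True:
        name_path = os.path.join(zone, f"constraint_{index}_name")
        limit_path = os.path.join(zone, f"constraint_{index}_power_limit_uw")
        if not (os.path.exists(name_path) and os.path.exists(limit_path)):
            break
        with open(name_path) as f:
            name = f.read().strip()
        with open(limit_path) as f:
            # sysfs reports microwatts; convert to watts
            limits[name] = int(f.read()) / 1_000_000
        index += 1
    return limits


for name, watts in read_rapl_limits().items():
    print(f"{name}: {watts:.1f} W")
```

Running this once after booting in "Quiet" and once in "Balanced" or "Performance" would show whether the reported limits differ between profiles.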

cedws commented 5 years ago

Huh, as another 9370 owner, I'm having different results. The system can hit 100% utilisation whilst only using 8W. No power limit warnings.

EDIT - I see @galfwender's CPU is able to hit much higher frequencies. Probably why mine is using so little power.

EDIT 2 - Figured out why. I disabled Intel SpeedStep in the UEFI a while ago because it was causing erroneous kernel messages about thermal throttling. Re-enabling it allows it to boost over 1.8GHz. With the fan profiles posted above, it reaches 4GHz without power issues. Thanks a lot.

marcolaux commented 5 years ago

Providing I boot with either "Balanced" or "Performance" set, this issue is resolved (for me, keen to see if for @hyphone too).

Sadly I'm planning to sell the machine and I already replaced the SSD so I can't contribute to this anymore.