Closed majanes-intel closed 2 years ago
Summary for new users:
If I'm not mistaken, we have two issues:
1) Throttling to 400 MHz, caused by the legacy interface working incorrectly. This issue was resolved by this commit https://github.com/intel/thermal_daemon/commit/1ad03424f7f3d339521635f08377b323375b2747 and the v2.4.6 release.
2) Throttling to 1800 MHz. This is a separate issue and I currently have no idea how to resolve it. I would appreciate it if someone could help resolve it.
Affected laptops: Dell Latitude 5420/7420/7520
Related issues: https://github.com/erpalma/throttled/issues/255
I have a Latitude 7420, and running thermald 2.4.6 (or master) is not helping with the 400 MHz throttling.
I don't get why, but I don't see the fix's info message "Disable rapl-msr interface and use rapl-mmio".
Did I forget to set something up?
This is especially weird since, after an hour of playing with throttled and then thermald, the first time I ended up with a "usable laptop" running thermald (up to 1.8 GHz, and no drop to 400 MHz). But since I rebooted, running thermald is not helping :thinking:
I'm now running this so that I don't just throw out my laptop (please don't slap me):
while true; do
    rmmod intel_rapl_msr
    rmmod processor_thermal_device
    rmmod processor_thermal_rapl
    rmmod intel_rapl_common
    rmmod intel_powerclamp
    modprobe intel_powerclamp
    modprobe intel_rapl_common
    modprobe processor_thermal_rapl
    modprobe processor_thermal_device
    modprobe intel_rapl_msr
    sleep 1
done
Hi @n0rad, it works on my Latitude 7520 (1800 MHz only) without a problem, and I didn't install throttled.
What distro/kernel version and BIOS version are you on, and are you sure that thermald is enabled on startup? I think the 7420 should be identical to the 7520/7320 because the BIOS is the same.
OK, so while I was collecting the info I found that throttled was starting as a unit under the name lenovo_fix.service. Definitely incompatible.
Here is the info, just in case:
os: archlinux
kernel : 5.12.15-arch1-1
bios : 1.7.1
thermald: 2.4.6
command: /usr/bin/thermald --systemd --dbus-enable --adaptive
I still don't see the fix info message, but it no longer drops to 400 MHz. Sorry for the noise.
I hope we will have a solution for the max 1.8Ghz :crossed_fingers:
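Editor's sketch: the conflict described above (throttled still running as lenovo_fix.service next to thermald) can be checked for with a few lines of shell. This is only a minimal sketch. The helper name conflicting_units is hypothetical, and the unit names it matches (lenovo_fix.service from this thread, throttled.service as an assumed alternative name) may differ on your distro.

```shell
#!/bin/sh
# Print active services that are likely to fight with thermald.
# ASSUMPTION: throttled is installed as lenovo_fix.service (as reported in
# this thread) or as throttled.service; adjust the pattern for your system.

conflicting_units() {
    # Reads "systemctl list-units"-style lines on stdin and prints the
    # first column (the unit name) when it matches a known conflict.
    awk '$1 ~ /^(lenovo_fix|throttled)\.service$/ { print $1 }'
}

if command -v systemctl >/dev/null 2>&1; then
    systemctl list-units --type=service --state=active --no-legend --plain \
        2>/dev/null | conflicting_units
fi
```

If this prints anything, stopping and disabling that unit before testing thermald avoids the two tools overriding each other.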
I cannot fix it, so I'm waiting for a fix from somewhere too. I will post here if I find something.
Just curious: do the Linux versions of these laptops experience the same problems? Unfortunately, Dell doesn't offer its OEM Ubuntu image for download to check.
I asked dell-care about this issue, but didn't receive any adequate answer. Looks like dell-care doesn't care.
I can propose a "kind of fix" that seems to work sometimes.
Install dual-boot (Windows 10 + Linux), boot into Windows, install the Intel Dynamic Tuning Driver, and wait for some time (the CPU frequency will return to its normal state on Windows; without the driver, Windows has the same issue with the 400 MHz CPU frequency). Then reboot into Linux and don't turn it off :)
I've also tested the OEM kernels from Ubuntu (they are publicly available in the Ubuntu repo): doesn't help. I've tried different OSes with different kernels: doesn't help. I've tried different BIOS versions: doesn't help. Dell support didn't provide any useful information either.
So for now I live with 1800 MHz. The only solution I see: I am waiting for the Lenovo T14 Gen 2 AMD and I will change my laptop. And I will never buy a Dell laptop again. (By the way, it seems like Linux works fine on the Dell XPS 13.)
Install dual-boot (Windows 10 + Linux), boot into Windows, install the Intel Dynamic Tuning Driver, and wait for some time (the CPU frequency will return to its normal state on Windows; without the driver, Windows has the same issue with the 400 MHz CPU frequency). Then reboot into Linux and don't turn it off :)
That is already resolved by the latest version of thermald, so we are at 1800 MHz now.
So for now I live with 1800 MHz. The only solution I see: I am waiting for the Lenovo T14 Gen 2 AMD and I will change my laptop. And I will never buy a Dell laptop again. (By the way, it seems like Linux works fine on the Dell XPS 13.)
I keep hoping that it will be fixed by a BIOS update or a hack.
I'm on a Lenovo (P15s gen2), swearing I will go back to Dell over this. So it's reassuring to know that it isn't a manufacturer issue. It's the 11th gen CPU and Intel. I'm not running thermald at all and it's just as bad with the 400 MHz. If thermald 2.4.6 fixes this then I'll happily use it (I have other questions, like how to get power/battery profiles working with thermald). Roll on the Ubuntu 21.04 release.
I would like to check that running thermald is better than not running thermald for this kind of stuff? TIA.
@mhosken some people posted issues with Lenovo in this related topic: https://github.com/erpalma/throttled/issues/255 Please check it. I can't confirm that it is an Intel issue. My second laptop, an MSI with an i7-1185G7 CPU, works perfectly (but it has a lot of non-software issues, so I can't recommend MSI).
I have other questions like how to get power/battery profiles working with thermald
I think you need to use TLP for that.
So, I've been seeing excessive CPU throttling on my Lenovo Thinkpad T480 for a while now, so I took some time to debug the thermal zone event activity and found that the passive cooling was kicking in when the acpitz reached 86 degrees C (the ACPI passive cooling threshold as returned by the ACPI method _SB.PCI0.LPCB.EC.SEN1._PSV on my laptop). However, the passive cooling was disabled only when the acpitz dropped below 55 degrees C (which takes 5-10 minutes on a warm day). The workaround I found was to disable this trip point by using:
echo -1 | sudo tee /sys/module/thermal/parameters/psv
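Editor's sketch: before disabling a trip point it helps to see which zones and trip points the kernel actually exposes. The following is a minimal sketch that reads the standard thermal sysfs files (type, temp, trip_point_N_type, trip_point_N_temp). The function name list_trip_points and the THERMAL_ROOT override are hypothetical helpers, and not every zone exposes every file.

```shell
#!/bin/sh
# List every thermal zone with its current temperature and trip points.
# Values are in millidegrees Celsius, as reported by sysfs.
# THERMAL_ROOT is a hypothetical override, useful only for testing.
THERMAL_ROOT="${THERMAL_ROOT:-/sys/class/thermal}"

list_trip_points() {
    for zone in "$THERMAL_ROOT"/thermal_zone*; do
        [ -d "$zone" ] || continue
        ztype=$(cat "$zone/type" 2>/dev/null)
        ztemp=$(cat "$zone/temp" 2>/dev/null)
        echo "$(basename "$zone") type=$ztype temp_mC=$ztemp"
        for trip in "$zone"/trip_point_*_temp; do
            [ -f "$trip" ] || continue
            # Extract the trip index N from .../trip_point_N_temp
            n=${trip##*trip_point_}
            n=${n%_temp}
            ttype=$(cat "$zone/trip_point_${n}_type" 2>/dev/null)
            ttemp=$(cat "$trip" 2>/dev/null)
            echo "  trip $n: $ttype at $ttemp mC"
        done
    done
}

list_trip_points
```

On a machine like the T480 described above, one would expect to see a passive trip at 86000 mC under the acpitz zone.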
Thanks @ColinIanKing, maybe it is helpful for Lenovo users.
I tried this workaround and it didn't help on the Dell laptop. Based on spandruvada's reply above, I think I need to find out how to disable the trip point for the TMEM sensor, but I can't find how to do it.
The problem is that the TMEM sensor reaches its limit of 42C in 4 seconds, so the system is throttled from max power. Even at the start the temperature is 39C, so there is not much margin. Not sure what can be done here.
@dmirubtsov my hunch is that the TMEM sensor is the INT3402 thermal driver for memory temperature reporting, found in the kernel as drivers/thermal/intel/int340x_thermal/int3402_thermal.c. I'm not knowledgeable about this driver, but it does provide a thermal zone, and you may have a _TMP ACPI object that the kernel can use to gather the temperature of this device. The driver comes with an int3402_notify() handler that will handle thermal trip events. Perhaps disabling or unloading the int3402_thermal_driver may help.
I have these modules loaded:
dell ~ » lsmod | grep int3
int3403_thermal 20480 0
int340x_thermal_zone 20480 2 int3403_thermal,processor_thermal_device
int3400_thermal 20480 0
acpi_thermal_rel 16384 1 int3400_thermal
I've tried to unload all of these modules, but it didn't affect anything:
dell ~ » lsmod | grep int3
dell ~ »
@mhosken @ZaMaZaN4iK @majanes-intel
Workaround until we get a fix.
I've disabled SpeedShift in the BIOS settings and got a huge video performance improvement. The CPU still throttles to 1800 MHz, but now it doesn't affect video throttling (or not as much), so the GUI works pretty smoothly. I can even play GTA 5 on my laptop without freezes.
It can also be done from the OS with the dell-command-configure package (AUR):
sudo /opt/dell/dcc/cctk --SpeedShift=Disabled
The latest version of this driver fixes the issue of extremely low frequency of 400Mhz. However the CPU is still locked at 1800Mhz max.
I'm on a Lenovo (P15s gen2), swearing I will go back to Dell over this. So it's reassuring to know that it isn't a manufacturer issue. It's the 11th gen CPU and Intel. I'm not running thermald at all and it's just as bad with the 400 MHz. If thermald 2.4.6 fixes this then I'll happily use it (I have other questions, like how to get power/battery profiles working with thermald). Roll on the Ubuntu 21.04 release.
@mhosken thermald 2.4.6 does fix it. It's locked at 1800 MHz for me now, and very rarely goes to 1500.
I have a Dell Latitude 5420 with an Intel Core i5-1135G7, and I have trouble with this too.
Thermald fixed the 400 MHz issue, but now my CPU often throttles to ~1400 MHz (all cores), especially when I do some memory-intensive work.
I've tried s-tui, and when I start the simple Sqrt() tests on 8 threads, it can sustain the high frequencies a little longer, and after that it slows down to ~2100 MHz, which is normal I think. But if I start the Malloc() tests (8 threads), it almost instantly goes down to ~1400 MHz.
$ uname -a
Linux *** 5.13.7-arch1-1 #1 SMP PREEMPT Sat, 31 Jul 2021 13:18:52 +0000 x86_64 GNU/Linux
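Editor's sketch: for anyone who wants to record per-core clocks while reproducing this without s-tui, the cpufreq sysfs files can be read directly. This is a minimal sketch that assumes the cpufreq driver exposes scaling_cur_freq (in kHz). The function name print_core_freqs_mhz and the CPU_ROOT override are hypothetical.

```shell
#!/bin/sh
# Print the current frequency of every core in MHz (integer division).
# CPU_ROOT is a hypothetical override, useful only for testing.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

print_core_freqs_mhz() {
    for f in "$CPU_ROOT"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        # Turn ".../cpu3/cpufreq/scaling_cur_freq" into "cpu3"
        cpu=${f#"$CPU_ROOT"/}
        cpu=${cpu%%/*}
        khz=$(cat "$f" 2>/dev/null) || continue
        [ -n "$khz" ] || continue
        echo "$cpu $(($khz / 1000)) MHz"
    done
}

print_core_freqs_mhz
```

Wrapping the call in a loop with sleep 1 while a stress test runs gives a simple log of when the drop to ~1400 MHz happens.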
@carathorys Thanks for your reply and welcome to the club :)
I see that Dell released a new version of bios for your laptop about a week ago. Did you try it?
Yes, I updated everything on the 31st of July, and now I have the latest BIOS version 1.10.0, but the problem remains.
Same here unfortunately
I just got the latest Dell firmware upgrade and the issue still persists:
└─Latitude 7320, Latitude 7320, Latitude 7420, Latitude 7420, Latitude 7520 System Update:
New version: 1.7.1
Remote ID: lvfs
Summary: Firmware for the Dell Latitude 7320, Latitude 7320, Latitude 7420, Latitude 7420, Latitude 7520
License: Proprietary
Size: 23.6 MB
Created: 2021-06-08
Urgency: Critical
Vendor: Dell Inc.
Description:
This stable release fixes the following issues:
• Firmware updates to address security vulnerabilities.
• Firmware updates to address the Intel Security Advisory.
• Fixed the issue where the system screen flashes after booting.
• Fixed the issue where the BIOS recovery is initiated when you quickly turn off and turn on the system.
• Fixed the issue where the system with USB port disabled does not recognize the dock even though Type-C dock override is enabled in the BIOS. This issue occurs after the system restart.
Some new functionality has also been added:
• Enhanced the CPU thermal stability.
For me, running the following after each boot fixes the issue until next reboot:
sudo rmmod intel_rapl_msr
sudo modprobe intel_rapl_msr
And it's stuck at 1800 MHz instead of 400, right? That's known behavior. You can install the latest version of thermald instead of unloading the module.
The 1800 MHz issue is still not fixed.
Dell released BIOS 1.8.2 for the 7x20 laptops, but the issue still persists with the new version.
Things are better for me after following @ftsogr's comment: https://github.com/erpalma/throttled/issues/255#issuecomment-903144537
@n0rad the issue of being stuck at 1800 MHz is still not fixed, and video performance is very poor due to the power limit compared with Lenovo/MSI laptops.
Instead of reloading modules, you can install the latest version of thermald and remove any other related tools.
@carathorys a new BIOS was released for your model, can you please check it?
Dell released BIOS version 1.9.1 for the 7x20, but nothing has changed.
I've updated, and I'm still experiencing the same issue: throttled to 1.4Ghz with thermald enabled. Today we're sending the laptop back to the retailer, because I've experienced some other memory issues (sometimes the bios reports memory issues).
Not sure whether this can help us or not: https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.15-Power-Management
So, if anyone can test the Linux 5.15 kernel with the corresponding PR on their local machine and check it, that would be awesome.
I will build and test the kernel when 5.15-rc1 is released.
Srinivas (the maintainer of this repo) should know whether the power-related changes in 5.15 can help.
@spandruvada what do you think? Thanks.
Same problem on a Dell 7400 with an i5-8265U. Last week I installed Pop!_OS 20.04 LTS with kernel 5.11; before that I had Pop!_OS 20.10 with kernel 5.4 LTS and everything was normal. Now on 20.04 (5.11) the CPU throttles to 400 MHz for no reason. The BIOS is on the latest update, 1.13.0.
Sorry for my awful English...
@twtd try installing the latest version of thermald and check.
@dmirubtsov I found issue #291... but can you please help me install the latest version? I can't do it with apt.
@dmirubtsov I think I did it ))) Now I'll test it.
@dmirubtsov the 400 MHz issue is solved, but when I run sudo systemctl status thermald.service the output is:
Sep 01 18:35:03 pop-os systemd[1]: Starting Thermal Daemon Service...
Sep 01 18:35:03 pop-os systemd[1]: Started Thermal Daemon Service.
Sep 01 18:35:03 pop-os thermald[735]: 22 CPUID levels; family:model:stepping 0x6:8e:c (6:142:12)
Sep 01 18:35:04 pop-os thermald[735]: 22 CPUID levels; family:model:stepping 0x6:8e:c (6:142:12)
Sep 01 18:35:04 pop-os thermald[735]: Polling mode is enabled: 4
Sep 01 18:35:04 pop-os thermald[735]: sensor id 12 : No temp sysfs for reading raw temp
Sep 01 18:35:04 pop-os thermald[735]: sensor id 12 : No temp sysfs for reading raw temp
Sep 01 18:35:04 pop-os thermald[735]: sensor id 12 : No temp sysfs for reading raw temp
Sep 01 18:35:08 pop-os thermald[735]: Unable to find a zone for TVGA
What does it mean?
@twtd I'm not a developer of the thermald project. As I understand it, this is OK and you can ignore these messages.
If you have trouble with it, please open a new issue.
Is there some configuration of thermal_daemon to increase the frequencies from 1400 MHz to a higher value?
no, it is still not fixed
The problem still persists with the new 1.9.3 BIOS version (Latitude 7x20).
Looks like Dell doesn't care about this issue, so I switched to a Lenovo laptop.
Thank you all. I will follow this issue until I sell the Dell laptop.
Nothing has changed after updating to BIOS 1.9.6 (Latitude 7x20).
I have just updated my Latitude 7x20 to the latest System Firmware 1.11.3 and this issue seems to be fixed for me. I now see the CPU (11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz) going beyond 3GHz as needed.
Dell 5421, i7-11850H, BIOS 1.6.1: nothing changed. Throttled to 800 MHz after 10 seconds of running s-tui.
Same issue: Long time (+2s) Power throttling at 10-12W. It can boost but quickly goes down and stay around 1,800Mhz. Latitude 7420, BIOS v1.22.2, Arch - up to date.
Not sure where exactly the fix is to be found: thermald, kernel, intel, dell or throttled.. I'll probably give up for now and just hope it will be fixed one day (only blaming the big vendors here, not any of the open source projects ;) ).
@Grtschnk
@dmirubtsov
Dell Latitude 5420 with latest BIOS - 1.13.1
Ubuntu 20.04
Kernel 5.14.21 - ppa.launchpad.net/tuxinvader/lts-mainline/ubuntu
Intel thermal_daemon 2.4.6 - https://github.com/intel/thermal_daemon/. (NOT from Ubuntu or Debian repository)
SpeedStep=Disabled works for me. With the BIOS "Ultra Performance" power setup it runs at up to 3.1 GHz for 2 minutes, then drops to 2.3 GHz because it reaches 95 Celsius.
With the "Optimized" power BIOS setup I get 2.3 GHz at 65 Celsius.
@VitaliiSerdiuk if you enable SpeedStep, do you have drops to 400 Mhz?
I just want to understand, what is the reason for such drops. Now I am on Dell Latitude 5410 and have the same problem. If I have a CPU-only workload, it works almost fine. If I have any GPU-intensive workload (any game), my CPU drops to 400 Mhz from time to time.
@zamazan4ik SpeedStep does not have a huge impact. Biggest impact as for my understanding
But I have not checked it with a GPU stress test. I tested the CPU only, via stress -c 8, and checked the result via s-tui.
@VitaliiSerdiuk Thank you for the hint, but unfortunately it does not help with my system. :/ (Latitude 7420, Ultra Performance mode in BIOS, BIOS version 1.12.2, Kernel 5.15.5, thermald 2.4.6-1)
If I disable SpeedStep, i7z reports Turbo as turned off. If I disable SpeedShift, Turbo is reported as available, but higher frequencies aren't used until I change the governor manually. And then it still jumps back to power throttling after a few seconds.
5420 and 7420 use different BIOS versions numbers; not sure how much they actually differ
@Grtschnk @dmirubtsov Dell Latitude 5420 with latest BIOS - 1.13.1 Ubuntu 20.04 Kernel 5.15.4 - ppa.launchpad.net/tuxinvader/lts-mainline/ubuntu Intel thermal_daemon 2.4.6 - https://github.com/intel/thermal_daemon/. SpeedStep=Disabled works and with BIOS Ultra performance power setup for me work up to 3.1GHz for 2 minutes then drops to 2.3GHz because of 95 Celsius. Optimized power bios setup got 2.3 GHz with 65 Celsius
Me too, I fixed the problem with the new bios update. Dell 5420
I wish we could find out exactly what the BIOS upgrades are fixing. That would be super insightful information.
Kernel: 5.11.3
Debian: Testing
thermald: 2.4.3 (debian unstable)
processor: i7-1185G7 -- 28 W TDP
After running power-intensive workloads for a short amount of time, the CPU and/or GPU will be throttled down drastically to ~10% of peak.
Running turbostat reveals that the peak power is ~16W, far below the TDP limit.
Running lm-sensors shows that the peak temp is ~50C, far below the limit.
After reading #291 and #280, I enabled debug logs for thermald. thermald.log
@spandruvada let me know if more information is needed. I can also bring the system to you in JF1. Mesa team will be using this laptop model for perf analysis.
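Editor's sketch: the gap between the ~16 W seen in turbostat and the 28 W TDP can be cross-checked against the package power limits the kernel currently exposes through the powercap sysfs interface. This is a minimal sketch and not part of thermald. The function name print_rapl_limits and the POWERCAP_ROOT override are hypothetical, and which domains exist (intel-rapl for MSR, intel-rapl-mmio for MMIO) depends on the loaded modules.

```shell
#!/bin/sh
# Print the configured RAPL power limits for each powercap domain,
# rounded down to whole watts.
# POWERCAP_ROOT is a hypothetical override, useful only for testing.
POWERCAP_ROOT="${POWERCAP_ROOT:-/sys/class/powercap}"

print_rapl_limits() {
    for dom in "$POWERCAP_ROOT"/intel-rapl:* "$POWERCAP_ROOT"/intel-rapl-mmio:*; do
        [ -d "$dom" ] || continue
        name=$(cat "$dom/name" 2>/dev/null)
        for lim in "$dom"/constraint_*_power_limit_uw; do
            [ -r "$lim" ] || continue
            # Extract the constraint index N from constraint_N_power_limit_uw
            n=${lim##*constraint_}
            n=${n%_power_limit_uw}
            cname=$(cat "$dom/constraint_${n}_name" 2>/dev/null)
            uw=$(cat "$lim" 2>/dev/null) || continue
            [ -n "$uw" ] || continue
            echo "$(basename "$dom") ($name) $cname: $(($uw / 1000000)) W"
        done
    done
}

print_rapl_limits
```

If the long_term limit shown for the MSR and MMIO domains differs, the lower of the two is a plausible suspect for the early throttling, though that is an inference rather than something confirmed in this thread.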