What generation of CPU is this? Try the script https://github.com/intel/thermal_daemon/blob/master/test/thermal-debug-dump-fedora.sh and attach the logs.
11th gen, Intel Tiger Lake, i7-1165G7. Here's the file that the script made. It has the logs inside. 06230641.tar.gz
1. The logs suggest that the adaptive option failed earlier. I think you have a file "/tmp/ignore_adaptive". Please delete it and rerun the script.
2. Also, this is a Lenovo system, which probably has firmware thermal control; that causes thermald to exit under normal conditions. Do you have a file called "/sys/devices/platform/thinkpad_acpi/dytc_lapmode"?
3. I think you can prevent the shutdown by writing a TCC offset. "cd" to the /sys/class/thermal/cooling_device*/ directory whose type attribute is "TCC" and run "echo 5 > cur_state".
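For anyone following along, point 3 can be sketched as a small shell helper that locates the TCC cooling device instead of checking each cooling_device* directory by hand. The function name find_tcc_devices is made up for illustration, and matching any type that starts with "TCC" is an assumption, since the exact string may differ between kernels.

```shell
#!/bin/sh
# Hypothetical helper (not part of thermald): print the sysfs directory of
# every cooling device whose "type" attribute starts with "TCC".
# The base directory is a parameter so the function can be pointed at a
# test tree; it defaults to the real sysfs location.
find_tcc_devices() {
    base="${1:-/sys/class/thermal}"
    for dev in "$base"/cooling_device*; do
        # Skip entries without a readable type (also covers an unmatched glob).
        [ -r "$dev/type" ] || continue
        case "$(cat "$dev/type")" in
            TCC*) printf '%s\n' "$dev" ;;
        esac
    done
}

# Assumed usage, writing the offset of 5 suggested above (needs root):
#   for d in $(find_tcc_devices); do echo 5 | sudo tee "$d/cur_state"; done
```

If the function prints nothing, no TCC cooling device is exposed on that system and this workaround does not apply.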
was able to go through the script! here's the files 07172300.tar.gz
Something is messed up somewhere. I see:
[1641594278][WARN]Unable to find a zone for SEN3
[1641594278][WARN]Unable to find a zone for SEN1
They were present before, so I need to provide a debug patch.
i'll be waiting
I created one change; please apply it and build. It is only a two-line change. You can run:
$ git clone https://github.com/intel/thermal_daemon.git
$ cd thermal_daemon
$ git checkout -b ideapad-11thgen remotes/origin/ideapad-11thgen
Then follow build procedure in README.txt
After the build:
$ sudo rm /tmp/ignore_adaptive
$ thermald --loglevel=debug --no-daemon --adaptive
And attach the logs.
You shouldn't see:
[1641594278][WARN]Unable to find a zone for SEN3
[1641594278][WARN]Unable to find a zone for SEN1
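A quick way to check a captured log for these warnings is sketched below. It assumes the debug output was saved to a file with one message per line, and the function name missing_zones is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: list each sensor name that thermald reported as
# "Unable to find a zone for <NAME>" in the given log file, once per name.
missing_zones() {
    grep -o 'Unable to find a zone for [A-Za-z0-9_]*' "$1" \
        | awk '{ print $NF }' \
        | sort -u
}

# Assumed usage (the log file name is a placeholder):
#   missing_zones thermald-log.txt
```

Empty output means the warning did not appear in that log.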
Try your tests. Maybe other sensors will trip earlier now; hopefully that avoids the shutdown.
I changed the ./autogen.sh prefix=/ to ./autogen.sh prefix=/usr/local to avoid conflicts with the distro thermald
Anyway, I got that log, but I still get the "unable to find a zone" messages.... Didn't run tests because thermald exits.
Is it possible that you are not running the changed version? I added one more change to the same branch that prints info showing whether the change is in effect. Just git pull and retry.
I always run the changed version, because I run:
sudo /usr/local/sbin/thermald --loglevel=debug --no-daemon --adaptive > ../thermald-logX.txt
Here's the latest log. It's 20 kilobytes larger. thermald-log3.txt
Added one more change to ignore some modes. Please try. Pushed to the same branch
Just ran it, doing the tests, and it seems to be working! It doesn't exit, keeps running, /tmp/ignore_adaptive isn't created, temps are at 74C, and the CPU throttles when it should (right now the stress I described in the first post is running and the CPU speeds go down as they should). Temps don't seem to even reach 75C. CPU speed is around 2.9 GHz - 3 GHz. Fans spinning. All nice. thermald-log4.txt
Great. I will create a formal change and let you know so you can give it one final test. This is obviously an issue with adaptive mode on this platform.
gotcha!
I reverted the previous changes and added a new one written so that it will not cause issues on other, unknown platforms. Pushed the new change to the same branch. I think it should work the same. Just send me one log from a run of a minute or so.
If all is good, I will merge the change to the master branch. That way Fedora can pick up the change.
thermald-log5.txt there it is. Works greatly! Thank you so much
Thanks for reporting. Applied the change and changed the version to v2.4.8.
NICE! Thank you so MUCH!
I get thermal shutdowns on my laptop. I stress it with freac (music file transcoding) and it gets to 100 Celsius rather quickly; nothing much is done to remediate that, so it ends up shutting down soon after. I'm on Fedora 35, kernel 5.15.12-200.fc35.x86_64. Right now, in order to work around this, I've forced (by using cpupower) the CPU clock to 3.5 GHz and it's all nice, 80-87 Celsius.
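The cpupower workaround above can be double-checked by reading back the frequency ceiling the kernel actually applies. This is only a sketch: the function name show_max_freq is made up, it assumes the standard cpufreq sysfs layout (policy*/scaling_max_freq, reported in kHz), and the cpupower invocation in the comment is a guess at how the 3.5 GHz cap was set, since the exact command is not given.

```shell
#!/bin/sh
# Hypothetical helper: print the configured frequency ceiling of each
# cpufreq policy in MHz. scaling_max_freq is reported in kHz.
# The base directory is a parameter so the function can be run on a test tree.
show_max_freq() {
    base="${1:-/sys/devices/system/cpu/cpufreq}"
    for policy in "$base"/policy*; do
        [ -r "$policy/scaling_max_freq" ] || continue
        khz=$(cat "$policy/scaling_max_freq")
        printf '%s %s MHz\n' "${policy##*/}" "$((khz / 1000))"
    done
}

# Assumed way to apply the cap (needs root):
#   sudo cpupower frequency-set --max 3.5GHz
# Then verify:
#   show_max_freq
```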
Current sensor readings:

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +88.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +86.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +82.0°C  (high = +100.0°C, crit = +100.0°C)
Core 2:        +84.0°C  (high = +100.0°C, crit = +100.0°C)
Core 3:        +82.0°C  (high = +100.0°C, crit = +100.0°C)

nvme-pci-e100
Adapter: PCI adapter
Composite:     +41.9°C  (low = -0.1°C, high = +86.8°C)
                        (crit = +89.8°C)
Sensor 1:      +41.9°C  (low = -273.1°C, high = +65261.8°C)

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:         +62.0°C

BAT0-acpi-0
Adapter: ACPI interface
in0:           17.00 V