clearlinux / distribution

Placeholder repository to allow filing of general bugs/issues/etc against the Clear Linux OS for Intel Architecture linux distribution

Laptop Heating Up #1609

Open knimer opened 4 years ago

knimer commented 4 years ago

Hello and Happy New Year everyone! :)

An issue that cannot be kept for such answer and then close the issue:

It seems to depend on the machine: on my desktop there's no issue, and on another laptop (with a bigger air vent) the temperature is OK too.

So, there is an issue for sure. I am available to provide any further input if needed.

HW Details:

Lenovo Thinkpad T480s
Intel® Core™ i7-8550U CPU @ 1.80GHz × 8
Intel® UHD Graphics 620 (Kabylake GT2)

Clear Linux Details:

knimer@knimer-MW~ $ swupd info
Distribution:      Clear Linux OS
Installed version: 31990
Version URL:       https://cdn.download.clearlinux.org/update
Content URL:       https://cdn.download.clearlinux.org/update

++

knimer@knimer-MW~ $ sudo swupd bundle-list
Password: 
NetworkManager
NetworkManager-extras
acpica-unix2
alsa-utils
aspell
aspell-de
aspell-es
aspell-fr
baobab
bc
binutils
bison
bootloader
c-basic
cheese
cloc
clr-network-troubleshooter
cpio
curl
desktop
desktop-apps
desktop-assets
desktop-autostart
desktop-gnomelibs
desktop-locales
dev-utils
devpkg-base
devpkg-llvm
diffutils
docutils
dosfstools
dpdk
editors
eog
ethtool
evince
evolution
file
file-roller
findutils
firefox
flatpak
flex
fonts-basic
fuse
gdb
geary
gedit
ghostscript
gimp
git
gjs
glibc-locale
gnome-base-libs
gnome-calculator
gnome-characters
gnome-color-manager
gnome-disk-utility
gnome-font-viewer
gnome-logs
gnome-music
gnome-photos
gnome-screenshot
gnome-system-monitor
gnome-todo
gnome-weather
gparted
graphviz
gstreamer
gzip
hardware-printing
hardware-uefi
htop
icdiff
inotify-tools
intltool
iperf
iproute2
iptables
kbd
kernel-install
kernel-native
kvm-host
less
lib-imageformat
lib-opengl
lib-openssl
lib-samba
libX11client
libglib
libreoffice
libstdcpp
libva-utils
linux-firmware
linux-firmware-extras
linux-firmware-wifi
linux-tools
llvm
mail-utils
make
man-pages
minicom
mpv
mutt
nasm
nautilus
net-tools
network-basic
nfs-utils
notmuch
openldap
openssh-client
openssh-server
openssl
openvswitch
os-core
os-core-plus
os-core-update
os-core-webproxy
p11-kit
parallel
parted
patch
perl-basic
pmdk
polkit
powertop
procps-ng
pulseaudio
pygobject
python3-basic
qemu-guest-additions
rsync
samba
seahorse
shells
smartmontools
sshfs
storage-utils
strace
sudo
sysadmin-basic
syslinux
the_silver_searcher
thermal_daemon
tmux
totem
tzdata
unzip
user-basic
valgrind
vim
webkitgtk
wget
which
wpa_supplicant
x11-server
xfsprogs
xz
znc
zsh
zstd
knimer@knimer-MW~ $ 

htop Output:

image

powertop Output

image

At Every Boot:

These messages keep coming each time I boot the system!

IMG_20191231_172325

I used the following OS's without any heating on this laptop:

PS:

It was heating up (less than with Clear Linux) on Solus Gnome edition. (Maybe you can find a link)

ahkok commented 4 years ago

The messages you see are warnings. The CPU detected these and correctly paused execution until the CPU was cooled enough. This indicates insufficient cooling in the system or possibly a firmware or BIOS issue where the system isn't properly cooling.

You can try changing the CPU governor or tune various other power conserving tunables, (TLP?)

This isn't an OS bug. Clear Linux is tuned for performance and this device has insufficient cooling to support that. We should have some documentation to explain how to tune the CPU governor (disable clr-power-tweaks, install TLP, etc.), but fundamentally there is no "bug" here.
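As a rough illustration of the governor suggestion above, the sketch below lists and changes the cpufreq governor through sysfs. It assumes the standard /sys/devices/system/cpu/cpuN/cpufreq layout. The function names and the CPU_ROOT variable are made up for this example, and which governors are available depends on the cpufreq driver in use. Writing a governor needs root on a real system.

```shell
#!/bin/sh
# Hypothetical helpers; CPU_ROOT is overridable only so the sketch can be
# tried against a scratch directory instead of the real sysfs tree.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Print the current governor of every CPU that exposes cpufreq.
show_governors() {
    for f in "$CPU_ROOT"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        cpu=${f#"$CPU_ROOT"/}   # strip the leading directory
        cpu=${cpu%%/*}          # keep only "cpuN"
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}

# Write governor $1 to every CPU (run as root on a real system).
set_governor() {
    for f in "$CPU_ROOT"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -w "$f" ] || continue
        printf '%s\n' "$1" > "$f"
    done
}

show_governors
```

The governors a given machine accepts are listed in the scaling_available_governors file next to scaling_governor, so it is worth checking that file before calling set_governor.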

jlmeeker commented 4 years ago

I've seen this in multiple distros on my T580. Haven't found a fix yet, but I'm thinking it is related to fan control or something similar. This is definitely not unique to Clear Linux.

knimer commented 4 years ago

> The messages you see are warnings. The CPU detected these and correctly paused execution until the CPU was cooled enough. This indicates insufficient cooling in the system or possibly a firmware or BIOS issue where the system isn't properly cooling.
>
> You can try changing the CPU governor or tune various other power conserving tunables, (TLP?)
>
> This isn't an OS bug. Clear Linux is tuned for performance and this device has insufficient cooling to support that. We should have some documentation to explain how to tune the CPU governor (disable clr-power-tweaks, install TLP, etc.), but fundamentally there is no "bug" here.

These messages appear when I restart from an already heated state in Clear Linux.

Just keep in mind, guys: it heats up with Clear Linux (and Solus), and nothing else heats it up. Why?

It is not unique. I mentioned that already, as I faced this (to a lesser extent) with Solus.

@ahkok maybe it is a device with weak cooling performance for an OS designed for performance.

To be fair and to clear up a doubt on my side, I am contacting support to check the fans and ventilation. I will update here with the findings.

0xA1B2 commented 4 years ago

Enable Fan Control:

sudo gedit /etc/modprobe.d/thinkfan.conf

options thinkpad_acpi fan_control=1

reboot

Check Thinkpad Mods:

sudo modprobe -rv thinkpad_acpi
sudo modprobe -v thinkpad_acpi

Enable CL Thermal Management:

sudo systemctl enable --now thermald

Check:

cat /proc/acpi/ibm/fan

sudo swupd bundle-add lm-sensors
sensors

Slow:

sudo echo level 7 > /proc/acpi/ibm/fan
sudo echo level 6 > /proc/acpi/ibm/fan
sudo echo level 5 > /proc/acpi/ibm/fan

Max:

sudo echo level disengaged > /proc/acpi/ibm/fan

Auto:

sudo echo level auto > /proc/acpi/ibm/fan
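For a quick look at temperatures without installing anything, the kernel's generic thermal zones can be read directly. This is a minimal sketch, assuming the usual /sys/class/thermal/thermal_zone*/ layout, where the temp file holds millidegrees Celsius. The show_temps name and the THERMAL_ROOT override are invented for the example, and the zones that exist vary by machine.

```shell
#!/bin/sh
# THERMAL_ROOT is overridable only so the sketch can be tried on test data.
THERMAL_ROOT="${THERMAL_ROOT:-/sys/class/thermal}"

# Print "<zone type>: <temperature> C" for every readable thermal zone.
show_temps() {
    for z in "$THERMAL_ROOT"/thermal_zone*; do
        [ -r "$z/temp" ] || continue
        # Some zones fail on read; skip them instead of aborting.
        t=$(cat "$z/temp" 2>/dev/null) || continue
        [ -n "$t" ] || continue
        name=$(cat "$z/type" 2>/dev/null)
        # temp is in millidegrees Celsius; show one decimal place.
        printf '%s: %d.%d C\n' "$name" "$((t / 1000))" "$((t % 1000 / 100))"
    done
}

show_temps
```

Running it in a loop (for example with watch) while changing the fan level gives a rough idea of whether a given level actually helps.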

ahkok commented 4 years ago

@0xA1B2 thanks, that's very useful!

knimer commented 4 years ago

Can you describe more about these changes? Will performance go down because of this?

Thanks a lot

ahkok commented 4 years ago

> Will performance go down because of this?

Absolutely. The max performance of the system requires a certain amount of cooling. If you change the amount of cooling, the performance will be impacted.

ahkok commented 4 years ago

Of course, setting the fan up for maximum cooling will generally allow your system to consume more power, and thus perform more computing. If you are not using the compute power, the system will be cooler.

knimer commented 4 years ago

I am checking my system and there is no:

/etc/modprobe.d/thinkfan.conf

There is a:

/usr/lib/modprobe.d/systemd.conf

The latter has an "options" structure.

Shall I add into this systemd.conf the:

options thinkpad_acpi fan_control=1

??

ahkok commented 4 years ago

> Shall I add into this systemd.conf the:
>
> options thinkpad_acpi fan_control=1

No, instead, you should create /etc/modprobe.d/thinkfan.conf.

Files under /usr are going to be reverted by swupd update, and you'd lose your changes.
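Creating that file can also be done from the command line instead of an editor. The sketch below is one possible way, not an official procedure: the helper names and the CONF_DIR, PARAM_FILE and SUDO variables are made up for illustration. It also assumes the module exposes its fan_control parameter under /sys/module/thinkpad_acpi/parameters/, which is worth confirming on the actual machine.

```shell
#!/bin/sh
# All three variables are illustrative overrides, not part of any tool.
CONF_DIR="${CONF_DIR:-/etc/modprobe.d}"
PARAM_FILE="${PARAM_FILE:-/sys/module/thinkpad_acpi/parameters/fan_control}"
SUDO="${SUDO-sudo}"   # set SUDO= (empty) when already running as root

# Create thinkfan.conf with the option line from this thread.
write_fan_control_conf() {
    $SUDO mkdir -p "$CONF_DIR" || return 1
    printf '%s\n' 'options thinkpad_acpi fan_control=1' |
        $SUDO tee "$CONF_DIR/thinkfan.conf" >/dev/null
}

# Succeeds if the loaded module reports fan control as enabled.
# Boolean module parameters read back as "Y" or "N".
fan_control_enabled() {
    [ -r "$PARAM_FILE" ] && [ "$(cat "$PARAM_FILE")" = "Y" ]
}

# Usage:
#   write_fan_control_conf
#   (reboot, or reload thinkpad_acpi with modprobe -r / modprobe)
#   fan_control_enabled && echo "fan control is enabled"
```

Because the file lives under /etc rather than /usr, it is left alone by swupd update, which is the point made above.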

ahkok commented 4 years ago

man stateless for reference.

knimer commented 4 years ago

When trying:

sudo echo level disengaged > /proc/acpi/ibm/fan

I get: Permission Denied

Am I missing anything?

ahkok commented 4 years ago

Common mistake: the redirection (>) is performed by your own shell, not by sudo, and therefore fails.

echo disengaged | sudo tee -a /proc/acpi/ibm/fan

knimer commented 4 years ago

did not work!

When: echo disengaged | sudo tee -a /proc/acpi/ibm/fan

I get: tee: /proc/acpi/ibm/fan: Invalid argument

knimer commented 4 years ago

Also when I: cat /proc/acpi/ibm/fan

I get:

status:     enabled
speed:      0
level:      auto

Is the Zero speed value normal?

ahkok commented 4 years ago

> did not work!
>
> When: echo disengaged | sudo tee -a /proc/acpi/ibm/fan
>
> I get: tee: /proc/acpi/ibm/fan: Invalid argument

That error message suggests that "disengaged" is not a valid thing you can write to that file. You need to consult the kernel documentation to find out what input that file takes.
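For reference, the thinkpad-acpi documentation describes the fan command as "level <level>", where <level> is 0-7, auto, disengaged or full-speed, so the word "level" has to be part of what is written. A minimal sketch that checks the argument before writing is shown below. The set_fan_level name and the FAN_FILE and SUDO variables are hypothetical, and the write is only accepted when the module was loaded with fan_control=1.

```shell
#!/bin/sh
# FAN_FILE and SUDO are illustrative overrides for trying this out safely.
FAN_FILE="${FAN_FILE:-/proc/acpi/ibm/fan}"
SUDO="${SUDO-sudo}"   # set SUDO= (empty) when already running as root

# Write "level <arg>" to the fan file after validating <arg>.
set_fan_level() {
    case "$1" in
        [0-7]|auto|disengaged|full-speed) ;;
        *) echo "unsupported fan level: $1" >&2; return 1 ;;
    esac
    # tee runs under sudo, so the write itself happens with root rights,
    # unlike a plain "sudo echo ... > file" redirection.
    printf 'level %s\n' "$1" | $SUDO tee "$FAN_FILE" >/dev/null
}

# Usage:
#   set_fan_level disengaged   # fastest, fan runs unregulated
#   set_fan_level auto         # hand control back to the firmware
```

Leaving the fan on a fixed low level removes the firmware's automatic control, so returning to auto after experimenting is the safer default.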

ahkok commented 4 years ago

> Also when I: cat /proc/acpi/ibm/fan
>
> I get:
>
> status:       enabled
> speed:        0
> level:        auto
>
> Is the Zero speed value normal?

Again, this is something the upstream kernel documentation should cover.

See this (I think) doc link:

https://kernel.googlesource.com/pub/scm/linux/kernel/git/pjt/linsched/+/1e0b5ab81e2abb8bbf7446f4a17f43a1e34944fe/Documentation/thinkpad-acpi.txt
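When comparing fan readings over time, it can help to pull single fields out of that status output. This is a small sketch that assumes the "name: value" layout quoted above. The fan_field name and the FAN_FILE override are invented for the example.

```shell
#!/bin/sh
# FAN_FILE is overridable only so the sketch can be tried on a sample file.
FAN_FILE="${FAN_FILE:-/proc/acpi/ibm/fan}"

# Print the value of one field (status, speed, level) from the fan file.
fan_field() {
    # Split on the colon plus any following spaces or tabs.
    awk -F':[ \t]*' -v key="$1" '$1 == key { print $2; exit }' "$FAN_FILE"
}

if [ -r "$FAN_FILE" ]; then
    printf 'level=%s speed=%s\n' "$(fan_field level)" "$(fan_field speed)"
fi
```

Logging this line next to a temperature reading every few seconds shows whether the speed ever rises above zero as the machine heats up.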

knimer commented 4 years ago

Does the below mean a problem? Please check the warnings.

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; vendor preset: disabled)
     Active: active (running) since Wed 2020-01-15 16:00:47 +04; 3h 27min ago
   Main PID: 476 (thermald)
      Tasks: 2 (limit: 9249)
     Memory: 5.8M
     CGroup: /system.slice/thermald.service
             └─476 /usr/bin/thermald --no-daemon --dbus-enable

Jan 15 16:00:47 knimer-MW thermald[476]: [WARN]sensor id 9 : No temp sysfs for reading raw temp
Jan 15 16:00:47 knimer-MW thermald[476]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Jan 15 16:00:47 knimer-MW thermald[476]: [WARN]error: could not parse file /etc/thermald/thermal-conf.xml
Jan 15 16:00:47 knimer-MW thermald[476]: [WARN]sysfs open failed
Jan 15 16:00:47 knimer-MW thermald[476]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Jan 15 16:00:47 knimer-MW thermald[476]: [WARN]error: could not parse file /etc/thermald/thermal-conf.xml
Jan 15 16:00:47 knimer-MW thermald[476]: I/O warning : failed to load external entity "/etc/thermald/thermal-cpu-cdev-order.xml"
Jan 15 16:00:47 knimer-MW thermald[476]: [WARN]error: could not parse file /etc/thermald/thermal-cpu-cdev-order.xml
Jan 15 16:00:47 knimer-MW thermald[476]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Jan 15 16:00:47 knimer-MW thermald[476]: [WARN]error: could not parse file /etc/thermald/thermal-conf.xml
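The warnings above name two files under /etc/thermald. The sketch below only checks whether they exist. The function name and the THERMALD_DIR override are hypothetical. My understanding, which should be checked against the thermald documentation, is that both files are optional and that thermald falls back to its built-in defaults when they are absent.

```shell
#!/bin/sh
# THERMALD_DIR is overridable only so the sketch can be tried on test data.
THERMALD_DIR="${THERMALD_DIR:-/etc/thermald}"

# Report whether each config file named in the log is present.
check_thermald_conf() {
    for name in thermal-conf.xml thermal-cpu-cdev-order.xml; do
        if [ -e "$THERMALD_DIR/$name" ]; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
}

check_thermald_conf

# The full thermald log for the current boot can be read with:
#   journalctl -u thermald -b --no-pager
```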