vmatare / thinkfan

The minimalist fan control program
GNU General Public License v3.0

Fans do not start at a stated threshold... Not even sure they start at all. #192

Closed fakamaz closed 1 year ago

fakamaz commented 2 years ago

Here's an example from the config file:

tp_fan /proc/acpi/ibm/fan

hwmon /sys/devices/platform/coretemp.0/hwmon/hwmon3/temp3_input
hwmon /sys/devices/platform/coretemp.0/hwmon/hwmon3/temp4_input
hwmon /sys/devices/platform/coretemp.0/hwmon/hwmon3/temp1_input
hwmon /sys/devices/platform/coretemp.0/hwmon/hwmon3/temp5_input
hwmon /sys/devices/platform/coretemp.0/hwmon/hwmon3/temp2_input
hwmon /sys/devices/virtual/thermal/thermal_zone0/hwmon1/temp1_input
hwmon /sys/devices/virtual/thermal/thermal_zone0/hwmon1/temp2_input

(0,     0,      40)
(1,     38,     44)
(2,     42,     48)
(3,     46,     52)
(4,     50,     56)
(5,     54,     60)
(6,     58,     64)
(7,     62,     68)
(127,   66,     32767)
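As a rough illustration of how a table like this is read: each tuple is (LEVEL, LOW, HIGH), and the overlapping ranges give hysteresis, so the level picked depends on where the fan was before. The sketch below assumes thinkfan steps up once the highest temperature reaches HIGH and steps down once it drops below LOW; the exact comparisons are an assumption, and `LEVELS` and `pick_level` are names made up for this example, not part of thinkfan.

```sh
# Limits copied from the config above, one "LEVEL LOW HIGH" row per line.
LEVELS='0 0 40
1 38 44
2 42 48
3 46 52
4 50 56
5 54 60
6 58 64
7 62 68
127 66 32767'

# pick_level CURRENT_ROW TMAX
# Prints the 0-based row of LEVELS that should be active for the highest
# temperature TMAX, starting from the row that is active right now.
pick_level() {
    idx=$1 tmax=$2
    last=$(( $(printf '%s\n' "$LEVELS" | wc -l) - 1 ))
    while :; do
        # Load the current row into $1 (level), $2 (low), $3 (high).
        set -- $(printf '%s\n' "$LEVELS" | sed -n "$((idx + 1))p")
        low=$2 high=$3
        if [ "$tmax" -ge "$high" ] && [ "$idx" -lt "$last" ]; then
            idx=$((idx + 1))    # assumed rule: step up at HIGH
        elif [ "$tmax" -lt "$low" ] && [ "$idx" -gt 0 ]; then
            idx=$((idx - 1))    # assumed rule: step down below LOW
        else
            break
        fi
    done
    echo "$idx"
}
```

Under these assumptions, `pick_level 0 63` prints 6 (warming up from idle), while `pick_level 7 63` prints 7 (cooling down from level 7), which is why the same temperature can show up with different levels.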

Status is:

thinkfan.service - simple and lightweight fan control program
     Loaded: loaded (/lib/systemd/system/thinkfan.service; enabled; vendor pres>
     Active: active (running) since Mon 2022-07-11 18:39:23 CEST; 39min ago
   Main PID: 31971 (thinkfan)
      Tasks: 1 (limit: 18863)
     Memory: 260.0K
     CGroup: /system.slice/thinkfan.service
             └─31971 /usr/sbin/thinkfan -q

Jul 11 18:39:23 ThinkPad-L430 systemd[1]: Starting simple and lightweig>
Jul 11 18:39:23 ThinkPad-L430 thinkfan[31970]: thinkfan 0.9.1 starting.>
Jul 11 18:39:23 ThinkPad-L430 systemd[1]: Started simple and lightweight>

But the temperature is still high and I have a feeling the fans do not start when stated. So how can I fix this?

vmatare commented 2 years ago

Well, first of all you should try and find out what the situation actually is. So when you feel like thinkfan isn't reacting to your temperatures the way you expect it to, run:

sensors
pkill -usr1 thinkfan
journalctl -el -u thinkfan

and post the output here.

fakamaz commented 2 years ago

Sensors:

thinkpad-isa-0000
Adapter: ISA adapter
fan1: 0 RPM

BAT0-acpi-0
Adapter: ACPI interface
in0: 11.84 V

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +77.0°C (high = +87.0°C, crit = +105.0°C)
Core 0: +75.0°C (high = +87.0°C, crit = +105.0°C)
Core 1: +77.0°C (high = +87.0°C, crit = +105.0°C)
Core 2: +70.0°C (high = +87.0°C, crit = +105.0°C)
Core 3: +72.0°C (high = +87.0°C, crit = +105.0°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1: +74.0°C (crit = +88.0°C)
temp2: +59.0°C (crit = +88.0°C)

pkill - produced an error when run as-is, when run as sudo - showed nothing...

journalctl:

-- Logs begin at Mon 2022-07-18 10:13:58 CEST, end at Mon 2022-07-18 11:33:58 C>
Jul 18 10:14:09 -ThinkPad-L430 systemd[1]: Starting simple and lightweig>
Jul 18 10:14:10 -ThinkPad-L430 thinkfan[1224]: thinkfan 0.9.1 starting...
Jul 18 10:14:10 -ThinkPad-L430 systemd[1]: Started simple and lightweigh>
Jul 18 11:33:58 -ThinkPad-L430 thinkfan[1262]: current temperatures: (75>

vmatare commented 2 years ago

That seems very strange to me. Unfortunately your log output is truncated, so I can't see what fan speed thinkfan has calculated. Can you please do the following during high temps:

sudo pkill -usr1 thinkfan
sudo journalctl --no-pager -el -u thinkfan # To make sure we get the entire log entries
sudo cat /proc/acpi/ibm/fan
sudo bash -c 'for f in /sys/module/thinkpad_acpi/parameters/*; do echo $f; cat $f; echo; done'

Normally we should see the calculated fan speed (127 at ~70°C) in the log message and the corresponding "level disengaged" in /proc/acpi/ibm/fan. If both are the case, then something must be wrong with your thinkpad_acpi kernel module. If one of those things is not the case, there must be something wrong with thinkfan.
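The check described above can be sketched as a small script. This is only an illustration: `fan_field` and `diagnose` are hypothetical helper names, and treating a reported speed of 0 as "the fan is not spinning" is an extra assumption on top of the reasoning in this comment.

```sh
# fan_field NAME FILE
# Prints the value of the "NAME:" line in a file laid out like /proc/acpi/ibm/fan.
fan_field() {
    sed -n "s/^$1:[[:space:]]*//p" "$2"
}

# diagnose EXPECTED_LEVEL [FILE]
# Compares the level thinkfan should have written with what the file reports.
diagnose() {
    expected=$1
    file=${2:-/proc/acpi/ibm/fan}
    actual=$(fan_field level "$file")
    speed=$(fan_field speed "$file")
    if [ "$actual" != "$expected" ]; then
        # The expected level never arrived in the file.
        echo "suspect thinkfan: level is '$actual', expected '$expected'"
    elif [ "${speed:-0}" -eq 0 ] && [ "$expected" != "0" ]; then
        # The level was written, but nothing below thinkfan acted on it.
        echo "suspect thinkpad_acpi or firmware: level '$actual' is set but speed is 0"
    else
        echo "looks fine: level '$actual', speed $speed"
    fi
}
```

At around 70°C with the config from this issue, that would be run as `sudo sh -c '. ./diagnose.sh; diagnose disengaged'`, where `diagnose.sh` is whatever file the functions were saved to.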

fakamaz commented 2 years ago

Here you go...

journal

-- Logs begin at Tue 2022-07-19 09:59:30 CEST, end at Tue 2022-07-19 11:12:13 CEST. --
Jul 19 09:59:41 -ThinkPad-L430 systemd[1]: Starting simple and lightweight fan control program...
Jul 19 09:59:42 -ThinkPad-L430 thinkfan[1240]: thinkfan 0.9.1 starting...
Jul 19 09:59:42 -ThinkPad-L430 systemd[1]: Started simple and lightweight fan control program.
Jul 19 11:12:01 -ThinkPad-L430 thinkfan[1251]: current temperatures: (62, 57, 63, 61, 63, 63, 53)

fan

status: enabled
speed: 0
level: 7
commands: level <level> (<level> is 0-7, auto, disengaged, full-speed)
commands: enable, disable
commands: watchdog <timeout> (<timeout> is 0 (off), 1-120 (seconds))

It says 7, but I barely feel any air going out...
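To cross-check a log line like the one above against the config, it helps to pull out the highest reading, since that is the value the limits are compared against in a simple config like this one. A small sketch, with `max_temp` being a made-up helper name:

```sh
# max_temp LOG_LINE
# Prints the highest value from a "current temperatures: (a, b, ...)" log line.
# Prints nothing if the line is truncated before the closing parenthesis.
max_temp() {
    printf '%s\n' "$1" |
        sed -n 's/.*current temperatures: (\([^)]*\)).*/\1/p' |
        tr -d ' ' |
        tr ',' '\n' |
        sort -n |
        tail -n 1
}
```

For the line above this gives 63, which lies inside the (7, 62, 68) tuple from the config, so a reported level of 7 is consistent with the limits.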

bash

/sys/module/thinkpad_acpi/parameters/brightness_enable 2

/sys/module/thinkpad_acpi/parameters/brightness_mode 4

/sys/module/thinkpad_acpi/parameters/dbg_bluetoothemul 0

/sys/module/thinkpad_acpi/parameters/dbg_uwbemul 0

/sys/module/thinkpad_acpi/parameters/dbg_wlswemul 0

/sys/module/thinkpad_acpi/parameters/dbg_wwanemul 0

/sys/module/thinkpad_acpi/parameters/enable Y

/sys/module/thinkpad_acpi/parameters/experimental 0

/sys/module/thinkpad_acpi/parameters/fan_control Y

/sys/module/thinkpad_acpi/parameters/force_load N

/sys/module/thinkpad_acpi/parameters/id ThinkPadEC

/sys/module/thinkpad_acpi/parameters/index -536870912

/sys/module/thinkpad_acpi/parameters/software_mute Y

/sys/module/thinkpad_acpi/parameters/volume_capabilities 0

/sys/module/thinkpad_acpi/parameters/volume_control N

/sys/module/thinkpad_acpi/parameters/volume_mode 3

fakamaz commented 2 years ago

And here it is at higher temps:

-- Logs begin at Thu 2022-07-21 10:15:24 CEST, end at Thu 2022-07-21 18:54:10 CEST. --
Jul 21 10:15:35 -ThinkPad-L430 systemd[1]: Starting simple and lightweight fan control program...
Jul 21 10:15:36 -ThinkPad-L430 thinkfan[1298]: thinkfan 0.9.1 starting...
Jul 21 10:15:36 -ThinkPad-L430 systemd[1]: Started simple and lightweight fan control program.
Jul 21 18:53:57 -ThinkPad-L430 thinkfan[1333]: current temperatures: (87, 83, 87, 83, 84, 86, 55)

status: enabled
speed: 0
level: disengaged
commands: level <level> (<level> is 0-7, auto, disengaged, full-speed)
commands: enable, disable
commands: watchdog <timeout> (<timeout> is 0 (off), 1-120 (seconds))

/sys/module/thinkpad_acpi/parameters/brightness_enable 2

/sys/module/thinkpad_acpi/parameters/brightness_mode 4

/sys/module/thinkpad_acpi/parameters/dbg_bluetoothemul 0

/sys/module/thinkpad_acpi/parameters/dbg_uwbemul 0

/sys/module/thinkpad_acpi/parameters/dbg_wlswemul 0

/sys/module/thinkpad_acpi/parameters/dbg_wwanemul 0

/sys/module/thinkpad_acpi/parameters/enable Y

/sys/module/thinkpad_acpi/parameters/experimental 0

/sys/module/thinkpad_acpi/parameters/fan_control Y

/sys/module/thinkpad_acpi/parameters/force_load N

/sys/module/thinkpad_acpi/parameters/id ThinkPadEC

/sys/module/thinkpad_acpi/parameters/index -536870912

/sys/module/thinkpad_acpi/parameters/software_mute Y

/sys/module/thinkpad_acpi/parameters/volume_capabilities 0

/sys/module/thinkpad_acpi/parameters/volume_control N

/sys/module/thinkpad_acpi/parameters/volume_mode 3

vmatare commented 2 years ago

Ok, thanks for getting back with the info. Since you have level: 7 and level: disengaged in /proc/acpi/ibm/fan I'd say that thinkfan is doing exactly what it's supposed to do. That is, write these things into that file at the given temperatures. Whatever happens (or doesn't happen) after that is the responsibility of the thinkpad_acpi kernel module and the embedded controller firmware. Since I have never heard about such a severe bug in thinkpad_acpi, I'd guess that either something is wrong with your firmware or with your fan itself. You can try doing a BIOS update if you haven't already. Also, is your fan making any noises like whining, clacking or rattling? In that case you might have worn out bearings and will need to get a replacement fan. If it's neither of those things I fear I'm out of ideas :-/

fakamaz commented 2 years ago

Well, here's the thing. I have a dual boot, and under Windows there's another similar app which controls the fan. And it works just fine! As in, it spins at max speed and I do hear the noise of the fan...

Under Linux - nothing. So it's definitely not a fan problem.

I was thinking that maybe that Windows app could interfere with Linux, but I'm not sure if it's even possible... On the other hand it does change fan control from bios to app settings.

vmatare commented 2 years ago

So that would mean that your hardware must be OK, and the only thing left is maybe some weird quirk in the interaction of thinkpad_acpi and your firmware. A BIOS update will update the embedded controller firmware, as well, so that's definitely worth trying. Other than that, you're also running a very old version of thinkfan, so you could try using the latest release. However since I'm fairly certain that thinkfan is working correctly here, that would be a shot in the dark. Another last-resort experiment would be to use a sysfs pwm* file to control the fan instead of /proc/acpi/ibm/fan. Normally that shouldn't make a difference, but in your case all bets are off I guess...
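For the pwm experiment, it may help to know how the two scales relate. As far as the thinkpad_acpi kernel documentation describes it, the hwmon pwm1 file takes values from 0 to 255, scaled from the firmware's levels 0 to 7, so limits written for /proc/acpi/ibm/fan would need converting. A rough sketch of that conversion follows; `level_to_pwm` is a made-up name, and the rounding the driver actually applies is an assumption here.

```sh
# level_to_pwm LEVEL
# Maps a firmware fan level (0-7) onto the hwmon pwm scale (0-255),
# rounding to the nearest integer.
level_to_pwm() {
    level=$1
    if [ "$level" -lt 0 ] || [ "$level" -gt 7 ]; then
        echo "level must be between 0 and 7" >&2
        return 1
    fi
    # (level * 255 / 7) rounded, using integer arithmetic only.
    echo $(( (level * 255 * 2 + 7) / 14 ))
}
```

For example, `level_to_pwm 4` prints 146 and `level_to_pwm 7` prints 255. Special values such as `disengaged` have no equivalent on this scale and are left out of the sketch.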

fakamaz commented 2 years ago

Thank you for all your tips, I'll try this out and let you know. Just curious though, how do I check which version of thinkfan I'm currently using?

Also, in case of pwm - I just have to replace tp_fan with pwm in the same line, right?

Lastly, I've stumbled on the following error, trying to compile the latest build:

@-ThinkPad-L430:~/Downloads/thinkfan-1.3.1$ bash
@-ThinkPad-L430:~/Downloads/thinkfan-1.3.1$ cmake -D CMAKE_BUILD_TYPE:STRING=Release
CMake Warning:
  No source or binary directory provided. Both will be assumed to be the same as the current working directory, but note that this warning will become a fatal error in future CMake releases.

-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found PkgConfig: /usr/bin/pkg-config (found version "1.6.3")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Checking for module 'systemd'
--   Found systemd, version 245
-- Checking for module 'openrc'
--   Package 'openrc', required by 'virtual:world', not found
-- Checking for module 'yaml-cpp'
--   Package 'yaml-cpp', required by 'virtual:world', not found
-- Checking for module 'libatasmart'
--   Package 'libatasmart', required by 'virtual:world', not found
CMake Error at CMakeLists.txt:60 (message):
  USE_YAML enabled but yaml-cpp not found. Please install yaml-cpp[-devel]!

-- Configuring incomplete, errors occurred!
See also "/home/Downloads/thinkfan-1.3.1/CMakeFiles/CMakeOutput.log".
See also "/home/Downloads/thinkfan-1.3.1/CMakeFiles/CMakeError.log".

vmatare commented 2 years ago

Hi @fakamaz,

  1. thinkfan should output its version when you run thinkfan -h. Be aware, though, that thinkfan version 0.9.3 falsely reports its version as 0.9.1.
  2. To compile the current version, you either need to have the development files of libyaml-cpp installed or you need to disable YAML support with -DUSE_YAML=false. Recent versions support a new, more powerful YAML-based config syntax if you're interested. But your old config should also continue to work as-is, no matter whether you disable YAML or not.

fakamaz commented 2 years ago

Got it to compile with -DUSE_YAML=false and the version seems to be current 1.3.1! Thanks.

Now I actually feel a slightly warm wind coming on 57/60+ C; will test it out on higher temps.

What about pwm - I just have to replace tp_fan with pwm in the same line, right?

EDIT: On high temp I've noticed now it does this:

● thinkfan.service - thinkfan 1.3.1
     Loaded: loaded (/usr/local/lib/systemd/system/thinkfan.service; enabled; v>
    Drop-In: /etc/systemd/system/thinkfan.service.d
             └─override.conf
     Active: active (running) since Fri 2022-07-29 09:42:49 CEST; 7h ago
    Process: 1297 ExecStart=/usr/local/sbin/thinkfan $THINKFAN_ARGS (code=exite>
   Main PID: 1335 (thinkfan)
      Tasks: 1 (limit: 18853)
     Memory: 1.0M
     CGroup: /system.slice/thinkfan.service
             └─1335 /usr/local/sbin/thinkfan -b0

Jul 29 17:17:06 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:19:02 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:20:57 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:22:56 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:24:53 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:26:51 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:28:46 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:30:45 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:32:42 -ThinkPad-L430 thinkfan[1335]: Watchdog ping
Jul 29 17:34:38 -ThinkPad-L430 thinkfan[1335]: Watchdog ping

Maybe watchdog is blocking the fan? As normally it'll show temperature here. And since I do have visual artefacts I assume fan is not working really...

vmatare commented 2 years ago

So, one by one:

fakamaz commented 1 year ago

Hi, and sorry for the late reply... I've checked on thinkfan status the other day, and this is what it's showing me:

-- Logs begin at Tue 2022-08-30 08:30:10 CEST, end at Tue 2022-08-30 19:15:41 CEST. --
Aug 30 08:30:23 -ThinkPad-L430 systemd[1]: Starting thinkfan 1.3.1...
Aug 30 08:30:24 -ThinkPad-L430 thinkfan[1376]: ERROR: Module thinkpad_acpi doesn't seem to support fan_control
Aug 30 08:30:24 -ThinkPad-L430 systemd[1]: thinkfan.service: Control process exited, code=exited, status=1/FAILURE
Aug 30 08:30:24 -ThinkPad-L430 systemd[1]: thinkfan.service: Failed with result 'exit-code'.
Aug 30 08:30:24 -ThinkPad-L430 systemd[1]: Failed to start thinkfan 1.3.1.

No idea what might have happened here, but I can't seem to start it all of a sudden... I did try the above, but it seems I'm not quite following how that 255 format should work.

I'm using the one from the example file here:

levels:

But it stops on the "levels" and produces this:

● thinkfan.service - thinkfan 1.3.1
     Loaded: loaded (/usr/local/lib/systemd/system/thinkfan.service; enabled; vendor p>
    Drop-In: /etc/systemd/system/thinkfan.service.d
             └─override.conf
     Active: failed (Result: exit-code) since Tue 2022-08-30 19:32:46 CEST; 4s ago
    Process: 76146 ExecStart=/usr/local/sbin/thinkfan $THINKFAN_ARGS (code=exited, sta>

Aug 30 19:32:46 -ThinkPad-L430 systemd[1]: Starting thinkfan 1.3.1...
Aug 30 19:32:46 -ThinkPad-L430 thinkfan[76146]: ERROR: /etc/thinkfan.conf:64: I>
levels:
^
Aug 30 19:32:46 -ThinkPad-L430 systemd[1]: thinkfan.service: Control process ex>
Aug 30 19:32:46 -ThinkPad-L430 systemd[1]: thinkfan.service: Failed with result>
Aug 30 19:32:46 -ThinkPad-L430 systemd[1]: Failed to start thinkfan 1.3.1.