Open julienrobin28 opened 10 months ago
Are there other values under /sys/class/hwmon/hwmon2/?
And can you explain what you are trying to do by dynamically changing the pwm values? Normally you would configure the thermal zones and cooling fan settings and then leave the system to do its thing.
Yes, there are other values in /sys/class/hwmon/hwmon2/.
Output of ls /sys/class/hwmon/hwmon2/:
device fan1_fault fan1_input name of_node power pwm1 subsystem uevent
Output of ls -an /sys/class/hwmon/hwmon2/:
total 0
drwxr-xr-x 3 0 0 0 Oct 24 20:19 .
drwxr-xr-x 3 0 0 0 Oct 24 20:19 ..
lrwxrwxrwx 1 0 0 0 Oct 24 21:31 device -> ../../../10-002f
-r--r--r-- 1 0 0 4096 Oct 24 21:31 fan1_fault
-r--r--r-- 1 0 0 4096 Oct 24 20:27 fan1_input
-r--r--r-- 1 0 0 4096 Oct 24 21:31 name
lrwxrwxrwx 1 0 0 0 Oct 24 21:31 of_node -> ../../../../../../../../../firmware/devicetree/base/soc/i2c0mux/i2c@1/emc2301@2f
drwxr-xr-x 2 0 0 0 Oct 24 21:31 power
-rw-r--r-- 1 0 0 4096 Oct 25 02:47 pwm1
lrwxrwxrwx 1 0 0 0 Oct 24 21:31 subsystem -> ../../../../../../../../../class/hwmon
-rw-r--r-- 1 0 0 4096 Oct 24 20:19 uevent
Even when pwm1 is stuck maxed out, reading the fan1_input file still works and shows the current fan speed in RPM (which keeps updating successfully). The lm-sensors package (sensors command) also works:
emc2305-i2c-10-2f
Adapter: i2c-22-mux (chan_id 1)
fan1: 1160 RPM
cpu_thermal-virtual-0
Adapter: Virtual device
temp1: +44.3°C (crit = +110.0°C)
rpi_volt-isa-0000
Adapter: ISA adapter
in0: N/A
What I'm trying to do is use the fancontrol Debian package, whose goal is to dynamically increase or decrease the speed of my (very fast and very noisy) fan, by periodically:
/etc/fancontrol
/etc/fancontrol
The /etc/fancontrol file is interactively created by running pwmconfig, which helps identify which pwm file controls which fan by increasing and decreasing the pwm values.
(Is there another/preferred way to change thermal zones and cooling fan settings?) For now, as a workaround, I'm using a physical potentiometer/rheostat to manually set a fixed speed for my fan.
Any suggestions, @6by9?
This is all from the mainline driver. We only added back in DT configuration because the mainline DT maintainers wouldn't agree a binding.
I'd guess it's the lump at https://github.com/torvalds/linux/blob/master/drivers/hwmon/emc2305.c#L411-L419. Compare to the equivalent in pwm_fan and it just validates the range but otherwise accepts the data. There is a blob of text describing what they're trying to achieve at https://github.com/torvalds/linux/blob/master/drivers/hwmon/emc2305.c#L68-L80 and that sounds reasonable enough.
Many thanks @6by9 for this information and having a look at this.
Too bad the traditional behavior isn't available as an option! But of course, this now makes more sense, even if this implementation unfortunately isn't compatible with fancontrol's usual way of working with PWMs.
Just before letting you go, I have a tiny question:
Regarding the pwm1 value: as it acts more like a "fixed lower speed limit" than a "current speed", how should I proceed to change the "current speed" in both directions (up and down)? Only if you already know another approach (if not, I'll try to find out by myself).
Anyway, even if not ideal, this may be OK for my particular case, by just setting a reasonable value which won't be updated afterwards 👍 I'll have to set a value anyway, as the initial (minimum and current) pwm1 value at driver initialization is 0 (at this speed, the fan isn't noisy at all, but it's not cooling either 😅)
Thanks again
I have a suspicion that it is a misbehaviour compared to that documented. I would have expected pwm1 to be adjustable such that it is just a lower limit.
I suspect your time window is down to the poll period of the thermal zone, and if the thermal zone has bumped up the speed then it's updated some other state which stops you changing the low point again.
You can disconnect the thermal zones by dropping fragments 104 & 105 in https://github.com/raspberrypi/linux/blob/rpi-6.1.y/arch/arm/boot/dts/overlays/i2c-fan-overlay.dts#L54-L82, and then I would expect the pwm1 control to set the speed directly.
By reading fragments 104 & 105 before removing them (I'll be doing this test just after), and searching for more information about what those "thermal zones" are, I've discovered the existence of the /sys/class/thermal/cooling_device0/ folder, which is probably the thermal zone settings both of you were talking about previously!
/sys/class/thermal/cooling_device0 works in both ways
This other sysfs entry is successfully able to set the fan speed up and down, by providing another way to change the pwm1 value in both directions: /sys/class/thermal/cooling_device0/cur_state (which goes from 0 to 10, according to /sys/class/thermal/cooling_device0/max_state).
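A minimal sketch of driving the fan through the cooling device, assuming the paths quoted in this thread. The `COOLING_DIR` variable and the `set_cooling_state` helper are hypothetical names introduced for illustration, and writing to the real sysfs file normally requires root.

```shell
#!/bin/sh
# Directory of the cooling device; the default is the path quoted in this thread.
# COOLING_DIR is a hypothetical variable name, override it if your device differs.
COOLING_DIR="${COOLING_DIR:-/sys/class/thermal/cooling_device0}"

# set_cooling_state STATE (hypothetical helper)
# Checks STATE against max_state, then writes it to cur_state.
set_cooling_state() {
    state="$1"
    max="$(cat "$COOLING_DIR/max_state")"
    if [ "$state" -lt 0 ] || [ "$state" -gt "$max" ]; then
        echo "state must be between 0 and $max" >&2
        return 1
    fi
    echo "$state" > "$COOLING_DIR/cur_state"
}
```

For example, `set_cooling_state 10` would request the highest state and `set_cooling_state 0` the lowest. Checking against max_state first avoids relying on how the kernel reacts to an out-of-range write.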
By reading /sys/class/hwmon/hwmon2/pwm1 it turns out:
- Setting /sys/class/thermal/cooling_device0/cur_state to 10 sets the pwm to 255
- Setting /sys/class/thermal/cooling_device0/cur_state back to 0 sets the pwm back to 0
fancontrol isn't needed:
I guess this is what fragments 104 & 105 in i2c-fan-overlay.dts are doing: the /sys/class/thermal/cooling_device0/ folder is registered into /sys/class/thermal/thermal_zone0/ as a symbolic link (cdev0/ to ../cooling_device0/), and I found out that thermal_zone0, which is the CPU temperature, is periodically checked so that the fan speed is already periodically adjusted.
By running stress-ng --matrix 0 I indeed verified that the fan speed actually increases when the CPU is getting hotter. I wasn't aware that this was already done! My CPU wasn't working hard enough for me to notice this.
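One way to watch this happen is to print the CPU temperature next to the fan speed while the stress test runs. The sketch below assumes the sysfs paths quoted in this thread and that thermal_zone0/temp reports millidegrees Celsius, as the Linux thermal sysfs interface does. `HWMON_DIR`, `THERMAL_ZONE` and `print_fan_status` are hypothetical names.

```shell
#!/bin/sh
# Defaults are the paths quoted in this thread; both variable names are hypothetical.
HWMON_DIR="${HWMON_DIR:-/sys/class/hwmon/hwmon2}"
THERMAL_ZONE="${THERMAL_ZONE:-/sys/class/thermal/thermal_zone0}"

# print_fan_status (hypothetical helper)
# Prints one line with the CPU temperature, the fan speed and the current pwm1 value.
print_fan_status() {
    millideg="$(cat "$THERMAL_ZONE/temp")"   # millidegrees Celsius
    rpm="$(cat "$HWMON_DIR/fan1_input")"     # measured fan speed in RPM
    pwm="$(cat "$HWMON_DIR/pwm1")"           # PWM value, 0 to 255
    echo "temp=$((millideg / 1000))C fan=${rpm}RPM pwm=$pwm"
}
```

Running `while true; do print_fan_status; sleep 2; done` in a second terminal during the stress test should show the pwm value and RPM rising with the temperature if the thermal zone is driving the fan.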
I'll do the little test of disconnecting the thermal zones by dropping fragments 104 & 105 in i2c-fan-overlay.dts and keep you informed about whether the /sys/class/hwmon/hwmon2/pwm1 minimum value still gets locked after a few seconds.
So I can confirm that the pwm1 minimum value no longer locks automatically once fragments 104 & 105 have been removed from i2c-fan-overlay.dts:
What I did / what the results are:
- Wrote 255 to /sys/class/hwmon/hwmon2/pwm1 (fan accelerated to max speed)
- Writing a lower value to pwm1 succeeded (fan became silent).
- Wrote 255 to /sys/class/hwmon/hwmon2/pwm1 again (fan accelerated to max speed)
- Wrote to /sys/class/thermal/cooling_device0/cur_state while 255 was still present in pwm1 (this locks the pwm1 file to 255, and the fan doesn't slow down no matter what is placed into cur_state).
- Writing a lower value to pwm1 isn't working anymore (value is stuck at 255)
This confirms @6by9's statement about my time window being down to the poll period of the thermal zone.
Note:
After removing the 2 fragments from i2c-fan-overlay.dts, neither /sys/class/hwmon/hwmon2/ nor /sys/class/thermal/cooling_device0/ was showing up anymore (lsmod no longer showed the emc2305 driver as loaded). I found a way to manually reload the driver by typing echo "emc2301" 0x2f > /sys/bus/i2c/devices/i2c-22/new_device
Doing so, both /sys/class/hwmon/hwmon2/ and /sys/class/thermal/cooling_device0/ show up again (but cooling_device0 isn't linked from thermal_zone0 anymore, as expected for this test).
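The manual reload above can be wrapped in a small function. This is only a sketch: the bus directory, chip name and address are the ones quoted in this thread and may differ on other setups, `I2C_BUS_DIR` and `instantiate_emc2301` are hypothetical names, and the write normally requires root.

```shell
#!/bin/sh
# I2C adapter directory; the default is the bus quoted in this thread.
# I2C_BUS_DIR is a hypothetical variable name.
I2C_BUS_DIR="${I2C_BUS_DIR:-/sys/bus/i2c/devices/i2c-22}"

# instantiate_emc2301 (hypothetical helper)
# Asks the kernel to create an "emc2301" device at address 0x2f on that bus,
# which is what the echo command in this thread does.
instantiate_emc2301() {
    echo "emc2301 0x2f" > "$I2C_BUS_DIR/new_device"
}
```

The device name and address are written as a single space-separated line, which matches what `echo "emc2301" 0x2f` produces.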
I remain available if I can do anything else; thanks again for the work.
Describe the bug
Having successfully enabled the Compute Module 4 IO Board embedded fan controller and RTC from config.txt using the following lines:
It is successfully made accessible from sysfs: the current fan speed can be read from /sys/class/hwmon/hwmon2/fan1_input, and the PWM value (used by 4-wire fans) can also be set and read back from /sys/class/hwmon/hwmon2/pwm1.
However: the bug
It is only possible to increase the value of pwm1. For example, you can go from 0 to 102 (the fan accelerates and, reading back the file, the value is confirmed to be 102), but then you can't go back to 0 (if you try, the fan won't slow down and, reading back the file, the value is still 102). You can then go to 255 successfully (the fan accelerates even more), but then you won't be able to go back below 255.
Unless you are fast enough!
As soon as you go from 102 to 255, the fan accelerates, but if you go back to 102 less than 1 second later, the fan decelerates and you are back at 102. The longer you wait, the less likely it is to work: sometimes 3 seconds is still OK, sometimes 2 seconds is too late! However, even if you successfully get back to 102, you can't go back to 0 (if 102 has been set for too long, it has become the new minimum value).
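The timing behaviour described above can be probed with a sketch like the following, which is not the script from this report. It raises pwm1, waits, tries to lower it again and reads the value back. `HWMON_DIR` and `probe_lock_delay` are hypothetical names, the default path is the one quoted in this thread, and writing to pwm1 normally requires root.

```shell
#!/bin/sh
# hwmon directory of the fan controller; default taken from this thread.
# HWMON_DIR is a hypothetical variable name.
HWMON_DIR="${HWMON_DIR:-/sys/class/hwmon/hwmon2}"

# probe_lock_delay HIGH LOW DELAY (hypothetical helper)
# Writes HIGH to pwm1, sleeps DELAY seconds, writes LOW, then reads pwm1 back.
# Returns 0 if pwm1 accepted LOW, 1 if it kept another value.
probe_lock_delay() {
    high="$1"
    low="$2"
    delay="$3"
    echo "$high" > "$HWMON_DIR/pwm1"
    sleep "$delay"
    # Ignore a failed write here: the read-back below is what decides the result.
    echo "$low" > "$HWMON_DIR/pwm1" || true
    actual="$(cat "$HWMON_DIR/pwm1")"
    if [ "$actual" = "$low" ]; then
        echo "after ${delay}s: still adjustable (pwm1=$actual)"
    else
        echo "after ${delay}s: locked (pwm1=$actual)"
        return 1
    fi
}
```

Calling it with increasing delays, for example `probe_lock_delay 255 102 1` and then `probe_lock_delay 255 102 3`, gives a rough idea of how long the window stays open. Since a lock persists until the driver is reset, each run after a locked result needs a fresh starting state.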
Resetting driver unlocks the value back to 0
When the fan's PWM is stuck to 255 for example, you can run the following commands:
The value is back to 0. But if you raise it, the same issue will be back.
Steps to reproduce the behaviour
I made a little script to find out how many seconds are needed for the current value to be locked as minimum value. This may be used to reproduce the issue (even with no fan connected, but you'll definitely need an emc2301!)
Example of output:
Device (s)
Raspberry Pi CM4
System
cat /etc/rpi-issue:
vcgencmd version:
uname -a:
Linux crobe-server-coudray 6.1.0-rpi4-rpi-v8 #1 SMP PREEMPT Debian 1:6.1.54-1+rpt2 (2023-10-05) aarch64 GNU/Linux
vcgencmd bootloader_version:
Logs
Nothing appears into dmesg about this device.
Output of lsmod:
Additional context
I noticed from the kernel source code that the involved driver, which seems to be emc2305.c unless I'm wrong, differs between the Raspberry Pi kernel and the upstream kernel source code (even when using the same version number, in this case 6.1.54 from linux-stable_20231004 here, or linux-6.1.54.tar.xz from kernel.org).
This is why I prefer reporting the issue here.
Hoping this report may help!
Best regards, Julien ROBIN