tuxd3v / ats

temp/pwm maybe too simple - cycles constantly - try a PID? #3

Open dkebler opened 6 years ago

dkebler commented 6 years ago

Finally got my NAS box buttoned up and noticed that the fan comes on for about 30 seconds and then goes off for a minute, endlessly.

I think your algorithm is a bit too simple. You need something more like a PID controller (at least a PI), see https://en.wikipedia.org/wiki/PID_controller.
Right now you just have the P part. Without at least integral control you get the kind of behavior I am experiencing at the low end.

It might also be that the fan should stay on continuously at a low speed at 39 to avoid this cycling on and off. Or possibly the initial on temp (>39) should be set higher than the off temp (39).
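The second suggestion above (an on temperature higher than the off temperature) is usually called hysteresis. A minimal sketch of the idea follows. It is not ats code: the class name and the 42 °C on threshold are hypothetical, and only the 39 °C off threshold comes from the comment.

```python
class HysteresisSwitch:
    """On/off decision with separate on and off thresholds (hypothetical helper)."""

    def __init__(self, on_temp_c: float, off_temp_c: float) -> None:
        if on_temp_c <= off_temp_c:
            raise ValueError("on_temp_c must be greater than off_temp_c")
        self.on_temp_c = on_temp_c
        self.off_temp_c = off_temp_c
        self.fan_on = False

    def update(self, temp_c: float) -> bool:
        """Return the new fan state for one temperature reading."""
        if temp_c >= self.on_temp_c:
            self.fan_on = True
        elif temp_c <= self.off_temp_c:
            self.fan_on = False
        # Between the two thresholds the previous state is kept,
        # so small fluctuations around one value do not toggle the fan.
        return self.fan_on


# Assumed thresholds: on at 42 C (hypothetical), off at 39 C (from the comment).
switch = HysteresisSwitch(on_temp_c=42.0, off_temp_c=39.0)
readings = (38.5, 40.0, 42.5, 40.0, 39.5, 38.9, 40.0)
states = [switch.update(t) for t in readings]
```

With these readings the fan turns on once at 42.5 and stays on until the temperature drops to 38.9, instead of toggling every time the reading crosses a single threshold.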

To this end I suggest you make it easier to change the algorithm outside the code (a separate function in its own file). That way it's easier for others to write/modify their own algorithm.

I see there are some basic PID implementations out there, even for Lua. Anyway, you should look into this, as a simple proportional controller rarely works well in the real world.
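For reference, a PI controller for a fan can be sketched in a few lines. This is an illustration of the technique only, not what ats implements: the gains `kp` and `ki`, the 40 °C setpoint, and the class name are assumptions, and real gains would have to be tuned on the actual hardware. The output range 0 to 255 matches the PWM range shown elsewhere in this thread.

```python
class PIController:
    """Proportional-integral controller with simple anti-windup (sketch)."""

    def __init__(self, kp: float, ki: float, setpoint_c: float,
                 out_min: float = 0.0, out_max: float = 255.0) -> None:
        self.kp = kp
        self.ki = ki
        self.setpoint_c = setpoint_c
        self.out_min = out_min
        self.out_max = out_max
        self.integral = 0.0  # accumulated error in degree-seconds

    def update(self, temp_c: float, dt_s: float) -> float:
        """Return a PWM value for one temperature sample taken dt_s after the last."""
        # For cooling, the error is positive when the CPU is hotter than the setpoint.
        error = temp_c - self.setpoint_c
        candidate = self.integral + error * dt_s
        output = self.kp * error + self.ki * candidate

        too_high = output > self.out_max
        too_low = output < self.out_min
        # Anti-windup: skip the integration step when it would push the
        # output further into saturation.
        if not (too_high and error > 0) and not (too_low and error < 0):
            self.integral = candidate
        return min(max(output, self.out_min), self.out_max)


# Hypothetical gains and setpoint, chosen only to make the arithmetic easy to follow.
controller = PIController(kp=20.0, ki=2.0, setpoint_c=40.0)
for temp_c in (40.0, 42.0, 42.0, 42.0):
    print(temp_c, controller.update(temp_c, dt_s=1.0))
```

With a constant 2 °C error the proportional term stays fixed while the integral term keeps growing, so the PWM value rises on every sample until the temperature comes back down. That is the behavior a pure P controller lacks.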

dkebler commented 6 years ago

hmm.

sysadmin@nas:/sys/class/thermal/thermal_zone0$ cat /sys/class/thermal/thermal_zone1/temp
45625

but the fan is off

Maybe this is a larger issue with the code. The fan should be on at that temperature.
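For context, Linux thermal zones report `temp` in millidegrees Celsius, so the reading of 45625 corresponds to about 45.6 °C. A small sketch of the conversion (the function names are hypothetical, not part of ats):

```python
def millidegrees_to_celsius(raw: str) -> float:
    """Convert the text read from a thermal_zone 'temp' file to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_zone_temp_c(path: str) -> float:
    """Read a sysfs thermal zone file such as /sys/class/thermal/thermal_zone1/temp."""
    with open(path, encoding="ascii") as zone_file:
        return millidegrees_to_celsius(zone_file.read())


# The value quoted in the comment above.
print(millidegrees_to_celsius("45625\n"))
```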

tuxd3v commented 6 years ago

I see, a PID is a good approach. I understand the concept, and I studied it at university.

You are describing, let's say, "noise" in the values read: one reading above the threshold, the next below, so the fan would be activated and deactivated constantly because of that.

That happens only partially. Because the triggers are time based, the noise doesn't interfere much with the on/off system (switching is time based, not temperature based; but the timers are lengthened or shortened based on temperature, so it does affect the timers). It's rudimentary in a lot of respects, but it consumes less CPU than a PID system.

I was mulling over implementing a global trigger based on the derivative of temperature: if diff(temp) is high, then activate the actual trigger system, something like that. Maybe a complete PID system would be better. I will run some tests and check the processing cost.
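One way to read the "diff(temp)" idea is as a rate-of-change trigger: estimate how fast the temperature is rising over a short window and fire only when the rise exceeds a threshold. The sketch below follows that reading. The class name, window size, and threshold are all hypothetical, and ats may implement something quite different.

```python
from collections import deque


class RateTrigger:
    """Fires when temperature rises faster than a threshold (hypothetical sketch)."""

    def __init__(self, threshold_c_per_s: float, window: int = 5) -> None:
        self.threshold_c_per_s = threshold_c_per_s
        # Keep only the most recent samples as (timestamp_s, temp_c) pairs.
        self.samples = deque(maxlen=window)

    def update(self, t_s: float, temp_c: float) -> bool:
        """Add one sample and report whether the trigger is active."""
        self.samples.append((t_s, temp_c))
        if len(self.samples) < 2:
            return False
        t0, c0 = self.samples[0]
        t1, c1 = self.samples[-1]
        if t1 <= t0:
            return False
        # Average slope across the window, in degrees Celsius per second.
        slope = (c1 - c0) / (t1 - t0)
        return slope >= self.threshold_c_per_s


trigger = RateTrigger(threshold_c_per_s=0.5, window=3)
for t_s, temp_c in ((0.0, 40.0), (1.0, 40.2), (2.0, 41.5)):
    print(t_s, temp_c, trigger.update(t_s, temp_c))
```

Averaging over a window rather than differencing two consecutive readings makes the trigger less sensitive to the reading noise discussed above.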

Yes, that behaviour is how ats works. Even at 60 degrees it will stop the fan temporarily.

Each time the temperature goes up, the fan stop timer gets shorter and the fan run timer gets longer. Each time the temperature goes down, the fan stop timer gets longer and the fan run timer gets shorter.
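The timer behaviour described above can be illustrated with a simple linear mapping from temperature to a run time and a stop time. This is only a sketch of the stated relationship, not the actual ats timers: the function name and all timer values are assumptions, with the low end loosely based on the roughly 30 seconds on and 1 minute off reported at the top of this issue.

```python
def timers_for(temp_c: float,
               min_temp_c: float = 40.0, max_temp_c: float = 60.0,
               run_range_s: tuple = (30.0, 120.0),
               stop_range_s: tuple = (60.0, 10.0)) -> tuple:
    """Return (run_s, stop_s) for a temperature; all defaults are hypothetical."""
    # Position of the temperature inside the managed range, clamped to 0..1.
    fraction = (temp_c - min_temp_c) / (max_temp_c - min_temp_c)
    fraction = min(max(fraction, 0.0), 1.0)
    # Hotter means a longer run timer and a shorter stop timer.
    run_s = run_range_s[0] + fraction * (run_range_s[1] - run_range_s[0])
    stop_s = stop_range_s[0] + fraction * (stop_range_s[1] - stop_range_s[0])
    return run_s, stop_s


for temp_c in (40.0, 50.0, 60.0):
    print(temp_c, timers_for(temp_c))
```

Because the stop timer never reaches zero in this scheme, the fan still pauses briefly even at the top of the range, which matches the remark about 60 degrees.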

tuxd3v commented 6 years ago

Hello, thanks for the feedback.

"It might also be that the fan should stay on continuously at a low speed at 39 to avoid this cycling on and off. Or possibly the initial on temp (>39) should be set higher than the off temp (39).

To this end I suggest you make it easier to change the algorithm outside the code (a separate function in its own file). That way it's easier for others to write/modify their own algorithm."

In the initial versions I had the fan start above 40 °C, but then some users complained and asked for it to start at 35.

I started it at 39, but I also acknowledge that 39 is almost the minimum at which the CPU sits when idle. So 40-41 °C would be a better starting point.

Even with the current algorithm, I can create an /etc/ats.conf with the object configurations that PROFILE needs.

At the moment that would be the preset start/stop timers and the functions to calculate PWM(TEMP).

But the API needs to stay the same; otherwise ats will crash, because it expects certain variables in the ats structure types.

tuxd3v commented 5 years ago

Hello dkebler, it's now easy for users to adjust the PWM and temperature limits in /etc/ats.conf. I have already ported all, or almost all, functionality to a C backend, so the next releases will target a better control system for the fan and expanded support for sysVinit systems. :)

werner69rock commented 5 years ago

Hello tuxd3v, if I change MIN_CONTINUOUS_THERMAL_TEMP to any other value, my PWM_CTL in "/sys/class/hwmon/hwmon0/pwm1" permanently changes to 190, and when I stop the ATS service, it stops successfully but the fan keeps turning. After that I have to reinstall ats and luarocks so that PWM_CTL works again. I am able to change MAX_PWM, MIN_PWM and ALWAYS_ON without any problems. I use the 4.4.167-1213-rockchip-ayufan kernel.

tuxd3v commented 5 years ago

Hello @werner69rock,

you can check the output of:

journalctl -u ats

or after starting the service:

ats --test

it will print the configurations it has and exit.

werner69rock commented 5 years ago

Thx for the fast answer, here are my outputs:

root@ROCK-NAS:~# journalctl -u ats
-- Logs begin at Mon 2019-06-24 01:49:38 CEST, end at Mon 2019-06-24 09:37:01 CEST. --
Jun 24 01:51:51 ROCK-NAS systemd[1]: Started ATS - Active Thermal Service.
Jun 24 01:52:45 ROCK-NAS systemd[1]: Stopping ATS - Active Thermal Service...
Jun 24 01:52:45 ROCK-NAS systemd[1]: Stopped ATS - Active Thermal Service.
Jun 24 01:52:46 ROCK-NAS systemd[1]: Started ATS - Active Thermal Service.
Jun 24 01:53:51 ROCK-NAS systemd[1]: Stopping ATS - Active Thermal Service...
Jun 24 01:53:51 ROCK-NAS systemd[1]: Stopped ATS - Active Thermal Service.
Jun 24 01:53:51 ROCK-NAS systemd[1]: Started ATS - Active Thermal Service.

root@ROCK-NAS:~# ats --test
info:'SYSTEM' Table
info: 'BOARD' Table
info: 'NAME' = ROCKPRO64
info: 'CPU' = RK3399
info: 'THERMAL0_CTL' = /sys/class/thermal/thermal_zone0/temp
info: 'THERMAL1_CTL' = /sys/class/thermal/thermal_zone1/temp
info: 'PWM_CTL' = /sys/class/hwmon/hwmon0/pwm1
info: 'MAX_CONTINUOUS_THERMAL_TEMP' = 60
info: 'MIN_CONTINUOUS_THERMAL_TEMP' = 40
info: 'MAX_PWM' = 255
info: 'MIN_PWM' = 150
info: 'ALWAYS_ON' = true
info: 'PROFILE_NAME' = profile0
info: 'PROFILE' = 0
info:'Pratio' timers
info: 'Pratio[ -20 - 40 [' = 0
info: 'Pratio[ 40 ]' = 150
info: 'Pratio[ 41 ]' = 155
info: 'Pratio[ 42 ]' = 160
info: 'Pratio[ 43 ]' = 165
info: 'Pratio[ 44 ]' = 171
info: 'Pratio[ 45 ]' = 176
info: 'Pratio[ 46 ]' = 181
info: 'Pratio[ 47 ]' = 186
info: 'Pratio[ 48 ]' = 192
info: 'Pratio[ 49 ]' = 197
info: 'Pratio[ 50 ]' = 202
info: 'Pratio[ 51 ]' = 207
info: 'Pratio[ 52 ]' = 213
info: 'Pratio[ 53 ]' = 218
info: 'Pratio[ 54 ]' = 223
info: 'Pratio[ 55 ]' = 228
info: 'Pratio[ 56 ]' = 234
info: 'Pratio[ 57 ]' = 239
info: 'Pratio[ 58 ]' = 244
info: 'Pratio[ 59 ]' = 249
info: 'Pratio[ 60 ]' = 255
info: 'Pratio[ 60 - 70 [' = 255
Stop ATS Service first [ service ats stop ]..
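The Pratio values printed by `ats --test` are consistent with a linear interpolation from MIN_PWM at MIN_CONTINUOUS_THERMAL_TEMP to MAX_PWM at MAX_CONTINUOUS_THERMAL_TEMP, rounded down. The sketch below reproduces the printed numbers. It is an observation about this output, not a statement of how ats computes the table internally, and the function name is hypothetical.

```python
def pratio(temp_c: int, min_temp_c: int = 40, max_temp_c: int = 60,
           min_pwm: int = 150, max_pwm: int = 255) -> int:
    """PWM for a whole-degree temperature, matching the table printed above."""
    if temp_c < min_temp_c:
        return 0  # the 'Pratio[ -20 - 40 [' row
    if temp_c >= max_temp_c:
        # The 'Pratio[ 60 - 70 [' row; behavior at 70 and above is an assumption.
        return max_pwm
    # Floored linear interpolation between (min_temp_c, min_pwm) and (max_temp_c, max_pwm).
    return min_pwm + (temp_c - min_temp_c) * (max_pwm - min_pwm) // (max_temp_c - min_temp_c)


for temp_c in (39, 40, 44, 50, 60):
    print(temp_c, pratio(temp_c))
```

This also shows why the fan jumps straight from 0 to 150 at 40 °C: with these settings there is no PWM value between off and MIN_PWM.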

tuxd3v commented 5 years ago

Hello @werner69rock, even with the service stopped, the fan can continue to spin. That is not a problem: it stays in the last state it had when ats exited. If the PWM is 200 it stays at 200; if it is 40, it stays at 40.

The reason is that if the fan stops being managed because the user intervened, it should stay in the last state it had when ATS exited. If the fan were switched off whenever you stop ats, it would then stay off permanently, since nothing would be managing it.

But the behaviour you are describing is weird. It should adjust the fan according to the temperature. If you stop the service and test manually, like this:

systemctl stop ats
ats --test

and then stress the CPU, while running this in another open shell:

while true;do pwm=$( cat /sys/devices/platform/pwm-fan/hwmon/hwmon0/pwm1);if [ ${pwm} -ne 0 ];then echo "--->${pwm}"; fi; done

This way you can check whether the PWM changes when the temperature goes up.