tuxd3v / ats


Issue: Fan driving up every 120 s for 1 second #28

Open thomas-mc-work opened 3 years ago

thomas-mc-work commented 3 years ago

I'm controlling the NAS case fan with ats. After I start the service, the fan starts every 120 seconds, so I can hear it spinning up, and then it stops immediately. I can see that there is no load on the system, and the temperatures are below my configured threshold. What could be the cause of this? How can I provide more debug info?

Here is my ats -t output:

info:'SYSTEM' Table 
info:    'BOARD' Table           
info:        'NAME' = ROCKPRO64   
info:        'CPU'  = RK3399      
info:    'THERMAL0_CTL' = /sys/class/thermal/thermal_zone0/temp
info:    'THERMAL1_CTL' = /sys/class/thermal/thermal_zone1/temp
info:    'PWM_CTL'      = /sys/devices/platform/pwm-fan/hwmon/hwmon3/pwm1
info:    'MAX_CONTINUOUS_THERMAL_TEMP' = 60
info:    'MIN_CONTINUOUS_THERMAL_TEMP' = 45
info:    'MAX_PWM' = 255           
info:    'MIN_PWM' = 20            
info:    'ALWAYS_ON' = false       
info:    'PROFILE_NAME' = profile0 
info:    'PROFILE'      = 2        
info:'Pratio' timers               
info:    'Pratio[ -20 - 45 [' = 0  
info:    'Pratio[ 45 ]'       = 20 
info:    'Pratio[ 46 ]'       = 35 
info:    'Pratio[ 47 ]'       = 51 
info:    'Pratio[ 48 ]'       = 67      
info:    'Pratio[ 49 ]'       = 82      
info:    'Pratio[ 50 ]'       = 98      
info:    'Pratio[ 51 ]'       = 114       
info:    'Pratio[ 52 ]'       = 129
info:    'Pratio[ 53 ]'       = 145     
info:    'Pratio[ 54 ]'       = 161      
info:    'Pratio[ 55 ]'       = 176      
info:    'Pratio[ 56 ]'       = 192     
info:    'Pratio[ 57 ]'       = 208
info:    'Pratio[ 58 ]'       = 223       
info:    'Pratio[ 59 ]'       = 239      
info:    'Pratio[ 60 ]'       = 255      
info:    'Pratio[ 60 - 70 ['  = 255
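For readers following along: the Pratio values in the listing above are consistent with a plain linear ramp from MIN_PWM to MAX_PWM across the configured temperature window, with the result truncated to an integer. The sketch below reproduces the numbers under that assumption. The function name pratio_pwm and its parameters are hypothetical and not taken from the ats sources.

```c
#include <assert.h>

/* Hypothetical helper (not from the ats sources): maps a temperature in
 * whole degrees Celsius to a PWM duty value in the range 0-255. */
int pratio_pwm(int temp_c, int min_temp, int max_temp,
               int min_pwm, int max_pwm)
{
    assert(max_temp > min_temp);

    if (temp_c < min_temp)
        return 0;           /* below the window: fan off */
    if (temp_c >= max_temp)
        return max_pwm;     /* at or above the window: full speed */

    /* Linear ramp; integer division truncates, which matches the listing. */
    return min_pwm + (max_pwm - min_pwm) * (temp_c - min_temp)
                     / (max_temp - min_temp);
}
```

With the settings above (45 to 60 °C, PWM 20 to 255), pratio_pwm(46, 45, 60, 20, 255) gives 35 and pratio_pwm(52, 45, 60, 20, 255) gives 129, as in the listing.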

Thanks for looking into this!

tuxd3v commented 3 years ago

Hello,

info: 'MIN_PWM' = 20

The minimum PWM is too low.

For a 40x40x10 mm fan, the minimum PWM should be around 40 or higher. For a case fan, the value probably needs to be higher still. All fans behave differently, so you need to test with different, larger minimum PWM values.

If the fan keeps behaving that way, keep increasing MIN_PWM until it starts spinning and keeps spinning at some acceptable minimum speed. :)

thomas-mc-work commented 3 years ago

Thank you for your reply!

I've increased the value up to 120, but unfortunately it didn't help.

possebaer commented 3 years ago

I just installed ats on my rockpro64 nas and had a very similar phenomenon.

For my configuration I changed MIN_CONTINUOUS_THERMAL_TEMP to 45. My NAS sits around 43-44 °C most of the time. I suspected that small temperature spikes or something similar were the reason for the short spin-up of the fan.

I watched the pwm settings and the temperature using this small endless loop in bash:

while true; do sleep 1; cat /sys/devices/platform/pwm-fan/hwmon/hwmon3/pwm1; cat /sys/class/thermal/thermal_zone0/temp; cat /sys/class/thermal/thermal_zone1/temp; done

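When the fan spun up, I saw the following output (each line shows pwm1, thermal_zone0 and thermal_zone1, in that order; the temperatures are in millidegrees Celsius):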

0 43888 45000

190 43888 45000

40 43888 45000

40 43888 45000

And this pattern repeated: every time the pwm setting changed from 0 to the minimum value because the temperature crossed the minimum threshold, there was a short peak in the pwm setting. Here we saw 190; in some other instances I saw 130 and 190.
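To make the readings above concrete: the sysfs thermal zones report millidegrees Celsius, so 43888 is 43.888 °C and 45000 is exactly 45 °C, which sits right on a threshold of 45. A minimal sketch of that comparison follows. It assumes the hotter of the two zones decides and that the value is truncated to whole degrees; both are assumptions about ats, and the function names are made up for illustration.

```c
#include <stdbool.h>

/* sysfs thermal zones report millidegrees Celsius, e.g. 43888 = 43.888 C. */
int millideg_to_deg(long millideg)
{
    return (int)(millideg / 1000);   /* truncates toward zero */
}

/* Assumption: the hotter zone is compared against the minimum temperature. */
bool fan_should_start(long zone0_millideg, long zone1_millideg, int min_temp_c)
{
    long hottest = zone0_millideg > zone1_millideg ? zone0_millideg
                                                   : zone1_millideg;
    return millideg_to_deg(hottest) >= min_temp_c;
}
```

Under these assumptions, the readings 43888 and 45000 with a minimum of 45 trigger a start, while 43888 and 44999 do not, so a board idling at 43-44 °C only needs a one-degree wobble to set the fan off.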

And then I looked into the code, and I guess the reason is in ats.c, in the function setPwm( unsigned char value ). There is this to find:

    if( ! pwm ) {
        /* When stopped, it needs more power to start...give him 0.2 seconds to rotate poles a bit, so that would be better for aplying bigger push,
         * In This Way, initial peak current needed to start fan is lower.. */
        /* to force recursion, and update PWM, in case of fail PWM is set to zero bellow, so that it will try again, in second call .. */
        pwm = 1;
        setPwm( 130 );
        usleep( 200000 );
        setPwm( 190 );
        sleep( 1 );
    }

I am not so sure that this is really necessary. Spinning the fan up to a high speed initially makes the current peak, rather than reducing the peak as you wanted to achieve. If the fan does not spin up at all with a pwm value of 40, that should be no problem: the temperature will not be reduced, so it will keep increasing (to some extent this gives you the missing integral part in your loop), and the pwm setting will increase accordingly until it is large enough for the fan to spin up.

For that reason I would just remove this functionality, and get rid of such problems without loss of any functionality.

tuxd3v commented 2 years ago

hello thomas-mc-work, possebaer,

Each time MIN_CONTINUOUS_THERMAL_TEMP is reached, the fan will spin. With MIN_CONTINUOUS_THERMAL_TEMP = 45, the fan will keep starting for short periods (when the CPU is idle, because the temperature keeps floating just below and above that point).

Try setting the minimum to a value greater than 45 °C, because the RK3399 heats up a bit even when idle, and in an enclosure it's a bit worse.

Check several values above 45 °C; start with 46-48 °C, for example.
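A different way to deal with a temperature that floats around one threshold is hysteresis: start the fan at one temperature and stop it only below a lower one. This is not what ats does as far as this thread shows; it is a generic sketch, and the type and function names are hypothetical.

```c
#include <stdbool.h>

typedef struct {
    int  on_temp_c;    /* start the fan at or above this temperature */
    int  off_temp_c;   /* stop the fan only below this temperature   */
    bool running;      /* current fan state                          */
} fan_hysteresis;

/* Feeds one temperature reading in and returns the resulting fan state. */
bool hysteresis_update(fan_hysteresis *h, int temp_c)
{
    if (!h->running && temp_c >= h->on_temp_c)
        h->running = true;
    else if (h->running && temp_c < h->off_temp_c)
        h->running = false;
    return h->running;
}
```

With on_temp_c = 45 and off_temp_c = 42, a reading that wobbles between 44 and 45 starts the fan once and then leaves it running, instead of producing a new start on every crossing.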

Another thing: you should set MIN_PWM to a value that definitely makes the fan spin. Yeah, 120 should work for sure :). Maybe even values of 60-80 would work (depending on the fan type and size), but it's better to test whether the fan really keeps spinning for some time. If the fan starts spinning and then stops, or only moves a bit, the values are not high enough to make it spin, so set bigger ones.

The observation that possebaer made above is correct. However, the one dangerous thing about motors is this: if you apply power but the motor doesn't spin, it acts like a short circuit and drains power continuously, degrading the motor and the electronics, and in the worst case damaging components.

That can happen if the PWM value is very low. Initially, when the fan is stopped, it requires a bit more power to start running, so we start it in two steps: PWM=130 for 0.2 seconds, then PWM=190 for 1 second. After that start sequence the fan is considered to be spinning, and we apply the user-defined PWM values. That's the explanation for that :)

thomas-mc-work commented 2 years ago

Thank you for your detailed reply! I'll try to test that as soon as possible.

thomas-mc-work commented 2 years ago

I'm sorry to say that this is still happening :-( But less often than before.

Maybe it would help to wait for a second trigger event to prevent false positives before starting to spin up?
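The idea of waiting for a second trigger could look roughly like the sketch below: the fan is only started after a number of consecutive readings at or above the threshold, and a single cooler reading resets the count. This illustrates the suggestion only; it is not code from ats, and all names are hypothetical.

```c
#include <stdbool.h>

typedef struct {
    int threshold_c;   /* temperature that counts as a trigger */
    int required;      /* consecutive triggers needed to start */
    int seen;          /* consecutive triggers seen so far     */
} start_debounce;

/* Feeds one reading in; returns true once enough triggers came in a row. */
bool debounce_should_start(start_debounce *d, int temp_c)
{
    if (temp_c >= d->threshold_c) {
        if (d->seen < d->required)
            d->seen++;
    } else {
        d->seen = 0;   /* a single cool reading resets the count */
    }
    return d->seen >= d->required;
}
```

With required = 2, the sequence 45, 44, 45, 45 only reports a start on the last reading, so an isolated one-sample spike would be ignored.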

asgeirrr commented 1 year ago

Hey, I think I hit the same issue and hopefully, I have a solution in this PR: https://github.com/tuxd3v/ats/pull/32

thomas-mc-work commented 1 year ago

Sadly it's still happening :-(

tuxd3v commented 1 year ago

> Sadly it's still happening :-(

Hello thomas, I never hit that problem. Maybe that's because I tested ATS 24/7 for 1.5 years, always running in On/Off mode, and because CPU temps were always around 58 °C, so the problem never manifested.

I also made a commit after @asgeirrr's. Can you build a new version and see if the problem is solved? I believe it was solved in the last 2 commits.

Thanks in advance.
Regards, tux

thomas-mc-work commented 1 year ago

> I never hit that problem, maybe because I tested ATS 24/7 during 1.5 years, always running in On/Off mode, and maybe because cpu temps were always around 58C, that problem never manifested..

No blame on you or any other author of this project. It's a great piece of software that seems to work for most users.

> I made also a commit after @asgeirrr, can you build a new version and see if the problem is solved? I believe it was solved in the last 2 commits, can you check?

Tried it, and sadly it's still happening :-(

tuxd3v commented 1 year ago

Hello thomas, many thanks for testing :) 👍 Sadly, I see it's still occurring :(

In my view, after the last commits, your fan wants to start, but after the 1 s pre-start phase the applied PWM is too low, so it stops. I would set MIN_PWM = 80, or 130 for a large fan. As soon as the value is high enough, the fan will continue to spin (the first second is an initial push for the fan to gain momentum), so a big fan needs a higher value to keep spinning.

What is your current MIN_PWM?

Best regards, tux

thomas-mc-work commented 1 year ago

My MIN_PWM was 40. I'll test it with 80.

thomas-mc-work commented 1 year ago

Still the same effect:

# ats -t
info:'SYSTEM' Table
info:    'BOARD' Table
info:        'NAME' = ROCKPRO64
info:        'CPU'  = RK3399
info:    'THERMAL0_CTL' = /sys/class/thermal/thermal_zone0/temp
info:    'THERMAL1_CTL' = /sys/class/thermal/thermal_zone1/temp
info:    'MAX_CONTINUOUS_THERMAL_TEMP' = 60        
info:    'MIN_CONTINUOUS_THERMAL_TEMP' = 40
info:    'MAX_PWM' = 255                                       
info:    'MIN_PWM' = 80
info:    'ALWAYS_ON' = false                        
info:    'PROFILE_NAME' = profile0
info:    'PROFILE'      = 0                                 
info:'Pratio' timers                                                                                                                  
info:    'Pratio[ -20 - 40 [' = 0                                                                                                    
info:    'Pratio[ 40 ]'       = 80                       
info:    'Pratio[ 41 ]'       = 88
info:    'Pratio[ 42 ]'       = 97
info:    'Pratio[ 43 ]'       = 106
info:    'Pratio[ 44 ]'       = 115
info:    'Pratio[ 45 ]'       = 123
info:    'Pratio[ 46 ]'       = 132
info:    'Pratio[ 47 ]'       = 141
info:    'Pratio[ 48 ]'       = 150                            
info:    'Pratio[ 49 ]'       = 158                            
info:    'Pratio[ 50 ]'       = 167                                                                                                  
info:    'Pratio[ 51 ]'       = 176        
info:    'Pratio[ 52 ]'       = 185        
info:    'Pratio[ 53 ]'       = 193
info:    'Pratio[ 54 ]'       = 202
info:    'Pratio[ 55 ]'       = 211
info:    'Pratio[ 56 ]'       = 220
info:    'Pratio[ 57 ]'       = 228
info:    'Pratio[ 58 ]'       = 237
info:    'Pratio[ 59 ]'       = 246
info:    'Pratio[ 60 ]'       = 255
info:    'Pratio[ 60 - 70 ['  = 255
Stopping for[ seconds ]............... 3
CPU Temperature[ max 70 °C ].......... 0
GPU Temperature[ max 70 °C ].......... 0
Fan PWM Duty Cycle value[ 0 - 255 ]... 190
--------------------               
Running for[ seconds ]................ 10
CPU Temperature[ max 70 °C ].......... 42
GPU Temperature[ max 70 °C ].......... 41
Fan PWM Duty Cycle value[ 0 - 255 ]... 97
--------------------               
Stopping for[ seconds ]............... 90
CPU Temperature[ max 70 °C ].......... 42
GPU Temperature[ max 70 °C ].......... 41
Fan PWM Duty Cycle value[ 0 - 255 ]... 0
--------------------               
Running for[ seconds ]................ 10
CPU Temperature[ max 70 °C ].......... 42
GPU Temperature[ max 70 °C ].......... 41
Fan PWM Duty Cycle value[ 0 - 255 ]... 97
--------------------               
Stopping for[ seconds ]............... 90
CPU Temperature[ max 70 °C ].......... 42
GPU Temperature[ max 70 °C ].......... 41                                                                                            
Fan PWM Duty Cycle value[ 0 - 255 ]... 0
--------------------            
Running for[ seconds ]................ 10
CPU Temperature[ max 70 °C ].......... 42
GPU Temperature[ max 70 °C ].......... 41
Fan PWM Duty Cycle value[ 0 - 255 ]... 97                                                                                            
--------------------
Stopping for[ seconds ]............... 90
CPU Temperature[ max 70 °C ].......... 42
GPU Temperature[ max 70 °C ].......... 41
Fan PWM Duty Cycle value[ 0 - 255 ]... 0
--------------------

tuxd3v commented 1 year ago

thomas, I would test with big values. For example, I would start with 190; if that value is enough to keep the fan spinning, I would then test 160, and so forth, until I find a value where your fan doesn't spin any more. Your MIN_PWM would then be the last working value.

For example, I am using a 40x40x10 mm fan, and for this fan the safest MIN_PWM is 40. Bigger fans will need more power, but it also depends on the fan itself: fans of the same size don't necessarily need the same amount of power. They all differ from fan to fan, vendor to vendor, etc.
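The top-down search described above can be written down as a small routine. This is only a sketch: the check for whether the fan keeps spinning is a hypothetical callback (in practice a person watching the fan, or a tachometer if the fan has one), and example_fan_spins with its stall point of 70 is a made-up stand-in for a real fan.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: true if the fan keeps spinning at this PWM value. */
typedef bool (*spins_fn)(int pwm);

/* Walks down from `start` in steps of `step` and returns the last value at
 * which the fan still spun, or -1 if it did not spin even at `start`. */
int find_min_pwm(int start, int step, spins_fn keeps_spinning)
{
    assert(step > 0);

    int last_working = -1;
    for (int pwm = start; pwm > 0; pwm -= step) {
        if (!keeps_spinning(pwm))
            break;              /* first failing value: stop searching */
        last_working = pwm;
    }
    return last_working;
}

/* Stand-in for a real fan: pretends the fan stalls below 70 (made-up value). */
bool example_fan_spins(int pwm)
{
    return pwm >= 70;
}
```

Starting at 190 with a step of 30, the search tries 190, 160, 130, 100, 70 and 40, fails at 40, and reports 70 as the value to use for MIN_PWM. A smaller step around the last working value would narrow it down further.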