cyoung / stratux

Aviation weather and traffic receiver based on RTL-SDR.
BSD 3-Clause "New" or "Revised" License

fancontrol deadlock #663

Open cyoung opened 6 years ago

cyoung commented 6 years ago
  1. Stratux version: v1.4r2

  2. Stratux config:

    SDR

    • [ ] single
    • [x] dual

    GPS

    • [x] yes
    • [ ] no type:

    AHRS

    • [x] yes
    • [ ] no

    power source: EasyAcc 6000 mAh

    usb cable: integrated cable

  3. EFB app and version: N/A

    EFB platform: N/A

    EFB hardware: N/A

  4. Description of your issue: fancontrol daemon deadlocks.

If possible, enable "Replay Logs", reproduce the problem, and provide a copy of the logs in http://192.168.10.1/logs/stratux/ and http://192.168.10.1/logs/stratux.log.

Snowflake6 commented 6 years ago

FYI - I may be observing this problem in 1.4r3. My fan does briefly run at power-up, but since then it is not turning on. CPU up to 60C and no fan activity after an hour's running.

cyoung commented 6 years ago

Thanks, I think it might be some deadlock on reading /sys/class/thermal/thermal_zone0/temp. Going to leave a unit running for a while to try and reproduce it, then maybe strace to figure out what the process is doing.
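For anyone wanting to test the blocked-read theory outside the daemon, here is a minimal sketch in Python (not the fancontrol code itself). It reads the same sysfs file in a worker thread and gives up after a timeout. The sysfs value is in millidegrees Celsius. The function names and the 2-second timeout are assumptions chosen for illustration.

```python
import concurrent.futures
from pathlib import Path

# Path taken from the comment above; the timeout below is an arbitrary assumption.
THERMAL_PATH = "/sys/class/thermal/thermal_zone0/temp"


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs thermal reading (millidegrees Celsius) to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp(path: str = THERMAL_PATH, timeout_s: float = 2.0) -> float:
    """Read the CPU temperature in a worker thread.

    Raises concurrent.futures.TimeoutError if the read has not returned
    within timeout_s, which is the symptom a blocked read would produce.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(lambda: Path(path).read_text())
        raw = future.result(timeout=timeout_s)
    finally:
        # Do not wait for a possibly hung worker before returning.
        pool.shutdown(wait=False)
    return parse_millidegrees(raw)
```

A timeout here would point at the read itself. A normal return every time would suggest looking elsewhere, for example at the PWM side.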

Snowflake6 commented 6 years ago

Further update - I've rebooted a few times today since, and it seems to be running smoothly now. I didn't change anything on the hardware side. The software side has changed slightly as I installed the updated OLED drivers but I don't expect that should have any effect on the fan operation.

cyoung commented 6 years ago

Keep looking for it. I got it to happen once. Seems rare.

Snowflake6 commented 6 years ago

Last night I left it running in my window (trying to get my VK172 to lock in). This morning the temp was up to 60C and the fan wasn't running. I unplugged it and brought it to the office, and when I powered it up here the fan spun up and down as usual, then did nothing until the temp got up over 55C... After running for half an hour or more like that, the fan suddenly started working and has been ramping up and down with temperature for the last couple of hours.

So I guess there's something flaky in there but I'm not sure what yet. FWIW, this is running on my older "dead bug" circuit for controlling the fan. I have an AHRS/Fan CTL board on order and will swap it in when it arrives (along with a (hopefully) more stable GPYes).

cyoung commented 6 years ago

Monitor with: https://github.com/cyoung/stratux/commit/cc4127df8047aedf9c6f2a8a2d478ec871d523ad.

cyoung commented 6 years ago

Not fixed. Seems to be a wiringPi bug.

cyoung commented 6 years ago

@d-hoke - what does http://192.168.10.1:9977 look like when the fan is not running (but CPU temp is >50ºC)?

cyoung commented 6 years ago

@d-hoke -- another test, http://updates.stratux.me/builds/update-stratux-v1.4r5-c0127928af.sh

Added a failsafe temperature (for this test, once it reaches 65ºC it will give up using PWM). The theory is that wiringPi stops responding to PWM commands at a certain point.
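The failsafe idea can be sketched roughly as follows. This is an illustrative Python model, not the code in the test build. The 65ºC threshold comes from the comment above, and the 50ºC target and the 1-100 duty range match the values the status page reports later in this thread. The step size, the simple ramp, and the reading of "give up using PWM" as "drive the fan fully on" are all assumptions.

```python
def fan_output(temp_c, duty, target_c=50.0, failsafe_c=65.0,
               duty_min=1, duty_max=100, step=10):
    """Return (mode, duty) for one control interval.

    mode is "pwm" while the controller is still modulating the fan, or
    "full_on" once the failsafe temperature is reached (assumed here to
    mean bypassing PWM and driving the fan pin continuously).
    """
    if temp_c >= failsafe_c:
        # Failsafe: stop trusting PWM and run the fan flat out.
        return ("full_on", duty_max)
    if temp_c > target_c:
        # Too warm: ramp the duty cycle up, capped at the maximum.
        duty = min(duty_max, duty + step)
    else:
        # At or below target: ramp back down, floored at the minimum.
        duty = max(duty_min, duty - step)
    return ("pwm", duty)
```

If the fan runs in the "full_on" state but not under PWM, that would be consistent with the theory that the PWM path is what stops responding.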

d-hoke commented 6 years ago

1 of 3) I will try to check :9977 when/if I can reproduce this (not doing too much with the unit right now; it's mostly just benched, accruing runtime hours), but I can periodically restart it to see if the problem reoccurs. Initial brief attempts show a 'delay' (see 2 of 3), but when the duty reaches about 70 the fan begins to operate (and I think that probably happens within about 6-7 30-second intervals, if I recall the code correctly). (With the earlier failures I let it go way longer than that; the length of time is probably documented in the other, now closed, issue.)

2 of 3) I have previously observed that on my unit(s) (I only have access to one at the moment), the fan does not seem to operate until the PWM duty is at 60-70. On 'this' unit, it seems to hover between 70-80. (I think on one of the others it would start to operate at about 550 when I was playing with it via the gpio utility (range 1-1024).)

3 of 3) I don't know when/if I'll have a chance to try your failsafe test build.

d-hoke commented 6 years ago

Obtained a failure. When found (fan not running), :9977 was checked and showed:

{"TempTarget":50,"TempCurrent":65.528,"PWMDutyMin":1,"PWMDutyMax":100,"PWMDutyCurrent":100,"PWMPin":1}

(The above was manually typed (circumstances make direct copy difficult), but should be mostly correct.) (At the next check after entering the above, the temp was up to 66.066.)

(This might be important... FWIW, this was after a soft 'reboot' via the web interface. I hadn't previously made any association and don't know if it could be related or not; all other (multiple) attempts prior this morning were power-off/power-on restarts.)

ssh'd in...

systemctl status fancontrol
- reports 'active (running)' (other stuff)

systemctl stop fancontrol
ps -A | grep fancontrol
kill -9 thatpid
gpio pwm 1 500
gpio pwm 1 900
- fan did NOT start operating (in attempts weeks earlier it DID start)

gpio readall
- the 'V' column for pin 1 shows zero ('0') (I think that's not good, right?)

gpio mode 1 pwm
- the fan immediately started... not sure what this would indicate...

gpio readall
- the 'V' column for pin 1 shows one ('1') (but the service is currently down, so this is just from the gpio utility. The important part, I'd guess, is that setting the PWM value did not work here until after I requested that the mode be set to PWM. Any possibility that PWM mode for that pin is somehow being 'lost'?)

systemctl start fancontrol
- apparently started and took control of the pin... watching...
- :9977 showed PWMDuty up to 60; at that point I could hear an audible frequency (PWM at a level for my ears?) but the fan was not running. It went to 70, fan still not running. It went to 90 while I was typing this, and the fan had started running before that check... It went to 100 with the fan running hard (temp was still @ 51.1 but apparently dropping).

cyoung commented 6 years ago

Did you do this test with the updated version?

d-hoke commented 6 years ago

No. (sorry, not having done an update yet, that's unfamiliar territory to me and would require extra time, stop/starts/reboots I can fit in, as well as risk of 'bricking' unit currently generally functioning for my development purposes)

d-hoke commented 6 years ago

on a subsequent soft restart fancontrol was functioning

CraXgt commented 6 years ago

Same observation here on 1.4r5

{"TempTarget":50,"TempCurrent":69.294,"PWMDutyMin":1,"PWMDutyMax":10,"PWMDutyCurrent":4,"PWMPin":1}

No spinning fan, although PWMDutyCurrent reads out a value correctly.

Edit: After reboot, it seems to work correctly.
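The status lines pasted in this thread can be checked mechanically. Below is a small Python sketch that parses one such JSON line and flags readings well above the target temperature. The field names are the ones shown in the reports above. The function name and the 10ºC margin are assumptions, and the status page cannot tell whether the fan is physically spinning, so a flag here only means "go and look at the fan".

```python
import json


def summarize_status(raw, margin_c=10.0):
    """Summarize one status line from the :9977 page.

    margin_c is an arbitrary assumption: how far above TempTarget the
    reading must be before it is treated as suspicious.
    """
    status = json.loads(raw)
    over_by = status["TempCurrent"] - status["TempTarget"]
    return {
        # Degrees Celsius above the target temperature.
        "over_target_c": round(over_by, 3),
        # Whether the controller is already commanding its maximum duty.
        "duty_at_max": status["PWMDutyCurrent"] >= status["PWMDutyMax"],
        # Hot enough, for long enough to be reported, to warrant a look.
        "suspect": over_by >= margin_c,
    }
```

Applied to the two reports above, both readings come out as suspect, but only the first shows the duty already at its maximum.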

fast240z commented 5 years ago

I am experiencing issues with fancontrol deadlocking under the latest master build. The fan runs at boot-up and will run for a short period if I kill the process and run fancontrol manually. The behavior occurs regardless of reboot.

EDIT - the issue appears to exist in the v1.4r5 img, but does not exist when flashing a previous version of stratux and updating to v1.4r5 via the .sh file. I installed v1.4r3 and verified the fan worked properly (came on at 55C), then updated to v1.4r5 and verified the fan still worked fine. When flashing v1.4r5 directly from the img, the fan consistently fails to work.

craytron commented 5 years ago

First post here and I don't get github yet. I have had fan lock problems since day one on my system described below. I think this issue is still open with the last comment on Aug 31 I assume this year (2018).

Pi 3B, Version 1.4r4, AHRS, no GPS, single external antenna via Stratus ESG transponder, good power (steady red), Stratux mounted in glove box. The fan burst is normal on all boot-ups, but there is only a 50% chance that the fan controller actually works. A UI reboot will usually restore the fan controller. Sometimes 2 reboots are required. Looking to help if I can, or maybe this issue is closed and a fix is available.