andrewjfreyer / monitor

Distributed advertisement-based BTLE presence detection reported via mqtt

./support/btle: line 356: printf: write error: Broken pipe #236

Open mschwiet opened 5 years ago

mschwiet commented 5 years ago

Started the monitor service as a daemon on an RPi 3B+ with the -V option, trying to track 2 Android phones. Monitor completes the initial scan and finds the phones; however, it appears that no further scans are completed. In the log below, monitor was started at ~10:45. Both phones left the area at ~16:00 and returned at ~21:00. As you can see, there is no activity until the error noted above. (MAC addresses removed.)

Is this a program error or my error?

Output from journalctl -u monitor -r:

Aug 25 01:19:42 autopi.home.lan bash[7308]: ./support/btle: line 356: printf: write error: Broken pipe
Aug 24 10:45:01 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:45:01 AM [CMD-INFO]        **** completed arrival scan ****
Aug 24 10:44:58 autopi.home.lan bash[7308]:  }
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "version":"0.2.197"
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "timestamp":"Sat Aug 24 2019 10:44:57 GMT-0400 (EDT)",
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "retained":"false",
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "type":"KNOWN_MAC",
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "manufacturer":"zte corporation",
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "name":"sherry_phone ",
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "confidence":"100",
Aug 24 10:44:58 autopi.home.lan bash[7308]:     "id":"XX:XX:XX:XX:XX:XX",
Aug 24 10:44:58 autopi.home.lan bash[7308]:  {
Aug 24 10:44:58 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:44:58 AM [CMD-MQTT]        monitor/autopi.home.lan/sherry_phone
Aug 24 10:44:58 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:44:58 AM [CMD-NAME]        XX:XX:XX:XX:XX:XX SHERRY PHONE ZTE  zte corporation
Aug 24 10:44:57 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:44:57 AM [CMD-SCAN]        (No. 1) XX:XX:XX:XX:XX:XX arrival?
Aug 24 10:44:55 autopi.home.lan bash[7308]:  }
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "version":"0.2.197"
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "timestamp":"Sat Aug 24 2019 10:44:54 GMT-0400 (EDT)",
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "retained":"false",
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "type":"KNOWN_MAC",
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "manufacturer":"LG Electronics Mobile Communications",
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "name":"mark_phone ",
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "confidence":"100",
Aug 24 10:44:55 autopi.home.lan bash[7308]:     "id":"XX:XX:XX:XX:XX:XX",
Aug 24 10:44:55 autopi.home.lan bash[7308]:  {
Aug 24 10:44:55 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:44:55 AM [CMD-MQTT]        monitor/autopi.home.lan/mark_phone
Aug 24 10:44:54 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:44:54 AM [CMD-NAME]        XX:XX:XX:XX:XX:XX MarkV20  LG Electronics Mobile Communications
Aug 24 10:44:53 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:44:53 AM [CMD-SCAN]        (No. 1) XX:XX:XX:XX:XX:XX arrival?
Aug 24 10:44:53 autopi.home.lan bash[7308]: [+] 0.2.197 24-08-2019 10:44:53 AM [CMD-INFO]        **** started arrival scan [x1 max rep] ****
Aug 24 10:44:52 autopi.home.lan bash[7308]: > beacon database time trigger pid = 7403
Aug 24 10:44:52 autopi.home.lan bash[7308]: > packet listener pid = 7401
Aug 24 10:44:52 autopi.home.lan bash[7308]: > mqtt listener pid = 7399
Aug 24 10:44:52 autopi.home.lan bash[7308]: > btle listener pid = 7397
Aug 24 10:44:52 autopi.home.lan bash[7308]: > btle text pid = 7396
Aug 24 10:44:52 autopi.home.lan bash[7308]: > btle scan pid = 7395
Aug 24 10:44:52 autopi.home.lan bash[7308]: > log listener pid = 7394
Aug 24 10:44:52 autopi.home.lan bash[7308]: > XX:XX:XX:XX:XX:XX confidence topic: monitor/autopi.home.lan/sherry_phone (has not previously connected to hci0)
Aug 24 10:44:52 autopi.home.lan bash[7308]: > XX:XX:XX:XX:XX:XX confidence topic: monitor/autopi.home.lan/mark_phone (has not previously connected to hci0)
Aug 24 10:44:49 autopi.home.lan bash[7308]: > mqtt trigger: monitor/scan/DEPART
Aug 24 10:44:49 autopi.home.lan bash[7308]: > mqtt trigger: monitor/scan/ARRIVE
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: selected HCI device = hci0
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: maximum sequential depart scan attempts = 2
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: maximum sequential arrive scan attempts = 1
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: regex filter for manufacturers to reject = Google
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: regex filter for manufacturers to accept = LG Electronics Mobile Communications|zte corporation
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: regex filter for flags to reject = NONE
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: regex filter for flags to accept = .*
Aug 24 10:44:49 autopi.home.lan bash[7308]: > preference: minimum time between the same type of scan = 10
Aug 24 10:44:49 autopi.home.lan bash[7308]: > warning: for security purposes, please consider changing 'password' in: mqtt_preferences
Aug 24 10:44:49 autopi.home.lan bash[7308]: > warning: for security purposes, please consider changing 'username' in: mqtt_preferences
Aug 24 10:44:49 autopi.home.lan bash[7308]: > warning: verbose logging is enabled. this setting is only for informational and debugging purposes
Aug 24 10:44:49 autopi.home.lan bash[7308]: > starting monitor.sh (v. 0.2.197)...
Aug 24 10:44:49 autopi.home.lan systemd[1]: Started Monitor Service.

Thanks for any insight you can give.

Mark

andrewjfreyer commented 4 years ago

The issue relates to a closed pipe that should not have closed. Do you have a particularly quiet Bluetooth environment (e.g., not a lot of nearby neighbors)? I have pushed a fix (0.2.200) that may address what you experienced, but I cannot reproduce it on my end.

Closing to reopen upon more information or an update.
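For readers wondering how a closed pipe produces this exact message, here is a minimal sketch that is independent of monitor's code. It assumes the shell runs with SIGPIPE ignored, which is systemd's default for services (IgnoreSIGPIPE=yes); the script sets that up itself with trap '' PIPE. The FIFO name demo_pipe and all variable names are made up for the demo. It shows that once the last reader of a FIFO has gone away, bash's printf fails instead of being killed silently. Why the reader goes away in monitor is not established here.

```bash
#!/usr/bin/env bash
# Ignore SIGPIPE so a failed write surfaces as an error message rather than
# killing the writer (assumption: this mirrors a systemd-launched service).
trap '' PIPE

demo_dir="$(mktemp -d)"
demo_pipe="$demo_dir/demo_pipe"   # hypothetical FIFO, not monitor's packet_pipe
mkfifo "$demo_pipe"

# Reader: consume exactly one line, then exit and close its end of the FIFO.
( IFS= read -r line < "$demo_pipe" ) &
reader_pid=$!

# Writer: keep the FIFO open on fd 3 so later writes reuse the same pipe.
exec 3> "$demo_pipe"
printf '%s\n' "first" >&3
wait "$reader_pid"

# No reader is left now, so this write fails with EPIPE.
# stderr is captured; stdout still goes to the FIFO on fd 3.
write_error="$(printf '%s\n' "second" 2>&1 >&3)"
write_status=$?

exec 3>&-
rm -rf "$demo_dir"

echo "printf exit status: $write_status"
echo "printf message: $write_error"
```

The message text depends on the locale, so treat the exit status as the reliable signal.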

daniele-athome commented 4 years ago

I had this bug too, and with the latest version it doesn't seem to happen (for now at least: 6 hours running).

daniele-athome commented 4 years ago

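Correction. After a while it stops working altogether (that is, it doesn't act on receiving arrival/departure MQTT messages). The log is from an Italian locale; "riga 399: printf: errore in scrittura: Pipe interrotta" is "line 399: printf: write error: Broken pipe":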

gen 29 16:11:34 rasp bash[20004]: ./support/btle: riga 399: printf: errore in scrittura: Pipe interrotta
gen 29 16:31:47 rasp bash[20004]: ./support/btle: riga 399: printf: errore in scrittura: Pipe interrotta
gen 29 16:53:51 rasp bash[20004]: ./support/btle: riga 399: printf: errore in scrittura: Pipe interrotta
gen 29 17:07:56 rasp bash[20004]: ./support/btle: riga 399: printf: errore in scrittura: Pipe interrotta

codegrau commented 4 years ago

After upgrading from stretch to buster, I'm experiencing the same errors.

sharondagan commented 4 years ago

Happens to me as well: after a short while (5-8 hours), it stops sending statuses. I'm using dietpi/buster.

Feb 25 13:41:38 Presence bash[447]: ./support/btle: line 399: printf: write error: Broken pipe

mikeage commented 4 years ago

Same thing on raspbian stretch for me; both master (23b18aa5a5b7747da73244289628a5305bbf2c2f) and beta (b7188693c9aa776dbe6f4e8e962fbb019192dd49)

sbowater commented 4 years ago

I'm seeing similar "Broken pipe" errors in the log (multiple times over the course of the day) but it doesn't appear to have any unfavorable impact on the operation of the script -- it keeps running and processing.

Mar 01 09:55:55 rpi-family-room bash[27450]: ./support/btle: line 399: printf: write error: Broken pipe

hoangtridung commented 4 years ago

The same issue. The output of "sudo systemctl status monitor":

monitor.service - Monitor Service
   Loaded: loaded (/etc/systemd/system/monitor.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2020-03-15 03:17:38 GMT; 7h ago
 Main PID: 478 (bash)
    Tasks: 21 (limit: 2200)
   Memory: 19.4M
   CGroup: /system.slice/monitor.service
           ├─ 478 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 582 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 583 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 584 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 586 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 588 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 590 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 592 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 624 /bin/bash /home/pi/monitor/monitor.sh &
           ├─ 626 /usr/bin/mosquitto_sub -I PA -v -F %t|%p -q 2 -L mqtt://mosquitto 30011970 192.168.103.4 1883 monitor/# --will-topic monitor/PA/status --wi
           ├─1539 /bin/bash /home/pi/monitor/monitor.sh &
           ├─1541 timeout --signal SIGINT 90 stdbuf -oL -eL btmon
           ├─1542 btmon
           ├─1611 /bin/bash /home/pi/monitor/monitor.sh &
           ├─1612 timeout --signal SIGINT 120 stdbuf -oL -eL hcidump -i hci0 --raw
           ├─1613 hcidump -i hci0 --raw
           ├─1615 sleep 20
           ├─1627 /bin/bash /home/pi/monitor/monitor.sh &
           ├─1628 timeout --signal SIGINT 60 hcitool -i hci0 lescan
           ├─1629 hcitool -i hci0 lescan
           └─1630 sleep 1

Mar 15 10:20:20 PA bash[478]: [+] 0.2.200 15-03-2020 10:20:20 am [CMD-RSSI] PUBL C4:7C:8D:6A:98:2A RSSI: -72 dBm (slow movement approach | 22 dBm)
Mar 15 10:20:45 PA bash[478]: [+] 0.2.200 15-03-2020 10:20:45 am [DEL-RAND] RAND 49:D2:FF:EE:1C:C1 expired after 158 seconds
Mar 15 10:24:32 PA bash[478]: ./support/btle: line 399: printf: write error: Broken pipe
Mar 15 10:25:46 PA bash[478]: [+] 0.2.200 15-03-2020 10:25:46 am [CMD-RSSI] PUBL C4:7C:8D:6A:98:5A RSSI: -98 dBm (slow movement depart | 27 dBm)
Mar 15 10:31:12 PA bash[478]: [+] 0.2.200 15-03-2020 10:31:12 am [CMD-RSSI] PUBL C4:7C:8D:6A:98:5A RSSI: -101 dBm (slow movement depart | 27 dBm)
Mar 15 10:33:59 PA bash[478]: [+] 0.2.200 15-03-2020 10:33:59 am [CMD-NAME] F0:0F:EC:58:1F:C0 Huawei_nova_3e Unknown
Mar 15 10:33:59 PA bash[478]: [+] 0.2.200 15-03-2020 10:33:59 am [CMD-NAME] 18:65:90:1A:F4:CD Ly_iPhone7S Apple Inc
Mar 15 10:33:59 PA bash[478]: [+] 0.2.200 15-03-2020 10:33:59 am [CMD-NAME] 48:2C:A0:EA:75:3E Xiaomi_redmi Xiaomi Communications Co Ltd
Mar 15 10:36:05 PA bash[478]: [+] 0.2.200 15-03-2020 10:36:05 am [DEL-RAND] RAND 4F:B3:AD:98:E8:4A expired after 166 seconds
Mar 15 10:36:33 PA bash[478]: ./support/btle: line 399: printf: write error: Broken pipe

MaiorDomus commented 4 years ago

Same thing here. It happens on a Raspberry Pi 3, set up with the supplied installation instructions, as well as on a Jetson Nano with the latest version of MQTT built from source (master). Both were tried on the master branch of monitor, though.

andrewjfreyer commented 4 years ago

Still working on this one guys, stay tuned.

drgnomage commented 4 years ago

I'm having the same issue on a Raspberry Pi 4.

./support/btle: line 352: printf: write error: Broken pipe

It also seems to lead to the device becoming unknown.

[+] 0.2.200 21-04-2020 08:34:34 pm [CMD-NAME] AA:BB:CC:DD:EE:FF OnePlus Two Unknown

Not sure what unknown means in this context but it seems to not function correctly for a few minutes then work correctly.

[+] 0.2.200 21-04-2020 08:45:23 pm [CMD-NAME] AA:BB:CC:DD:EE:FF OnePlus Two Unknown
[+] 0.2.200 21-04-2020 08:48:31 pm [CMD-NAME] AA:BB:CC:DD:EE:FF OnePlus 2 Unknown
[+] 0.2.200 21-04-2020 08:48:31 pm [CMD-MQTT] monitor/rpi.glitchbusters.info/oneplus_two { ... confidence : 100 ... }

My known_static_addresses file contains only:

AA:BB:CC:DD:EE:FF OnePlus Two

So I'm not sure where it got OnePlus 2 from.

rccoleman commented 4 years ago

I'm starting to get this too with the current beta and a fresh install.

Jun 15 20:02:16 fr-beacon bash[542]: ./support/btle: line 399: printf: write error: Broken pipe

I had a previous OS install on the same RPI3, did a manual install, and it worked fine (no broken pipe errors). The SD card was really, really slow, so I did a fresh install based on the instructions (starting from flashing) and now I'm getting broken pipe errors periodically. It still seems to work, though, and updates my status properly.

I also started launching with "-r" (periodically scanning) around the same time, so could be related to that. I'll back that out and see if the errors go away, but I can live with the errors if it's still functional.

Edit: I've tried both with and without periodic scanning (-r) and I still periodically get these "broken pipe" errors. I'm running two separate RPI3s in two separate locations (can't see each other) and both report the error from time to time. Both were installed from scratch on new SD cards using the instructions in the monitor.sh docs. Here's a short log snippet that shows the frequency:

Jun 16 18:33:03 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:33:03 pm [CMD-RSSI]        BEAC 3F:48:4E:04:2A:18 RSSI: -101 dBm (initial reading | 99 dBm) 
Jun 16 18:34:39 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:34:39 pm [DEL-RAND]        RAND 75:04:01:39:A1:8B expired after 156 seconds 
Jun 16 18:35:59 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:35:59 pm [DEL-RAND]        RAND 01:58:1E:63:78:C1 expired after 199 seconds 
Jun 16 18:37:23 fr-beacon bash[484]: ./support/btle: line 352: printf: write error: Broken pipe
Jun 16 18:38:59 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:38:59 pm [DEL-RAND]        RAND 75:04:01:39:A1:8B expired after 155 seconds 
Jun 16 18:39:32 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:39:32 pm [DEL-RAND]        RAND 56:3B:F6:11:40:F2 expired after 187 seconds 
Jun 16 18:39:33 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:39:33 pm [DEL-RAND]        RAND 44:92:4A:77:3A:B8 expired after 181 seconds 
Jun 16 18:39:36 fr-beacon bash[484]: ./support/btle: line 399: printf: write error: Broken pipe
Jun 16 18:41:59 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:41:59 pm [DEL-RAND]        RAND 66:A4:0F:2B:AB:6B expired after 195 seconds 
Jun 16 18:42:19 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:42:19 pm [DEL-RAND]        RAND 7E:6D:C4:CD:0C:F8 expired after 158 seconds 
Jun 16 18:42:19 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:42:19 pm [DEL-RAND]        RAND 7A:1A:CA:7B:BA:6E expired after 161 seconds 
Jun 16 18:44:19 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:44:19 pm [DEL-RAND]        RAND 48:D5:CC:12:B9:11 expired after 151 seconds 
Jun 16 18:45:39 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:45:39 pm [DEL-RAND]        RAND 44:40:A2:AC:08:35 expired after 164 seconds 
Jun 16 18:46:06 fr-beacon bash[484]: ./support/btle: line 399: printf: write error: Broken pipe
Jun 16 18:46:11 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:46:11 pm [CMD-RSSI]        PUBL 90:9C:4A:B5:C5:F0 RSSI: -62 dBm (initial reading | 138 dBm) 
Jun 16 18:46:59 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:46:59 pm [DEL-RAND]        RAND 0E:1F:90:5F:F4:F7 expired after 182 seconds 
Jun 16 18:47:39 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:47:39 pm [DEL-RAND]        RAND 7E:6D:C4:CD:0C:F8 expired after 155 seconds 
Jun 16 18:54:39 fr-beacon bash[484]: [+] 0.2.200 16-06-2020 06:54:39 pm [DEL-RAND]        RAND 7E:6D:C4:CD:0C:F8 expired after 186 seconds 

brianegge commented 4 years ago

I'm getting the error on my rpi3 buster but not pi zero w stretch.

Gelisob commented 3 years ago

My pipe also breaks far too often. I'm running the latest edge version (current version: 8b27737) on an RPi 3B with the latest Home Assistant OS. The non-edge version wasn't able to do much at all; edge runs, but it breaks the pipe a lot and ends up in a crashed state. Well, a stuck state: it doesn't restart itself and needs a restart by hand.

Diddlik commented 3 years ago

Hello guys, I have the same error...

finkleandeinhorn commented 3 years ago

I also have this on Ubuntu... any way to prevent it, or do we know what "causes" the pipe to break?

gedger commented 2 years ago

Did this ever get sorted? I'm also having the problem now. If not, are there any recommendations for an alternative, as this project seems to be dead now?

Weissnix4711 commented 2 years ago

Nope, and maybe try room assistant

gewinh commented 2 years ago

Same issue here, running the latest monitor version on a Pi 3 with 32-bit Ubuntu 22.04 server. It seems to work fine for a few hours, then stops publishing to the MQTT server (on the remote Hass machine, a Pi 4). According to the Pi 3 logs, monitor is still running and detecting the Bluetooth devices, but MQTT Explorer shows that the topic from the monitor machine is no longer being published to the MQTT server. Restarting the monitor service on the Pi 3 fixes it and the topic reappears on the MQTT server, but it disappears again after a few hours. Both the Pi 3 and the remote MQTT server are on wired Ethernet, so it is not a Wi-Fi issue. I see the broken pipe error in the logs and assume that it is a symptom or cause of the problem.

gedger commented 2 years ago

I did some debugging on this. I haven't found a solution, but while trying to work out what is going on I added a trap for the pipe error. A side effect is that as soon as the pipe error occurs the script exits, which causes a restart and clears the problem... for a while. I'll have another look when the colder weather arrives, but as a non-recommended workaround:

Add the following to monitor.sh

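# ----------------------------------------------------------------------------------------
# ERROR ROUTINE
# ----------------------------------------------------------------------------------------
err_exit() {
    # $1 is the line number passed in by the trap
    echo "Error on $1"
    # clean is called as in the workaround described above; the script then
    # exits and the service restarts
    clean
}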

and then in support/btle file add

    trap 'err_exit "$LINENO"' SIGPIPE ERR

before the while true; do loop in the function btle_listener ()
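A minimal sketch of how the ERR half of such a trap behaves, independent of monitor's code. Two assumptions are worth spelling out. First, bash cannot trap a signal that was already ignored when the shell started, and systemd ignores SIGPIPE for services by default (IgnoreSIGPIPE=yes), so when monitor runs as a service it is probably the ERR condition that fires, not SIGPIPE. Second, listener_demo and the plain false command are stand-ins invented for the demo: false takes the place of the failing printf, and exit 1 takes the place of clean.

```bash
#!/usr/bin/env bash
# Stand-in for the workaround's err_exit; the real one calls clean instead
# of exiting directly.
err_exit() {
    echo "Error on line $1 (exit status $2)"
    exit 1
}

# Hypothetical stand-in for btle_listener: the trap is installed before the loop.
listener_demo() {
    trap 'err_exit "$LINENO" "$?"' ERR
    while true; do
        false                  # stands in for the printf that hits a broken pipe
        echo "still running"   # never reached, because the trap exits first
    done
}

# Run the listener in a subshell (command substitution) so that its exit
# does not end this script.
output="$(listener_demo)"
status=$?

echo "$output"
echo "listener exit status: $status"
```

The loop ends on the first failing command, which matches the reported side effect: the process exits and the service manager can restart it.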

gewinh commented 2 years ago

Thanks, I'll give it a try. Much more professional than my stopgap of a crontab entry: @hourly systemctl restart monitor.
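A possible middle ground between the hourly restart and the trap is to restart only when the error shows up in the journal. This is a rough sketch under stated assumptions: log_has_broken_pipe is a hypothetical helper, not part of monitor; the pattern only matches the English-locale message; and since some reports above see the error without any loss of function, it may restart more often than strictly needed.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of monitor): succeeds when the log text on
# stdin contains a broken-pipe line from support/btle. English locale only.
log_has_broken_pipe() {
    grep -q 'support/btle: line [0-9]*: printf: write error: Broken pipe'
}

# Sample line taken from the logs earlier in this thread.
sample='Jun 16 18:37:23 fr-beacon bash[484]: ./support/btle: line 352: printf: write error: Broken pipe'

if printf '%s\n' "$sample" | log_has_broken_pipe; then
    echo "broken pipe seen: a restart would be triggered here"
fi

# Intended use from root's crontab, e.g. every 10 minutes (left commented out):
# journalctl -u monitor --since "10 minutes ago" --no-pager | log_has_broken_pipe \
#     && systemctl restart monitor
```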



drinfernoo commented 1 year ago

Any word on this?

msbc42 commented 1 year ago

Same problem on rpi4

shiomax commented 1 year ago

Seems like the pipe just closes if nothing is written to it for too long. There are comments in the code, such as #MAINTAIN PACKET PIPE, that suggest earlier attempts to fix it. (I'm unsure what the "PC too fast" mentions are supposed to mean; I'd think quite the opposite would be the problem.)

I completely removed the printf "%s\n" "" > packet_pipe calls that were in there before.

In support/btle I added the following

# ----------------------------------------------------------------------------------------
# PIPE KEEP ALIVE
# ----------------------------------------------------------------------------------------
pipe_keep_alive() {
    while true; do
        #MAINTAIN PACKET PIPE
        printf "%s\n" "" > packet_pipe

        #PREVENT LOOPING
        sleep 1s
    done

    #REPORT ERROR
    (>&2 echo "error! irrecoverable pipe_keep_alive error")
}

And in monitor.sh I added this (there are a lot of similar calls around line 850):

pipe_keep_alive &
pipe_keep_alive_pid="$!"
echo "> pipe keep alive pid = $pipe_keep_alive_pid" >> .pids
$PREF_VERBOSE_LOGGING && echo "> pipe keep alive pid = $pipe_keep_alive_pid"
disown "$pipe_keep_alive_pid"

Essentially, this starts yet another process that sends an empty message to the pipe at predictable 1s intervals. Before, it sent a whole bunch of empty messages in very rapid succession and then nothing, and repeated that until the one time it did not send anything for too long and the pipe got closed.

It's harder to fix this inside the loops themselves. The loops that read the Bluetooth information are blocking, so you'd have to play with timeouts and adjust them to what works while making sure you're not somehow losing information (I'm not familiar with a lot of the Bluetooth commands, so I did not attempt this).

So far it seems promising: it works for longer than it did before. But it's only been about 10 hours. We will see how it goes.

EDIT: It's 2 days later now and it's still up and running, so it seems that this does indeed work fine. It might be a bit of an overkill solution, but it is one that works without restarting the service constantly. Before, it crashed within a few hours for me. It might be more stable without this change, or without restarting the script, for people who have more Bluetooth devices around (or that one thing that is always doing stuff).
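To try the keep-alive pattern outside monitor, here is a self-contained sketch with a bounded loop. Everything in it is an assumption made for the demo: keep_alive_demo, the temporary FIFO, the 0.1s interval, and the reader that holds the FIFO open read-write on fd 4 (which does not block on Linux). It only shows the writer side delivering empty lines at a fixed interval and stopping if a write fails; it does not show how monitor's own reader behaves.

```bash
#!/usr/bin/env bash
demo_dir="$(mktemp -d)"
pipe="$demo_dir/packet_pipe"   # temporary FIFO for the demo only
mkfifo "$pipe"

# Bounded variant of pipe_keep_alive: write an empty line every $3 seconds,
# $2 times, and stop early if a write fails.
keep_alive_demo() {
    local target="$1" count="$2" interval="$3" i
    for ((i = 0; i < count; i++)); do
        printf '%s\n' "" > "$target" || return 1
        sleep "$interval"
    done
}

# Reader side: hold the FIFO open read-write so writers never block or
# find the read end closed (Linux behavior; an assumption of this demo).
exec 4<> "$pipe"

keep_alive_demo "$pipe" 3 0.1 &
writer_pid=$!

# Count the empty keep-alive lines as they arrive, with a 5s safety timeout.
received=0
while [ "$received" -lt 3 ] && IFS= read -r -t 5 line <&4; do
    received=$((received + 1))
done

wait "$writer_pid"
writer_status=$?

exec 4<&-
rm -rf "$demo_dir"

echo "keep-alive lines received: $received (writer exit status $writer_status)"
```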

labasu-helpme-ronda commented 1 year ago

Are you still having success with your changes? I recently set up 3 Pis for this wonderful monitor project. 2 of the Pis bomb out with line 359 and 399 pipe errors. 1 of the Pis runs perfectly.

shiomax commented 1 year ago

It's still running for me, at least for the last 24h. I can't see further back in Home Assistant, and I'm not really using the presence data for anything right now. I wanted to use it to detect when I'm leaving and turn stuff off, but it often decided my phone was gone while it sat idle on a table, until I interacted with the phone again. I've kept it around in case it's useful for something else in the future.