binsentsu / am43-ctrl

Node Util for controlling an AM43 Blinds Drive Cover, either over MQTT or via a HTTP API

Device Connection Failure Improvements #31

Closed rocket4321 closed 3 years ago

rocket4321 commented 3 years ago

Issue: My connection would intermittently fail and never recover. The logs contain errors, but the same errors appear whether or not the connection is working, so they do not identify the failure.

Logs:
Mar 21 10:47:22 pi2 node[26878]: 2021-03-21T17:47:22.189Z am43* all expected devices connected, stopping scan
Mar 21 10:47:30 pi2 node[26878]: noble warning: unknown handle 68 disconnected!
Mar 21 10:47:37 pi2 node[26878]: noble warning: unknown handle 64 disconnected!
Mar 21 10:47:44 pi2 node[26878]: noble warning: unknown handle 63 disconnected!
Mar 21 10:47:50 pi2 node[26878]: noble warning: unknown handle 68 disconnected!
Mar 21 10:49:56 pi2 node[26878]: noble warning: unknown handle 69 disconnected!
Mar 21 10:49:57 pi2 node[26878]: noble warning: unknown handle 68 disconnected!

Setup: I have 3 AM43 curtain motors, connected from a Raspberry Pi 2 with a USB Bluetooth adapter. I have also tested with a Pi 3B and a Pi 4. OS: Raspbian Lite

Failure Identification: The HTTP or MQTT telemetry for the devices would still show a misleading 'Online' status. However, the "lastconnect" value would stop updating, so after a delay a stale value is a clear indicator of the failure. The motor would also stop responding to Bluetooth commands.
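The stale-value check described above can be sketched as a small helper. This is only an illustration: `isStale` and its parameters are hypothetical names, and it assumes the "lastconnect" value is something `Date` can parse (for example an ISO 8601 string), which the telemetry format may or may not match.

```javascript
// Hypothetical helper: decide whether a device's "lastconnect" value is stale.
// Assumes the value is parseable by Date (e.g. an ISO 8601 string).
function isStale(lastConnect, maxAgeMs, nowMs = Date.now()) {
  const last = new Date(lastConnect).getTime();
  // A missing or unparseable timestamp is treated as stale.
  if (Number.isNaN(last)) return true;
  return nowMs - last > maxAgeMs;
}

// Example: flag anything that has not connected for 15 minutes.
const FIFTEEN_MIN_MS = 15 * 60 * 1000;
const exampleNow = Date.parse("2021-03-21T18:00:00Z");
console.log(isStale("2021-03-21T17:47:22Z", FIFTEEN_MIN_MS, exampleNow)); // false
console.log(isStale("2021-03-21T17:30:00Z", FIFTEEN_MIN_MS, exampleNow)); // true
```

The threshold should be comfortably larger than the polling interval, since "lastconnect" only changes when a poll succeeds.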

Recovery: A simple restart of the am43-ctrl service on the host would resolve it. I had to do this about every 12 to 36 hours.

Code Improvements:

Adds a watchdog that exits the program if there is no valid connection within a user-defined time period (disabled by default)
Adds a startup watchdog that exits the program if no device connects within 10 cycles
Allows control over the device polling interval, which otherwise defaults to the current random 10-20 min cycle
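As a rough illustration of the first watchdog, here is a minimal sketch. It is not the code from this change: `createWatchdog`, `startWatchdog`, and their options are hypothetical names, and treating a timeout of 0 as "disabled" is an assumption.

```javascript
// Hypothetical connection watchdog (illustrative, not the am43-ctrl code).
function createWatchdog({ timeoutMs, onExpire, now = Date.now }) {
  let lastOk = now();
  return {
    // Call whenever a device reports a valid connection.
    markConnected() {
      lastOk = now();
    },
    // Call periodically; invokes onExpire once the timeout has elapsed.
    check() {
      if (timeoutMs <= 0) return false; // assumption: 0 means disabled
      const expired = now() - lastOk > timeoutMs;
      if (expired) onExpire();
      return expired;
    },
  };
}

// Wiring sketch: exit with a non-zero code so a supervisor can restart us.
function startWatchdog(timeoutSec) {
  const wd = createWatchdog({
    timeoutMs: timeoutSec * 1000,
    onExpire: () => process.exit(1),
  });
  const timer = setInterval(() => wd.check(), 10 * 1000);
  timer.unref(); // do not keep the process alive just for the watchdog
  return wd;
}
```

Exiting rather than trying to reconnect in-process keeps the recovery path identical to the manual restart that was already known to work.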

Since code improvements:

The connection is either reliable or the service restarts automatically (via the systemd config). I moved the polling interval to 5 minutes with no known impact, so a loss of connection is more noticeable and short-lived.

However, I assume that any deviation from the current (random 10-20 min) polling interval will have varied results in each environment. Still, I feel that letting the user control this interval is a necessary improvement.
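The interval selection can be sketched as follows. This is an assumption-laden illustration rather than the actual change: `nextPollDelayMs` and its parameters are hypothetical, and only the 10-20 minute random default comes from the description above.

```javascript
// Hypothetical helper: pick the delay before the next device poll.
// userIntervalSec: user-supplied interval in seconds, or undefined/0 for default.
// random: injectable source of numbers in [0, 1), defaults to Math.random.
function nextPollDelayMs(userIntervalSec, random = Math.random) {
  if (Number.isFinite(userIntervalSec) && userIntervalSec > 0) {
    return userIntervalSec * 1000;
  }
  // Default: a random delay between 10 and 20 minutes.
  const MIN_MS = 10 * 60 * 1000;
  const MAX_MS = 20 * 60 * 1000;
  return MIN_MS + Math.floor(random() * (MAX_MS - MIN_MS));
}

console.log(nextPollDelayMs(300)); // 300000 (a fixed 5-minute interval)
```

A fixed interval polls every device on the same schedule, whereas the random default spreads polls out in time, which may be part of why results vary between environments.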

binsentsu commented 3 years ago

thanks!