Open ar1nd4m opened 4 months ago
It looks like the adapter is not accepting any connections: `Error: Error while opening socket`. I don't think this is something that can be fixed from z2m.
Thanks for looking into the issue - I think the problem is created by the socket (as it was indeed offline for a few hours) - but it is exacerbated by the issue that zigbee2mqtt does not retry.
As you can see, the last log entry was at 2024-06-19 17:13:28, yet those were still the latest logs when I pulled them on 2024-06-23. So at least 4 days passed without zigbee2mqtt reloading.
I think either the HA add-on retry logic is busted or something needs to happen in zigbee2mqtt to make sure HA doesn't mark it as a permanent failure.
I looked a little bit into https://developers.home-assistant.io/docs/add-ons/configuration/ but I don't see a retry strategy listed there.
Would it be possible to not kill the docker job if the socket is not available, but instead catch that exception and retry after a fixed timeout? That way the add-on keeps running and can recover on its own.
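To make the suggestion concrete, here is a minimal sketch of "catch the exception and retry after a fixed timeout". It is not zigbee2mqtt code (which is not written in Python); `open_with_retry` and `open_tcp_adapter` are hypothetical names, and the address and port in the usage comment are placeholders for a LAN adapter.

```python
import socket
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def open_tcp_adapter(host: str, port: int, timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection to a network adapter (hypothetical helper)."""
    return socket.create_connection((host, port), timeout=timeout)


def open_with_retry(
    open_adapter: Callable[[], T],
    retry_delay: float = 10.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call open_adapter until it succeeds, waiting retry_delay seconds
    between attempts. With max_attempts=None it never gives up."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return open_adapter()
        except OSError as exc:
            # Socket errors (refused, unreachable, timed out) are OSError subclasses.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            print(f"attempt {attempt} failed ({exc}); retrying in {retry_delay}s")
            sleep(retry_delay)


# Usage (placeholder address and port):
# conn = open_with_retry(lambda: open_tcp_adapter("192.0.2.10", 6638))
```

With `max_attempts=None` the process stays alive while the adapter is offline and reconnects once it is reachable again, so the supervisor never sees a crash to give up on.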
@Nerivec recently introduced the watchdog feature in https://github.com/Koenkk/zigbee2mqtt/pull/23043, and I've now added support for it in the HA addon. To use it you will have to run the edge version of the addon. Docs:
Thanks - let me try it right away! Should I close this issue and open a new one if it doesn't work as expected (to keep your issue list clean)?
Let's keep it here
What happened?
My SLZB-06 was offline for a few days. When it was rebooted, the zigbee2mqtt docker container managed by HA (addon) did not restart and bring it back online.
What did you expect to happen?
I expected that once the device is back online, HA will automatically restart the docker container.
I think the right way to fix this is to make zigbee2mqtt wait instead of crashing, or to change HA so that it does not give up after a certain number of retries.
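As an illustration of the second option (a supervisor that never stops retrying), here is a rough sketch of a restart schedule whose delay grows up to a cap and then repeats forever. The function name and the default values are my own assumptions, not what HA or the zigbee2mqtt watchdog actually use.

```python
import itertools
from typing import Iterator


def restart_delays(
    initial: float = 60.0, factor: float = 2.0, cap: float = 3600.0
) -> Iterator[float]:
    """Yield restart delays in seconds: grow by `factor` each time,
    never exceed `cap`, and never run out (hypothetical schedule)."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First six delays with a 10-minute cap:
print(list(itertools.islice(restart_delays(60, 2, 600), 6)))
# -> [60, 120, 240, 480, 600, 600]
```

Because the generator is infinite, an outage of several days only means waiting at most `cap` seconds after the adapter returns, rather than a permanent failure.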
How to reproduce it (minimal and precise)
Have a SLZB-06 connected over LAN to zigbee2mqtt (running as an addon in HA).
Disconnect the power from the SLZB-06 and let zigbee2mqtt fail; it will retry a few times and then stop. Give it about 4 hours.
After this, reconnect the SLZB-06; the system does not recover.
Zigbee2MQTT version
1.38.0 commit: unknown
Adapter firmware version
20210708
Adapter
zstack
Setup
Add-on HA
Debug log