Koenkk / zigbee-herdsman

A Node.js Zigbee library
MIT License

Z2M 1.35.2 stops with "Adapter disconnected, stopping" after few minutes of uptime (Sonoff-E / EZSP v12 / FW 7.3.2.0 build 212) #910

Closed romarysonrier closed 2 weeks ago

romarysonrier commented 3 months ago

1./ At first, I was running Z2M 1.35.1 on a two-device Zigbee network:

The radio adapter is a SONOFF-E / EFR32 MGM21 (upgraded to EZSP v12 / FW 7.3.2.0 build 212). The radio link is strong and healthy.

The ZLinky_TIC device relies heavily on polling to report measurements from my electric meter (Linky). The simplest way to monitor how well polling is working is to track how often the TIMEDATE value from the ZLinky_TIC device is updated.

While running Z2M 1.35.1, I experienced unexpected, random freezes of all polling activity from Z2M after some hours of uptime. Z2M never stopped or crashed. Only data from reportable attributes was still updated. Once the polling issue appeared, the network scan was stuck and unresponsive.

When it worked, polling gave a lower refresh rate for the ZLinky_TIC than expected (60 sec requested): TIMEDATE from the electric meter was updated every 70 sec to 12 min, with an average near 2 min. At least once a day, the whole network flow suffered from the loss of all polled attributes.
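The refresh-rate figures above can be reproduced by measuring the gaps between consecutive TIMEDATE updates. The following is a minimal sketch, not part of Z2M: the function names and the sample timestamps are hypothetical, and it assumes the update times have already been collected as `YYYY-MM-DD HH:MM:SS` strings.

```python
from datetime import datetime
from statistics import mean

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def polling_intervals(timestamps):
    """Return the gaps, in seconds, between consecutive update timestamps."""
    times = sorted(datetime.strptime(t, STAMP_FORMAT) for t in timestamps)
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def summarize(intervals, requested=60.0):
    """Compare observed gaps with the requested polling period (seconds)."""
    if not intervals:
        return None
    return {
        "min": min(intervals),
        "max": max(intervals),
        "mean": mean(intervals),
        # Number of gaps longer than the requested period
        "late": sum(1 for gap in intervals if gap > requested),
    }


# Hypothetical sample data, NOT taken from the real log
sample = [
    "2024-02-05 14:00:00",
    "2024-02-05 14:01:10",
    "2024-02-05 14:03:10",
    "2024-02-05 14:15:10",
]
stats = summarize(polling_intervals(sample))
print(stats)
```

A long run without any new timestamp would then show up as a single very large gap, which is one way to spot the polling freeze described earlier.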

I struggled to identify a clear root cause at this point; it could also be hardware, since I am using a DIY case.

2./ But after upgrading to Z2M 1.35.2, the 'abnormal' behaviour changed:

info  2024-02-05 14:15:36: MQTT publish: topic 'zigbee2mqtt-grenier/ENEDIS Linky Portail', payload '{...} ,"update_available":null}'
debug 2024-02-05 14:16:01: Received Zigbee message from 'ENEDIS Linky Portail', type 'attributeReport', cluster 'haElectricalMeasurement', data '{"apparentPower":217}' from endpoint 1 with groupID 0
info  2024-02-05 14:16:02: MQTT publish: topic 'zigbee2mqtt-grenier/ENEDIS Linky Portail', payload '{...} ,"update_available":null}'
error 2024-02-05 14:16:09: Adapter disconnected, stopping
debug 2024-02-05 14:16:09: Saving state to file /opt/zigbee2mqtt/data/state.json
info  2024-02-05 14:16:09: MQTT publish: topic 'zigbee2mqtt-grenier/bridge/state', payload '{"state":"offline"}'
info  2024-02-05 14:16:09: Disconnecting from MQTT server
info  2024-02-05 14:16:09: Stopping zigbee-herdsman...
error 2024-02-05 14:16:09: Failed to stop Zigbee2MQTT
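To see how long the adapter stayed silent before the stop, the log can be scanned for the last received Zigbee message preceding the "Adapter disconnected" error. This is a rough sketch that assumes the line layout seen in the excerpt above (level, date, time, colon, message); the helper names are hypothetical and the sample lines are shortened copies of the excerpt.

```python
import re
from datetime import datetime

# Assumed layout: "<level> <YYYY-MM-DD HH:MM:SS>: <message>"
LINE_RE = re.compile(r"^(\w+)\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")


def parse_line(line):
    """Split a log line into (level, datetime, message), or None if no match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    level, stamp, message = match.groups()
    return level, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), message


def seconds_before_disconnect(lines):
    """Seconds between the last received Zigbee message and the disconnect error.

    Returns None if no disconnect is found or nothing was received before it.
    """
    last_rx = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        level, when, message = parsed
        if message.startswith("Received Zigbee message"):
            last_rx = when
        elif level == "error" and message.startswith("Adapter disconnected"):
            return None if last_rx is None else (when - last_rx).total_seconds()
    return None


# Shortened copies of lines from the excerpt above
log = [
    "debug 2024-02-05 14:16:01: Received Zigbee message from 'ENEDIS Linky Portail'",
    "info  2024-02-05 14:16:02: MQTT publish: topic 'zigbee2mqtt-grenier/ENEDIS Linky Portail'",
    "error 2024-02-05 14:16:09: Adapter disconnected, stopping",
]
print(seconds_before_disconnect(log))
```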

One hypothesis to explain the abnormal behaviour is that the intensive polling done by the ZLinky_TIC is causing an issue with the EZSP v12 protocol implementation.

3./ Upgrading to the dev branch didn't help; the behaviour is the same as with Z2M 1.35.2.

Remarks on HW used :

Nerivec commented 3 months ago

https://github.com/Koenkk/zigbee2mqtt/issues/21198

After some investigation in that issue, it seems yours might be the same. The presence of the LiXee devices in both reports suggests something...

I'll continue answering in that issue if you don't mind joining there, to keep things organized (unless we eventually find different problems). If you have a herdsman debug log of the crash, and can post it there, that'd be great.

romarysonrier commented 3 months ago

Here is the debug report: lixee-2.log

The first suspicious line is this error: zigbee-herdsman:adapter:ezsp:uart Unexpected packet sequence 6 | 7 +12ms
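For readers unfamiliar with this kind of error, here is a generic sketch of how a receiver can detect an out-of-order frame number. It is not the zigbee-herdsman implementation: the class name is hypothetical, the modulo of 8 assumes 3-bit frame numbers on the UART link, and which of the two numbers in the log message is the received one and which the expected one is not confirmed here.

```python
SEQ_MODULO = 8  # assumption: 3-bit frame numbers, so values wrap from 7 to 0


class SequenceTracker:
    """Hypothetical helper that flags frames arriving out of sequence."""

    def __init__(self):
        self.expected = None  # unknown until the first frame is seen

    def check(self, received):
        """Return True if `received` is the expected number, then resync."""
        if not 0 <= received < SEQ_MODULO:
            raise ValueError(f"frame number out of range: {received}")
        in_order = self.expected is None or received == self.expected
        # Resynchronize on whatever was received so one glitch is reported once
        self.expected = (received + 1) % SEQ_MODULO
        return in_order


tracker = SequenceTracker()
# Hypothetical frame numbers: 6 arrives twice, so the second 6 is unexpected
results = [tracker.check(n) for n in [5, 6, 6, 7, 0]]
print(results)
```

In this sketch a duplicated or skipped frame produces a single `False`, after which tracking continues from the frame that was actually received.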

romarysonrier commented 3 months ago

BTW: I cannot update the adapter FW to 7.4 until the end of the week to see whether the issue may be FW related.

PS: moving to the initial thread: https://github.com/Koenkk/zigbee2mqtt/issues/21198