Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Entire network going offline until I unplug/replug coordinator #22452

Open angrycatmeowmeow opened 4 months ago

angrycatmeowmeow commented 4 months ago

What happened?

The entire Zigbee network is going offline. The Z2M dashboard shows only the routers as offline; end devices are also offline but are not shown as such. Unplugging and replugging the coordinator fixes it. I updated Z2M to the latest stable release and updated the coordinator firmware to 20230923 per another issue I found, since my firmware was from early 2022. That issue said the latest version listed in the docs, 20230507, is considered unstable, so I went for 20230923. It happens every few (6-12) hours.

What did you expect to happen?

No response

How to reproduce it (minimal and precise)

No response

Zigbee2MQTT version

1.37.0

Adapter firmware version

20230923

Adapter

Sonoff Dongle-P

Setup

HAOS Odroid N2+ (HA Blue) add-on

Debug log

log.log

Luigi8723 commented 4 months ago

I think I have the same issue. Out of the blue, all my devices go offline. After I replug the Z2M adapter, it works again.

I have now updated to the latest Z2M version, and once a day I see my Z2M system restarting (the watchdog is restarting it). Seems to be something within Z2M?

angrycatmeowmeow commented 4 months ago

I agree. I haven't made any noteworthy changes to my setup, my wifi and zigbee channels are separated, and there are no other sources of interference I can think of, especially none that I've recently introduced. I'm used to seeing MAC errors when interference is the cause and that's usually an easy fix. The errors I'm getting are not MAC errors and it's taking out my entire network at the same time.

It is happening randomly, sometimes within 6 hours and sometimes after two days. I'm hesitant to leave debug logging on for that long.

https://www.reddit.com/r/homeassistant/comments/1bii6j1/comment/l365jut/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

leonardpitzu commented 3 months ago

I have the same issue: half my network is down and actions take an eternity to happen. From clicking a button until the light goes on takes anywhere from seconds to tens of seconds. I am using a Conbee II.

andrzej67 commented 3 months ago

Same here. I use a Sonoff Dongle-P with the 20220219 firmware.

Coldness00 commented 3 months ago

Same issue here. I thought I was going mad. I'm running an SLZB-06 with firmware 2.0.18 over LAN and Zigbee firmware 20230507, with Z2M in Docker on a Synology NAS.

samuele2723 commented 2 months ago

Following. I have the 20230507 firmware, but I read that it isn't the latest. Is the latest 20230923?

samuele2723 commented 2 months ago

Found the thread on the latest firmware. I think I'm going to test it: https://github.com/Koenkk/Z-Stack-firmware/discussions/496

xsienix commented 1 month ago

An SLZB-06 here as well. Changing the firmware makes no difference.

angrycatmeowmeow commented 1 month ago

Mine will go offline 3x in one day, then be absolutely rock solid for a month straight. Restarting Z2M fixes it, so I don't think it's interference.

xsienix commented 1 month ago

> Mine will go offline 3x in one day, then be absolutely rock solid for a month straight. Restarting Z2M fixes it, so I don't think it's interference.

Here the SLZB goes down and Z2M tries to ping the devices without success. A restart of the SLZB is needed. I have only 3 devices and was just starting to build the Zigbee network.

psa-anddev commented 1 month ago

I'm running Zigbee2MQTT version 1.39.1-1 as an HA add-on with a Conbee II coordinator. What I notice is that every couple of days, Z2M hangs and no device will work. When I try to restart it, it takes a while and then crashes with an error saying it cannot connect to /dev/ttyACM0. When I unplug/replug the coordinator and restart Z2M, it works for another couple of days. I think this might be the same issue as the one reported here, but if it's not, I'll be happy to open a new one.

fsedarkalex commented 4 weeks ago

Same issue. It intensifies with a growing network, I think. I'm also using a ZBDongle-P, with firmware 20230507.

I had it about once a month with ~40 devices. Now I have ~80 devices and have this issue about 2-3 times a month.

There are no EGLO/AwoX devices in that network.

I have a second network (same coordinator) with only EGLO/AwoX devices and a few wall switches. It has never crashed so far, and it has only 15 devices.

Will now try this: https://github.com/Koenkk/Z-Stack-firmware/discussions/505

psa-anddev commented 3 days ago

After some investigation, I figured out that my problem was not the one described here. It turns out that Home Assistant, for some reason, was randomly changing the path to the device. Unplugging and replugging it would result in the device recovering the path that Z2M was expecting. I've been keeping both Z2M and HA updated and I haven't encountered the problem since. It's been working flawlessly for almost a week.

If you are encountering this problem and you are using Home Assistant (with Home Assistant OS), please SSH into the machine at a point where the network is no longer accessible and check whether your device's path is the same as the one configured for Z2M. Also check whether unplugging and replugging it changes the path back to the expected one. If both of these conditions are true, make sure to update Home Assistant to the latest version, as well as Z2M.
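For anyone who wants to script that check, here is a minimal sketch in Python. It assumes a Linux host where udev creates symlinks under /dev/serial/by-id, and that Python is available in the shell you SSH into (which may not be true on every Home Assistant OS setup). The function names and the default path /dev/ttyACM0 (taken from the error mentioned earlier in this thread) are only illustrative; pass whatever port your Z2M configuration actually uses.

```python
import os
import sys
from pathlib import Path
from typing import List, Optional

# Assumption: udev exposes persistent adapter names here (typical on Linux).
BY_ID_DIR = "/dev/serial/by-id"


def resolve_port(path: str) -> Optional[str]:
    """Return the real device node behind `path`, or None if it is missing."""
    if not os.path.exists(path):  # also False for a dangling symlink
        return None
    return os.path.realpath(path)


def stable_aliases(device: str, by_id_dir: str = BY_ID_DIR) -> List[str]:
    """List the entries in `by_id_dir` that point at `device`."""
    directory = Path(by_id_dir)
    if not directory.is_dir():
        return []
    target = os.path.realpath(device)
    return sorted(
        str(link)
        for link in directory.iterdir()
        if os.path.realpath(link) == target
    )


def check_port(configured: str, by_id_dir: str = BY_ID_DIR) -> str:
    """Describe whether the configured serial port currently exists."""
    real = resolve_port(configured)
    if real is None:
        return (
            f"MISSING: {configured} does not exist "
            "(the adapter may have moved to another path)"
        )
    aliases = stable_aliases(real, by_id_dir)
    if aliases and configured not in aliases:
        return f"OK: {configured} -> {real}; a stable alias is available: {aliases[0]}"
    return f"OK: {configured} -> {real}"


if __name__ == "__main__":
    # Hypothetical default; replace with the port from your own Z2M config.
    port = sys.argv[1] if len(sys.argv) > 1 else "/dev/ttyACM0"
    print(check_port(port))
```

Run it once while the network is down and once after replugging the coordinator. If the first run reports MISSING and the second reports OK, the path is changing underneath Z2M. Pointing Z2M at a /dev/serial/by-id/... entry instead of a /dev/ttyACM* path is a common way to keep the port stable across re-enumeration.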