Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0
12.15k stars 1.68k forks

State only updates when manually refreshed on large network #22931

Open mortenmoulder opened 5 months ago

mortenmoulder commented 5 months ago

What happened?

At my work we have a 140+ device network with a 50/50 split between switches and power plugs. One plug is bound (via OnOff binding) to one power plug. The power plug is used as a state for various things in the office, so getting the proper state is actually important to us.

We have experienced a few of the plugs do not update their on/off state, unless we press the refresh button. When changed via Home Assistant or MQTT it updates the state perfectly. If I publish an MQTT message to zigbee2mqtt/FRIENDLY_NAME/get with payload {"state": ""} the state also updates (a hack we could implement, but would rather not).
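For reference, the refresh request described above can be sketched in Python. This is only an illustration: the function names and the `desk_plug_1` friendly name are made up, the default base topic `zigbee2mqtt` is assumed, and the actual publishing is left to whatever MQTT client is in use.

```python
import json


def build_state_refresh(friendly_name, base_topic="zigbee2mqtt"):
    """Return the (topic, payload) pair that asks a device to report its state."""
    topic = f"{base_topic}/{friendly_name}/get"
    payload = json.dumps({"state": ""})
    return topic, payload


def refresh_state(publish, friendly_name):
    """Send the refresh request through any publish(topic, payload) callable."""
    topic, payload = build_state_refresh(friendly_name)
    publish(topic, payload)


if __name__ == "__main__":
    # Stand-in publisher that only prints; swap in a real MQTT client's publish method.
    refresh_state(lambda topic, payload: print(topic, payload), "desk_plug_1")
```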

We cannot reproduce the issue consistently. We have not updated to the latest Ember firmware - yet.

I realise this is a weird issue, seeing that 99% of the plugs report the state correctly, which also means debugging is hard. Let me know if you want me to test things out.

Plug: https://www.zigbee2mqtt.io/devices/SPLZB-141.html Switch: https://www.zigbee2mqtt.io/devices/SBTZB-110.html

What did you expect to happen?

No response

How to reproduce it (minimal and precise)

No response

Zigbee2MQTT version

1.35.3

Adapter firmware version

6.10.3.0 build 297

Adapter

Sonoff Dongle-E

Setup

Docker on x86

Debug log

No response

LaurentChardin commented 5 months ago

One plug is bound (via OnOff binding) to one power plug

Do you mean: One switch is bound (via OnOff binding) to one power plug ?

mortenmoulder commented 5 months ago

Do you mean: One switch is bound (via OnOff binding) to one power plug ?

Yes, one switch is bound to one plug. Typo.

LaurentChardin commented 5 months ago

We have experienced a few of the plugs do not update their on/off state

So what you mean is:

Is that correct ?

mortenmoulder commented 5 months ago

@LaurentChardin Yes, that is completely correct.

LaurentChardin commented 5 months ago

Ok ... since you have defined a binding between each switch and plug, it means they can work directly together, aka without even the coordinator on the mesh network. So this is fine.

Now, it could be that the plug's state notification is failing to reach the coordinator. Did you check the health of your mesh for holes with low LQI? There might be routing issues that prevent the notification from reaching the coordinator. From your description, I would start by getting an idea of how healthy the mesh network is.

With that many routers, the network should be quite stable, but you never know. It could explain the unpredictability of this behavior.

Anything found in the logs? (like route issue warnings)

mortenmoulder commented 5 months ago

@LaurentChardin Yes, I think the plug with the lowest LQI is about 150, so it should all be good. But that was my main concern too - that some other router in the path somehow won't report this plug's state correctly. Very hard to actually debug.

No logs except info with power output and updates

LaurentChardin commented 5 months ago

@mortenmoulder It looks like your issue is really Zigbee-ish, meaning I don't think Z2M can actually help much here.

https://community.home-assistant.io/t/zigbee-networks-how-to-guide-for-avoiding-interference-optimize-using-zigbee-router-devices-repeaters-extenders-to-get-a-stable-network-with-best-possible-range-and-coverage/515752

There could be some actions you could check:

You have a lot of routers (plugs), but in the end this can depend quite a lot on your physical setup.

mortenmoulder commented 5 months ago

@LaurentChardin I'm not really sure either. It could be Zigbee but it could also be Zigbee2MQTT.

We save all logs and can view them through Grafana, so it's quite easy to filter out the regular updates and focus on the odd ones. Nothing looks suspicious, unfortunately.

Our physical setup is somewhat unique and funny. The plugs control the power going to each employee's desk. Each room/area has four desks, and in the middle underneath the floor, there are four outlets with one plug in each. The plug we are currently monitoring is relatively close to the coordinator, however, it is most likely routed through a bunch of other plugs - of which each could be the culprit. Or not?

I would hate to implement the hack of sending an MQTT message to /get with an empty state to refresh the state, but if that's what it takes... It just sucks that we would have to do that for 70+ plugs and "overload" the network even more.
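If the /get hack does end up being necessary, spreading the requests over time would at least avoid sending 70+ of them in one burst. A rough sketch, where the function name, the plug names and the 5 minute period are all placeholders rather than anything from our setup:

```python
def staggered_offsets(names, period_s=300.0):
    """Spread one refresh per device evenly across period_s seconds.

    Returns a list of (offset_seconds, name) pairs, in input order.
    """
    if not names:
        return []
    step = period_s / len(names)
    return [(round(i * step, 3), name) for i, name in enumerate(names)]


if __name__ == "__main__":
    plugs = [f"plug_{i}" for i in range(70)]  # hypothetical friendly names
    for offset, name in staggered_offsets(plugs)[:3]:
        print(f"t+{offset}s -> refresh {name}")
```

With 70 plugs and a 300 second period that works out to one request roughly every 4.3 seconds.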

Hopefully someone stumbles upon this issue and has a fix 🙏

LaurentChardin commented 5 months ago

@mortenmoulder Do your plugs implement as well haElectricalMeasurement based reportings ? like ActivePower, RmsCurrent etc ?

Just asking because, if you have a lot of them, and some devices are very very chatty : they could somehow create pressure on your network because we would get flooded with messages, bouncing around with your routers... trying to get a route for each of them.

Since you are in a working area, i guess you have some strong Wifi needs on the same 2.4Ghz band. You might want to be sure your Zigbee channel is not too close from the usual Wifi channel.

But as you said, if the issue is not deterministic and kind of random, it is tricky to debug. Especially since you don't see anything in the logs, that pretty much rules out an issue between Z2M and your MQTT broker, because you would see it there.
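On the channel point, a quick way to check how close a Zigbee channel sits to a WiFi channel is to compare center frequencies. The sketch below uses the standard 2.4 GHz channel plans (Zigbee channels 11 to 26, WiFi channels 1 to 13). The 22 MHz and 2 MHz widths are nominal values I am assuming, and the function names are made up, so treat the result as a rough indication only.

```python
def zigbee_center_mhz(channel):
    """Center frequency of a 2.4 GHz Zigbee (IEEE 802.15.4) channel, 11..26."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11..26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Center frequency of a 2.4 GHz WiFi channel, 1..13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers WiFi channels 1..13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel, wifi_channel, wifi_width_mhz=22.0, zigbee_width_mhz=2.0):
    """True if the two channels' nominal bands overlap (widths are assumed values)."""
    gap = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return gap < (wifi_width_mhz + zigbee_width_mhz) / 2


if __name__ == "__main__":
    for zb in (11, 15, 20, 25):
        clashes = [w for w in (1, 6, 11) if overlaps(zb, w)]
        print(f"Zigbee {zb}: overlaps WiFi channels {clashes}")
```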

mortenmoulder commented 5 months ago

Do your plugs implement as well haElectricalMeasurement based reportings ? like ActivePower, RmsCurrent etc ?

They sure do: https://github.com/Koenkk/zigbee-herdsman-converters/blob/master/src/devices/develco.ts#L522-L545

Just asking because, if you have a lot of them, and some devices are very very chatty : they could somehow create pressure on your network because we would get flooded with messages, bouncing around with your routers... trying to get a route for each of them.

I would say we get around 10 reports per second - if not more. The log is unreadable and the Zigbee2MQTT UI is barely working.
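To see which devices account for that traffic, one option is to count messages per topic over a time window. A minimal sketch, assuming the log has already been reduced to `(timestamp_seconds, topic)` pairs; that input format and the function names are my own, not something Zigbee2MQTT produces directly.

```python
from collections import Counter


def reports_per_second(events):
    """Average message rate per topic over the time span covered by events.

    events: iterable of (timestamp_seconds, topic) pairs.
    """
    events = list(events)
    if not events:
        return {}
    times = [t for t, _ in events]
    span = max(times) - min(times)
    if span <= 0:
        span = 1.0  # all events share one timestamp; treat it as a 1 second window
    counts = Counter(topic for _, topic in events)
    return {topic: n / span for topic, n in counts.items()}


def chattiest(events, top=5):
    """Return the `top` topics with the highest average rate, highest first."""
    rates = reports_per_second(events)
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:top]
```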

Since you are in a working area, i guess you have some strong Wifi needs on the same 2.4Ghz band. You might want to be sure your Zigbee channel is not too close from the usual Wifi channel.

We actually don't have a lot of 2.4 GHz WiFi coverage. I believe we have an IoT network but primarily 5 or 6 GHz.

What's funny is that it's consistently occurring for one device. It's not like it's random plugs around the office.

mortenmoulder commented 5 months ago

Update: updating to EZSP v13 (Ember) and Zigbee2MQTT 1.38.0 did not fix the issue.

Lmax12 commented 1 month ago

I'm having similar issues with a switch that can be operated with a manual button. The state is also not updated. Only when I press the refresh button does it show. It's also a fairly large network with 138 devices, of which 103 are routers. Zigbee2MQTT version is 1.40.1. Coordinator version is 20240316 on a UZG-01 (CC2652P7 chipset).

When I move the switch to another Z2M instance I have running, the state is updated on the fly.

It used to work fine, but stopped working a week or 3 ago. Did you find any solution?

Lmax12 commented 1 month ago

I found a solution/work-around on another site. Posting it here to help others:

I changed the "Min Rep Change" for the "OnOff" attribute in the device settings page from "0" to "1".
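The same change can probably be made over MQTT instead of through the UI. The sketch below builds a reporting request for the OnOff attribute. The topic and field names are as I remember them from the Zigbee2MQTT 1.x docs, and I believe newer versions renamed the topic, so check the docs for your version. The interval values and the device name are placeholders, and I am assuming `reportable_change` is what the UI shows as "Min Rep Change".

```python
import json


def build_onoff_reporting_request(device_id, reportable_change=1,
                                  min_interval_s=0, max_interval_s=3600,
                                  base_topic="zigbee2mqtt"):
    """Return (topic, payload) for a reporting request on genOnOff/onOff.

    Topic and field names are assumptions based on the 1.x docs; verify them first.
    The interval defaults are placeholders, not recommended values.
    """
    topic = f"{base_topic}/bridge/request/device/configure_reporting"
    payload = {
        "id": device_id,
        "cluster": "genOnOff",
        "attribute": "onOff",
        "minimum_report_interval": min_interval_s,
        "maximum_report_interval": max_interval_s,
        "reportable_change": reportable_change,
    }
    return topic, json.dumps(payload)


if __name__ == "__main__":
    topic, payload = build_onoff_reporting_request("my_switch")  # hypothetical name
    print(topic)
    print(payload)
```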