Koenkk / Z-Stack-firmware

Compilation instructions and hex files for Z-Stack firmwares
MIT License

Broadcast is breaking the IEEE 802.15.4 network layer. #443

Closed MattWestb closed 1 year ago

MattWestb commented 1 year ago

The broadcast limit is patched like this in the firmware:

+// Increase the max number of boardcasts, the default broadcast delivery time is 3 seconds
+// with the value below this will allow for 1 broadcast every 0.15 second
+#define MAX_BCAST 30

I have tried to find what the value should be, but have not found it.

The 802.15.4 network protects itself against broadcast storms by limiting broadcasts to 8 in any 9-second time frame. If a router receives more broadcasts while the earlier 8 have not yet timed out, it ignores the new frames, so those frames are silently lost.

Some references: https://community.silabs.com/s/article/guidelines-for-large-dense-networks-with-emberznet-pro?language=no

Broadcasts:

In a network of any appreciable size, broadcasts should be avoided or minimized wherever possible, and radius of propagation should be limited in cases where the nodes of interest are known to be in close proximity (within a limited number of hops) of the sender. (See note above about hop count being slightly higher in these neighbors due to density.) Furthermore, the ZigBee Pro stack limits the broadcast traffic to about 8 broadcasts in any 9-second window, so any NWK layer broadcast activity (APS broadcasts, APS multicasts, route discoveries, ZDO announcements or broadcast requests, PAN ID / channel / NWK key update notifications) by any node, even one with a 1-hop radius, will count against this limit, and if any node encounters more than this amount of broadcast traffic, it will discard the packet because it can't be tracked against the list of known broadcasts, and we don't want to risk repeating some old broadcast to propagate it unnecessarily.

Sending broadcasts at a rate of 60 per 9 seconds ends up losing 52 of them after the first Zigbee router. It does no good: it only blocks the network and breaks APS broadcasts, APS multicasts, route discoveries, ZDO announcements or broadcast requests, and PAN ID / channel / NWK key update notifications in the mesh network!
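The 60-in, 52-lost arithmetic can be reproduced with a small model of a router's broadcast tracking table. This is only a sketch under the assumptions stated in this thread (8 slots, 9-second timeout, frames dropped when the table is full); the names `btt_init`, `btt_accept` and `simulate_burst` are hypothetical and do not come from Z-Stack, EmberZNet or any other real stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values taken from the discussion, not from any stack header. */
#define BTT_SIZE       8      /* broadcasts a router can track at once */
#define BTT_TIMEOUT_MS 9000u  /* how long each tracked broadcast holds its slot */

typedef struct {
    bool     in_use;
    unsigned expires_ms;  /* time at which the slot becomes free again */
} btt_entry;

typedef struct {
    btt_entry slots[BTT_SIZE];
} btt;

/* Mark every slot as free. */
static void btt_init(btt *t) {
    assert(t != NULL);
    for (size_t i = 0; i < BTT_SIZE; i++) {
        t->slots[i].in_use = false;
    }
}

/* Try to track a broadcast arriving at now_ms.
 * Returns false when the table is full, i.e. the frame is silently dropped. */
static bool btt_accept(btt *t, unsigned now_ms) {
    assert(t != NULL);
    /* Age out entries whose timeout has passed. */
    for (size_t i = 0; i < BTT_SIZE; i++) {
        if (t->slots[i].in_use && now_ms >= t->slots[i].expires_ms) {
            t->slots[i].in_use = false;
        }
    }
    /* Take the first free slot, if there is one. */
    for (size_t i = 0; i < BTT_SIZE; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].in_use = true;
            t->slots[i].expires_ms = now_ms + BTT_TIMEOUT_MS;
            return true;
        }
    }
    return false;
}

/* Send `count` broadcasts evenly spaced over `window_ms` and
 * return how many of them a single receiving router accepts. */
static unsigned simulate_burst(unsigned count, unsigned window_ms) {
    btt t;
    unsigned accepted = 0;
    btt_init(&t);
    for (unsigned i = 0; i < count; i++) {
        unsigned now_ms = (i * window_ms) / count;
        if (btt_accept(&t, now_ms)) {
            accepted++;
        }
    }
    return accepted;
}
```

In this model `simulate_burst(60, 9000)` accepts 8 frames and drops the other 52, because no slot expires before the burst ends. Whether a real router behaves exactly like this depends on its stack and would need to be confirmed with a sniffer.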

I know it's not popular to say, but this makes all TI firmware incompatible with Zigbee 3 and only causes harm. I hope it gets fixed so that users don't get their systems broken by bad firmware settings like this one.

Please read up on how IEEE 802.15.4 broadcast storm protection is implemented, and how it works and should work with Zigbee 3 certified devices.

Koenkk commented 1 year ago
MattWestb commented 1 year ago

If you read up on the foundations of routing in Zigbee, you will understand why this is locked in the Zigbee stack: the underlying layer is limited to 8 broadcasts in 9 seconds on all devices.

If one router (= any FFD in the 15.4 network) sends more broadcasts than the standard allows, all its neighbors throw away every frame that is over the limit. Your network then collapses: broadcast stops working, and unicast stops working too, because routing does not work once broadcast is broken and devices can't be discovered.

It's like trying to drive a double-decker bus under a bridge that is 4.4 m high. You can try, but it won't work. If you understand how it works, you either don't try or you end up with a cabriolet version of your double-decker bus.

If you apply this kind of patch without understanding how the different layers work, you break the system.

Since I only use Zigbee 3 certified devices (including the coordinator), I can't run the test myself.

It would be better if you tried sending 60 broadcasts in your production network over more than 3 hops and sniffed how the network reacts. If possible, run it for some hours, so the network also has to cope with end devices jumping around and rejoining through different parents, which forces the coordinator to broadcast to find them in the network.

TheJulianJES commented 1 year ago

If a router receives more broadcasts while the earlier 8 have not yet timed out, it ignores the new frames, so those frames are silently lost.

Sending broadcasts at a rate of 60 per 9 seconds ends up losing 52 of them after the first Zigbee router. It does no good: it only blocks the network and breaks APS broadcasts, APS multicasts, route discoveries, ZDO announcements or broadcast requests, and PAN ID / channel / NWK key update notifications in the mesh network!

Isn't this somewhat conflicting information? Let's say the coordinator uses the default broadcast limit, but fully "saturates it" by sending as much as possible. All other routers are also "saturated" at that point, as they use the same limit. So what you're saying with "all multicast/broadcast functionality breaks" also happens then?

If one router (= any FFD in the 15.4 network) sends more broadcasts than the standard allows, all its neighbors throw away every frame that is over the limit. Your network then collapses: broadcast stops working, and unicast stops working too

Why would unicast messages break if the extra broadcast frames are all silently discarded by other routers? (What you're saying could also happen with the default limit?)

Could you reproduce an issue due to this? The context you've given is more theoretical, and I'm wondering what happens in practice.

I didn't observe any issues so far. Other ZHA devs are also running with an increased broadcast table size on EZSP coordinators. Although the "extra broadcast" limit is rarely used, HA can send "too many broadcast" requests for the default limit when dragging in the color wheel for a Zigbee light group.

limited to 8 broadcasts in 9 seconds on all devices

In my testing, this doesn't really apply (at least not how you think). All Hue bulbs (Silicon Labs based) that can "hear my coordinator" respond to way more than 20 broadcast requests in a 9 second window. It's possible that they don't re-broadcast those frames (didn't test this), but the increased limit still has a benefit here.

In general, broadcast requests seem to be processed way faster in my test networks and my production network than what the default limit is for.

(Removing the increased limit will cause issues)

MattWestb commented 1 year ago

It can break unicast because, when "APS broadcasts, APS multicasts, route discoveries, ZDO announcements or broadcast" are not working properly, routing does not work, and the mesh can't find or manage devices and doesn't know where they are.

The "saturated" case is not possible with Zigbee 3 certified devices, which should all use the standard limit. But if you patch the firmware on all your routers, you can get it working excellently, as long as you don't run into some other problem.

The broadcast table configured in EZSP firmware holds the listeners for broadcasts (= group commands). If you set it to 10, you can have 10 Zigbee light groups, and the 11th one you create will not work because the coordinator is not listening on it. It is not the outgoing broadcast table that does the 8-in-9-seconds limiting.

The maximum broadcast count in EZSP cannot be changed in the firmware; it is locked in Simplicity Studio, as written in the linked doc. (I think some of the wording in the linked document is old, since the document is a bit dated, but that doesn't matter because it's the logic that counts.)

The limit applies to outgoing broadcast frames, so it more than likely works with all your Hue BT lights that are in the first level (in radio range of the coordinator), but they do not relay the frames to the next level once they have more than 8 in the outgoing table.

It would be interesting to sniff how it works, but it's not so easy to set up a test for it, because the second level must not be in radio range of level 0 for the test to work.

Also, EZSP and modern Zigbee stacks do not resend a broadcast 4 times if they use passive acks: the stack sends the broadcast and listens for whether its neighbors resend it. If they do, it does not repeat the broadcast; if not, it falls back to classic repetition of the frame. But if you have routers that do not use passive broadcast acks, you get the 8-in-9-seconds problem.

In EZSP the limit does apply: with IKEA light controllers on older firmware, it's not possible to change light level and color too often, or the network blocks (the round wireless dimmer also does this when it is directly paired to a light), and both devices run the EZSP stack.

The speed depends on the routers in the network: they use jitter to avoid all sending at the same time, and they must process the frame before resending it to the next hop. Some stacks / hardware are faster, some are slower.

TheJulianJES commented 1 year ago

The broadcast table configured in EZSP firmware holds the listeners for broadcasts (= group commands). If you set it to 10, you can have 10 Zigbee light groups, and the 11th one you create will not work because the coordinator is not listening on it. It is not the outgoing broadcast table that does the 8-in-9-seconds limiting.

No, EZSP_CONFIG_BROADCAST_TABLE_SIZE is defined as "The maximum number of broadcasts during a single broadcast timeout period" and it indeed does that for me (only increasable at compile time).

I guess you're talking about EZSP_CONFIG_MULTICAST_TABLE_SIZE

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

MattWestb commented 10 months ago

In EZSP, EZSP_CONFIG_MULTICAST_TABLE_SIZE is how many Zigbee groups the coordinator can listen to. All Zigbee groups must be configured there, or the host system does not get the broadcast frames (Alexei has said that many times, and I'm very sure Pudly knows it too).
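To make the difference from the broadcast limit concrete, here is a minimal C sketch of a group listener table as described in this comment. The table size of 2 and the names `multicast_subscribe` and `multicast_delivered_to_host` are hypothetical and only stand in for whatever the real firmware does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size; plays the role of EZSP_CONFIG_MULTICAST_TABLE_SIZE. */
#define MULTICAST_TABLE_SIZE 2

typedef struct {
    uint16_t group_ids[MULTICAST_TABLE_SIZE];
    size_t   used;
} multicast_table;

/* Register a group the coordinator should listen to.
 * Returns false when the table is already full. */
static bool multicast_subscribe(multicast_table *t, uint16_t group_id) {
    assert(t != NULL);
    for (size_t i = 0; i < t->used; i++) {
        if (t->group_ids[i] == group_id) {
            return true;  /* already registered */
        }
    }
    if (t->used == MULTICAST_TABLE_SIZE) {
        return false;
    }
    t->group_ids[t->used++] = group_id;
    return true;
}

/* A group frame reaches the host only if its group is in the table. */
static bool multicast_delivered_to_host(const multicast_table *t,
                                        uint16_t group_id) {
    assert(t != NULL);
    for (size_t i = 0; i < t->used; i++) {
        if (t->group_ids[i] == group_id) {
            return true;
        }
    }
    return false;
}
```

With a table of size 2, a third group cannot be registered and frames for it never reach the host. This table limits how many groups are listened to, not how many broadcasts are sent per timeout period.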