home-assistant / core

Automations issue/ ZHA Network busy errors after migrating to Skyconnect dongle #86411

Open jason1980p opened 1 year ago

jason1980p commented 1 year ago

The problem

After migrating to the Home Assistant SkyConnect USB dongle I've been running into network busy errors. I currently have the dongle connected to a USB extension cable connected to a Raspberry Pi 4.

What version of Home Assistant Core has the issue?

Home Assistant 2023.1.6

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Automation

Link to integration documentation on our website

https://www.home-assistant.io/docs/automation/

Diagnostics information

No response

Example YAML snippet

alias: "Pico: Master Bathroom remote"
description: ""
use_blueprint:
  path: stephack/core-pico.yaml
  input:
    pico_remote: a58ddd4ab05559d05de8267f82dd7c49
    top_on:
      - service: light.turn_on
        data:
          brightness_step_pct: 100
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006
    bottom_off_release:
      - service: light.turn_off
        data: {}
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006
    up_raise:
      - service: light.turn_on
        data:
          brightness_step_pct: 20
        target:
          entity_id:
            - light.light_unknown_master_bathroom_lights_zha_group_0x0006
    down_lower:
      - service: light.turn_on
        data:
          brightness_step_pct: -20
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006

Anything in the logs that might be useful for us?

Logger: homeassistant.components.automation.pico_master_bedroom_remote
Source: components/zha/light.py:292
Integration: Automation (documentation, issues)
First occurred: January 21, 2023 at 8:37:18 PM (7 occurrences)
Last logged: 7:21:23 PM

Pico: Master Bathroom remote: Choose at step 1: choice 1: Choose at step 1: choice 1: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 451, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 684, in _async_call_service_step
    await service_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1755, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1792, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 958, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 570, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 978, in async_turn_on
    await super().async_turn_on(**kwargs)
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 292, in async_turn_on
    result = await self._level_channel.move_to_level_with_on_off(
  File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 324, in request
    return await self._endpoint.request(
  File "/usr/local/lib/python3.10/site-packages/zigpy/group.py", line 57, in request
    await self.application.send_packet(
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 782, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>

Additional information

No response

ABEIDO commented 1 year ago

Hi, been looking into a lot of posts regarding this.

I had a Conbee II which worked fine, and I wanted to test the HA SkyConnect to add Matter and support HA at the same time. I had no real issues with the Conbee.

After a full reset of my ZHA (no backup at all) I rebuilt my network manually: 46 devices, of which 30+ are IKEA lights/outlets and the rest are sensors from Hue, Aqara, IKEA and Frient. Now I'm getting the Network Busy 161 error from time to time from different automations that I have had since before the SkyConnect.

For example, I have an automation that dims lights, triggered by the play status of a media player:

Actions

service: light.turn_on
data:
  brightness_pct: 10
  transition: 3
target:
  entity_id:
    - light.livingroom
    - light.hallway_table

light.livingroom = ZHA group with 8 devices; the rest = single devices

And this causes the Network Busy 161 error. Am I exceeding the limit with these actions (I thought the limit was 10 devices)?

If so, does the ConBee II handle this better? The automations are the same and it never happened before the SkyConnect dongle.

ChristophHoltmann commented 1 year ago

@ABEIDO I had the same idea and also switched from Conbee II to ZHA and Skyconnect some time ago. I had the same problems and after much back and forth I followed @cityeyes suggestion (just don't use ZHA groups, only HA groups). Maybe not the desired solution, but since then I have no more "Network Busy 161" errors. I have about 40 devices (mostly lights) and the delay is acceptable most of the time, even if I turn on several lights at the same time (without using ZHA groups).
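
For readers unfamiliar with the distinction: as described elsewhere in this thread, a ZHA group is sent one multicast, while an HA group sends the command to each member light individually. A minimal sketch of an HA light group in `configuration.yaml` follows; the group name and entity IDs are placeholders, not names taken from this thread.

```yaml
# configuration.yaml
# A Home Assistant light group: commands are sent to each member
# individually rather than as one Zigbee group multicast.
light:
  - platform: group
    name: Living Room Lights
    entities:
      # Placeholder entity IDs; replace with your own bulbs.
      - light.livingroom_lamp_1
      - light.livingroom_lamp_2
      - light.livingroom_lamp_3
```

The same kind of group can also be created from the UI as a Group helper.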

ABEIDO commented 1 year ago

Yes, I've been thinking about it, but as I see it I have 3 options, and I'm mostly considering option 2:

  1. Switch to HA groups - feels like a step back.

  2. Go back to the Conbee II - also feels like a step back, especially as the SkyConnect is newer. And a massive hassle to redo the network.

  3. Hope for a fix.

cityeyes commented 12 months ago

Yep, I'm still in the same boat. Things work, but not as well as they did on my earlier device. I really do miss the instant-on that I'd get with my earlier Zigbee device when I was able to use ZHA groups properly, and even though things work, more or less, with standard HA groups, reaction times are slower and I do still get errors occasionally when controlling the entire house (not as often as with the new dongle + ZHA groups, but still more often than when I controlled all of my lights in ZHA groups on the older dongle).

I don't want to give up the additional features that the new dongle brings, but it's still a bit of a bummer when stuff doesn't work perfectly/instantly when controlling large groups of lights.

ABEIDO commented 12 months ago

For now I've switched to HA groups, and yeah, it's a bit slower. As many say, ZHA groups are the way to go, but with the SkyConnect they are not behaving optimally.

I cannot really find any difference in the setup except the dongle itself. I have the same extension cord, channel, device location and so on. It only acts up when using ZHA groups with the SkyConnect (no multiprotocol), not with the Conbee II. And as my automation showed, I only turn on one ZHA group together with an individual light once and I get the error (not really spamming in my book).

I'm about to migrate my HA from an RPi to a micro PC, so during that I will most likely go back to the Conbee II, at least to continue troubleshooting.

sjors-lemniscap commented 11 months ago

> The underlying network layer, 802.15.4, has broadcast storm protection, so all routers only handle 9 broadcasts in 8 seconds; if there are more, they ignore them.

I found this the best explanation, from @MattWestb. I'm running the SkyConnect 7.3.2.0 beta firmware and created a simple script in which 2 lights, which I created a ZHA group for, blink on and off 5 times each with a duration of 1 second. This multicasts 5x off and 5x on to the ZHA group.

On the 9th command I receive the network busy error, meaning that the broadcast storm protection kicked in. As stated before, Conbee II and other EmberZNet Serial Protocol (EZSP) controllers might not stick to the protocol specification; however, the SkyConnect does, and I can understand if they decide not to lift the broadcast storm protection, as this might overload the Zigbee network.

Decided for now to create smaller ZHA groups, and where needed a HA group to "bypass" the multicast storm protection since HA groups are sending commands to each individual light respectively.
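
The blink test described above could be sketched as a Home Assistant script like the following. This is an illustrative reconstruction, not the script actually used; the script name and `light.test_zha_group` are hypothetical placeholders for a ZHA group entity.

```yaml
# configuration.yaml (or the script editor in YAML mode)
script:
  blink_zha_group_test:
    alias: "ZHA group blink test"
    sequence:
      - repeat:
          count: 5
          sequence:
            # Each turn_on/turn_off to a ZHA group goes out as one group command,
            # so this loop sends 10 group commands in roughly 10 seconds.
            - service: light.turn_on
              target:
                entity_id: light.test_zha_group  # hypothetical ZHA group entity
            - delay:
                seconds: 1
            - service: light.turn_off
              target:
                entity_id: light.test_zha_group
            - delay:
                seconds: 1
```

Lengthening the delays is one way to check whether the error tracks the rate of group commands rather than their total number.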

MattWestb commented 11 months ago

All EZSP firmware has the same broadcast setting, since it is locked in the GSDK. It can be changed, but that needs a special patch from Silabs, and then the firmware can't be Zigbee certified because it is out of standard. The TI coordinator firmware is patched by Z2M and goes outside the standard, but that causes routing problems: broadcasts stop working and you get "no route" to devices that are communicating with unicast.

wernerhp commented 10 months ago

Getting EmberStatus.NETWORK_BUSY: 161 when calling an HA Helper Group that contains two Zigbee groups (Kitchen: 6 downlights; Scullery: 2 downlights). Running SkyConnect on standard firmware on a Home Assistant Blue.

I also have an 8 downlight group in the Living Room and two Sonoff Basic ZBR3s in different locations to aid in routing, but devices drop off very frequently. Such a pain. Any advice?

I also get EmberStatus.DELIVERY_FAILED: 102 when controlling some of the lights individually. According to ZHA's Network view, the device is offline, but it's on the same circuit where other lights work, so what gives.

codyc1515 commented 10 months ago

I also faced this issue trying to set the colour then turn off two lights in the same group.

dmulcahey commented 10 months ago

> Getting EmberStatus.NETWORK_BUSY: 161 when calling a HA Helper Group that contains two Zigbee Groups (Kitchen 6 downlights; Sculler 2 downlights) Running Sky Connect on standard firmware on a Home Assistant Blue.
>
> I also have an 8 downlight group in the Living Room and two Sonoff Basic ZBR3s in different locations to aid in routing, but devices drop off very frequently. Such a pain. Any advice?
>
> I also get EmberStatus.DELIVERY_FAILED: 102 when controlling some of the lights individually. According to ZHA's Network view, the device is offline, but it's on the same circuit where other lights work, so what gives.

Don’t add groups to groups like this. Create an additional zigbee group and add all devices to it for this sort of thing.

dmulcahey commented 10 months ago

> I also faced this issue trying to set the colour then turn off two lights in the same group.

Put ZHA in debug mode and make the error happen. Then disable debug mode and attach the downloaded log file here.
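
Debug logging can be toggled from the ZHA integration page in the UI. As an alternative sketch, assuming you prefer to set it in `configuration.yaml`, the `logger` integration can raise the log level for ZHA and the libraries that appear in the tracebacks in this thread (`zigpy` and `bellows`):

```yaml
# configuration.yaml
logger:
  default: info
  logs:
    # ZHA integration itself
    homeassistant.components.zha: debug
    # Zigbee stack library used by ZHA
    zigpy: debug
    # Radio library for Silicon Labs EZSP adapters such as the SkyConnect
    bellows: debug
```

Debug output is verbose, so it is worth removing these entries again once the error has been captured.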

MattWestb commented 10 months ago

Something spooky has happened with broadcasts in the latest bug-fix release. I don't use groups much, since IKEA has removed them from all first-gen controllers, but the system still does a lot of broadcasting for route discovery and so on. Now I get this in the log several times a day in my production system:

Logger: zigpy.topology
Source: /usr/local/lib/python3.11/site-packages/zigpy/topology.py:84
First occurred: 10:12:37 (3 occurrences)
Last logged: 22:36:49

Topology scan failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 78, in _scan_loop
    await self.scan()
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 96, in scan
    await self._scan_task
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 221, in _scan
    await self._find_unknown_devices(neighbors=self.neighbors, routes=self.routes)
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 253, in _find_unknown_devices
    await self._app._discover_unknown_device(nwk)
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 945, in _discover_unknown_device
    return await zigpy.zdo.broadcast(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/device.py", line 623, in broadcast
    return await app.broadcast(
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 921, in broadcast
    await self.send_packet(
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 912, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>

HenrikBClausen commented 9 months ago

I had the same issue with the zha and the network becoming congested - most often without an obvious cause. Following some of the advice above, the only thing I have done is to flash the SkyConnect to the latest beta version v7.4.0.0.

The way to get there is not trivial, but following the instructions for running the add-on for SSH access and flashing the SkyConnect has made a huge difference to the stability of my Zigbee network.

ABEIDO commented 9 months ago

Any reason for not doing it the easy way via the web flasher? https://skyconnect.home-assistant.io/firmware-update/

What version did you upgrade from?

wernerhp commented 9 months ago

I disabled Multiprotocol support and it seems to have made a difference.

codyc1515 commented 9 months ago

I was on 7.3.1.0 but was never prompted to upgrade to latest version 7.3.2.0. Anyway, I have tried your recommendation and upgraded to 7.4.0.0. To do this, you need to download the relevant .gbl file and choose the Change Firmware option in that Web UI.

wernerhp commented 9 months ago

I'm also on 7.3.1. Will give 7.4.0 a shot

ABEIDO commented 9 months ago

I'm on 7.3.2.0. For you guys updating to the 7.4.0.0b please come back with feedback here if you can

evelant commented 7 months ago

I'm on 7.4.0.0 skyconnect firmware (zigbee only, no multiprotocol) and am facing this same issue. It even happens when controlling a single ZHA group manually from the iOS app with no automations involved. I suspect that the hass UI isn't doing any sort of debouncing so if your finger moves 10 pixels on the color chooser it seems like it spams the network with 20+ broadcasts and locks things up. My network is solid, no device is more than 10ft from a repeater and I've made sure my channel is clear and not conflicting with my wifi. I'll have to give plain groups a try.

This issue is quite frustrating! It's very easy to trigger and usually results in lights being in all sorts of inconsistent states. I've avoided wifi devices because I thought zigbee would be more reliable but I've had nothing but trouble with reliability. I've tried zha and z2m with skyconnect, sonoff-p, and sonoff-e. Different unreliabilities with all of them. None of them have worked reliably like zwave does. Might be time to give up on zigbee and set up a dedicated ssid on an isolated subnet to make wifi devices secure.

evelant commented 5 months ago

So I tried eliminating zigbee groups in favor of plain ha groups. It made the problem infinitely worse. I couldn't reliably control any groups that way, bulbs always failed with the EmberStatus.NETWORK_BUSY almost every single time a command was sent to a group which resulted in only a couple members responding.

Perhaps related to this, I think it would be useful to document the options on the ZHA integration configuration UI. I've searched high and low and nobody seems to know what "Enable enhanced brightness slider during light transition" or "Group members assume state of group" actually do. I did figure out that "Enable enhanced light color/temperature transition from an off-state" makes the network busy problem worse because apparently what it does when enabled is send multiple commands over the air for a turn_on in an attempt to make turning lights on with color/temp more consistent across different manufacturer behaviors.

puddly commented 5 months ago

@evelant This issue is a year old and hasn't seen any activity in four months.

Can you please create a new issue and fill out the issue template? It'll be easier to figure out what specifically is wrong with your network.

evelant commented 5 months ago

@puddly OK I will file a new issue. This issue thread seems valuable however, there is a lot of context here and I seem to always turn this one up again when searching while trying to dig into my (seemingly endless) zigbee problems.

pjcarly commented 5 months ago

Honestly, just because the issue is over a year old doesn't mean it has been fixed. Having to make a new issue just feels like extra bureaucracy, while at the same time losing a lot of context, interactions, and people tracking this issue.

This feels completely unnecessary.

I still encounter this issue multiple times a week, so i'm still monitoring.

th3cube commented 5 months ago

I also think it’s a limitation of the firmware, which follows an IEEE standard to prevent broadcast/multicast storms. As soon as I migrated my EFR32 Silicon Labs network to a Conbee II stick, the limitations were gone. No lockups, and transitioning color on a ZHA group went without problems. Same setup and rules, just another coordinator stick. I even migrated the existing network rather than setting up a new one.

jclendineng commented 5 months ago

So is this potentially an issue with EZSP, and if so what dongles can we try? I still have a solid network but light bulbs (zigbee, hardwired) fall off all the time.

puddly commented 5 months ago

@evelant

> So I tried eliminating zigbee groups in favor of plain ha groups. It made the problem infinitely worse. I couldn't reliably control any groups that way, bulbs always failed with the EmberStatus.NETWORK_BUSY almost every single time a command was sent to a group

It sounds like in your case, the network may be experiencing broadcast traffic that isn't coming from the coordinator. Do you have a second adapter that you can use for packet capture? There's firmware available for Conbees and most other EZSP sticks can do this natively.

evelant commented 5 months ago

Yes, I will set up a stick and do some packet capture when I have time (probably the next few days). Until then fiddling with some of the ezsp settings last night seems to have made my network more stable. I set the following which I think should be within the capabilities of the zbdongle-e with emberznet v7.4.2.0:

zha:
  zigpy_config:
    source_routing: true
    ezsp_config:
      CONFIG_MAX_END_DEVICE_CHILDREN: 0
      CONFIG_APS_UNICAST_MESSAGE_COUNT: 64
      CONFIG_SOURCE_ROUTE_TABLE_SIZE: 200
      CONFIG_ROUTE_TABLE_SIZE: 16
      CONFIG_ADDRESS_TABLE_SIZE: 32
      CONFIG_PACKET_BUFFER_COUNT: 250
      CONFIG_BINDING_TABLE_SIZE: 32
      CONFIG_NEIGHBOR_TABLE_SIZE: 26
      CONFIG_TRUST_CENTER_ADDRESS_CACHE_SIZE: 10
      CONFIG_MULTICAST_TABLE_SIZE: 32

puddly commented 5 months ago

@evelant Remove all of your config (except for CONFIG_MAX_END_DEVICE_CHILDREN, since you've already set it it probably won't be good to un-set it) and try one of these firmwares: https://github.com/NabuCasa/silabs-firmware-builder/pull/71#issuecomment-2153304143.

I've attached builds for the SkyConnect, Sonoff stick, and the Yellow.

evelant commented 5 months ago

I'm already running the latest ncp-uart-hw-v7.4.2.0-zbdonglee-230400.gbl firmware. I purposely set CONFIG_MAX_END_DEVICE_CHILDREN=0 because I have a lot of strong routers. Almost no device is more than 10ft line of sight to a good router. Some of my devices (freaking buggy Sengled) love to try to route directly to the coordinator above all else even if the LQI is terrible so disallowing that behavior helps.

My config before yesterday was the above but with only CONFIG_MAX_END_DEVICE_CHILDREN: 0 and source_routing: true. None of the other settings. I ran it that way on this firmware for a long time and still had the NETWORK_BUSY issue. I just added the rest of those config parameters last night after doing a bit of research in the EmberZNet manual and they seem to have improved behavior so far but it has only been a day.

I'm slowly building some understanding of the zigbee protocol and the ezsp stack but I have little embedded dev experience so it's slow progress haha.

edit: Ah I see now that the builds at https://github.com/darkxst/silabs-firmware-builder do not seem to have your modified config for EMBER_APS_UNICAST_MESSAGE_COUNT and EMBER_BROADCAST_TABLE_SIZE (or at least I couldn't find them). I'll give your customized build a try and remove all settings except source_routing and MAX_END_DEVICE_CHILDREN. Thanks!

puddly commented 5 months ago

v7.4.2.0 is not the firmware version; it's the EmberZNet version that the firmware is based on. The builds I posted above are not something you can replicate by tweaking config unless you re-compile the firmware as well.

evelant commented 5 months ago

OK I got it flashed, adjusted my config as you suggested, and got my network back up. I'll report back here when I've got a feel for whether the build with those adjusted parameters helps with my issues.
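
For reference, the trimmed configuration discussed above might look like the sketch below. Keeping `source_routing` is an assumption based on evelant's earlier edit; puddly's suggestion was to remove everything except `CONFIG_MAX_END_DEVICE_CHILDREN`.

```yaml
# configuration.yaml
zha:
  zigpy_config:
    # Kept per evelant's earlier edit; an assumption, drop it if you never set it.
    source_routing: true
    ezsp_config:
      # The one EZSP override left in place, since it was already set.
      CONFIG_MAX_END_DEVICE_CHILDREN: 0
```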

cityeyes commented 5 months ago

Is this modified firmware available for the Sonoff Multiprotocol FW variant as well? I'd love to give it a shot if so!

puddly commented 5 months ago

@cityeyes Unfortunately no. Multiprotocol is sort of frozen at the last version that doesn't crash (as much) for most people and won't be receiving updates for the foreseeable future. I suggest you migrate to normal Zigbee firmware and use a second adapter for Thread (you may already have one in your home through an Apple or Google device).

cityeyes commented 5 months ago

Oh damn, well thanks nonetheless! I was thinking of migrating back to standard Zigbee anyway since the only Thread devices I have in my house currently are my Google Homes. Appreciate it!

th3cube commented 5 months ago

Is the patched firmware also available for sticks like the SLZB-06M which can be run over Network? I would like to test on mine as it would have benefits for my network because of placement in a different more central room. This is the stick I was using before but swapped it back to my old conbee II because of these problems

https://darkxst.github.io/silabs-firmware-builder/

puddly commented 5 months ago

> Is the patched firmware also available for sticks like the SLZB-06M which can be run over Network?

It'll be a lot of effort to build firmwares for every stick out there 😄. I'm focusing on getting feedback from people running common sticks (SkyConnect and the Sonoff stick).


The firmware builder repository contains documentation for setting things up and you can try to write a manifest file for your specific device based on the config in darkxst's builder repo (based on our old firmware builder) and build one yourself. This discussion is a little off-topic for this specific issue, however.

jclendineng commented 5 months ago

Not a huge deal but the firmware shows up as 7.4.0.0 when flashed vs 7.4.2.0

Edit: Something didn't stop correctly, my fault. I restarted HA, then disabled ZHA and waited about 5 minutes then flashed and it went smoothly. Let's see how this performs! I removed all my custom zha settings as stated above minus the child devices.

th3cube commented 5 months ago

If we assume that this solves the problem. Will it be available for all EFR32MG21 based sticks, or will everyone have to cook it for themselves? Just wondering since I don't know anything about FW development. I'm just an end user trying to help and fix my own stuff. I wonder at what level this fix will be implemented. somehow this thread really came back to life again😀

jclendineng commented 5 months ago

FYI 4 days in and it appears to be VASTLY improved. I will report back in a week or so for a better view but network has been much better. 0 drop offs and (I don't know if this did anything for network quality) but congestion has gone way down.

evelant commented 5 months ago

@puddly The build you provided appears to be a massive improvement. I haven't noticed any more NETWORK_BUSY, devices dropping, or slow devices. My network finally seems to work well!

puddly commented 5 months ago

@jclendineng @evelant Thanks for the feedback! If anyone else having similar issues wants to give the firmware a try, let me know how it goes.

cityeyes commented 5 months ago

Worked great for me! I've been using it for a week or so with no issues whatsoever on the standard Zigbee firmware (not multiprotocol) on a Sonoff dongle.

jclendineng commented 5 months ago

I had 1 zigbee bulb drop off since my last report, but that's still a vast improvement. 2 weeks or so with 1 drop is great. Bulbs are all routers and hardwired, I have no battery routers so bulb potentially was overloaded? In any case, network still very strong, no more "congestion" messages.

Edit. I downloaded diagnostics and energy scan looks amazing as well. I'm on channel 25 and it's 39% congested which is better than the 80% it was prior to your firmware update, no idea why your firmware would improve network performance but I haven't changed anything on my end.

sjors-lemniscap commented 5 months ago

Works absolutely amazing here as well, is this something that will be implemented in the official firmware as well or do we have to keep applying a patch over every new official firmware release?

puddly commented 5 months ago

Thanks!

> no idea why your firmware would improve network performance but I haven't changed anything on my end.

It shouldn't have any impact whatsoever, especially on an energy scan. The results of them will change based on time of day, your physical position near the coordinator, how humid it is, if your neighbors are streaming something over WiFi, and just at random for no reason at all.

This change will only impact sending group commands and other types of broadcasts, nothing else. If you were sending a lot in the past, it could be that you were affecting the firmware's own management broadcasts by using up all the available "slots". This should help with that case.

> is this something that will be implemented in the official firmware as well

Yes, the repo I linked to earlier is for the official firmware.

th3cube commented 5 months ago

I can see big improvements in calling zha zigbee groups. For example, if i would move the color picker over a group it would give me network_busy almost immediately. With this FW this behavior has improved drastically. Also, my HUE remotes would "freeze" when I did quadruple presses to trigger automations. That improved as well.

evelant commented 5 months ago

@puddly I have noticed that some of my bulbs are not responding at random to group commands. The strange thing is some of the bulbs in a group will turn on but not all of them. Usually like 5 of 8 turn on, 3 stay off. It seems random which bulbs it is. Not sure if this is an issue with the new firmware or something else.

jclendineng commented 5 months ago

The most I have in a group is 6 but no issues here, do the bulbs control if you do them individually? Double checked the zigbee group to make sure nothing dropped? Sometimes the group will drop bulbs if the bulb goes offline.

puddly commented 5 months ago

> I have noticed that some of my bulbs are not responding at random to group commands.

Once a group command is sent out it's sort of at the mercy of the network. It'll be relayed and rebroadcast by most devices on your network, the coordinator is no longer involved beyond sending it out.

I would make sure your network channel is free of noise.

jtbandes commented 5 months ago

Thanks for these updated builds — I just gave it a try and had good results! My setup is:

In the past I had repeated NETWORK_BUSY errors when using the Zigbee groups in Home Assistant for toggling lights and changing colors and such, so I had kinda stopped using them from Home Assistant (I only use them for binding the Inovelli switches directly to bulbs). With the 7.4.2.0 build it looks to be a lot more reliable, at least in my initial testing!

I recorded a before/after video to demonstrate (note that all the bulbs used in this demo are located <10 feet from the zigbee radio):

Before: https://github.com/home-assistant/core/assets/14237/11a1614c-2a99-4fac-9db2-2e4a2e6c12a4
After: https://github.com/home-assistant/core/assets/14237/b2a61f18-f789-4236-8a82-78a329fdf913