home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
74.04k stars 31.07k forks source link

Automations issue/ ZHA Network busy errors after migrating to Skyconnect dongle #86411

Open jason1980p opened 1 year ago

jason1980p commented 1 year ago

The problem

After migrating to Home Assistant Skyconnect usb dongle I've been running into network busy errors. I currently have the dongle connected to a usb extension cable connected to R-Pie4 .

What version of Home Assistant Core has the issue?

Home Assistant 2023.1.6

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Automation

Link to integration documentation on our website

https://www.home-assistant.io/docs/automation/

Diagnostics information

No response

Example YAML snippet

alias: "Pico: Master Bathroom remote"
description: ""
use_blueprint:
  path: stephack/core-pico.yaml
  input:
    pico_remote: a58ddd4ab05559d05de8267f82dd7c49
    top_on:
      - service: light.turn_on
        data:
          brightness_step_pct: 100
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006
    bottom_off_release:
      - service: light.turn_off
        data: {}
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006
    up_raise:
      - service: light.turn_on
        data:
          brightness_step_pct: 20
        target:
          entity_id:
            - light.light_unknown_master_bathroom_lights_zha_group_0x0006
    down_lower:
      - service: light.turn_on
        data:
          brightness_step_pct: -20
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006

Anything in the logs that might be useful for us?

Logger: homeassistant.components.automation.pico_master_bedroom_remote
Source: components/zha/light.py:292
Integration: Automation (documentation, issues)
First occurred: January 21, 2023 at 8:37:18 PM (7 occurrences)
Last logged: 7:21:23 PM

Pico: Master Bathroom remote: Choose at step 1: choice 1: Choose at step 1: choice 1: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 451, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 684, in _async_call_service_step
    await service_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1755, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1792, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 958, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 570, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 978, in async_turn_on
    await super().async_turn_on(**kwargs)
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 292, in async_turn_on
    result = await self._level_channel.move_to_level_with_on_off(
  File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 324, in request
    return await self._endpoint.request(
  File "/usr/local/lib/python3.10/site-packages/zigpy/group.py", line 57, in request
    await self.application.send_packet(
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 782, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>

Additional information

No response

home-assistant[bot] commented 1 year ago

Hey there @dmulcahey, @adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!

Code owner commands Code owners of `zha` can trigger bot actions by commenting: - `@home-assistant close` Closes the issue. - `@home-assistant rename Awesome new title` Change the title of the issue. - `@home-assistant reopen` Reopen the issue. - `@home-assistant unassign zha` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


zha documentation zha source (message by IssueLinks)

puddly commented 1 year ago

The error means:

A message cannot be sent because the network is currently overloaded.

Your automation is referencing ZHA group entities. Are you rapidly sending messages to ZHA groups with it? If so, that's the cause of the error message. How fast are you sending them?

jason1980p commented 1 year ago

Yes I use zha groups. example I have 4 lamps in my living room, I have a zigbee bulb in each. I have them all set in a zha group. I control the group lights via a automation using the Lutron Pico remote. when I press the on button it sends a on request to the group instead of each individual light bulb same when turning the lights off.

fakethinkpad85 commented 1 year ago

Having the same issue, also using ZHA groups to control 2-4 lights at once depending on the group. And its not getting called often, in the example below i call the ZHA groups once (turn of all lights), its maybe 10 groups with 2-4 lights in each group. but the automation is only calling each light group once.

The Automation actually turns off All lights (light service) by Area, meaning the Area could potentially contain both the ZHA group and the induvidual Light in that ZHA group, i usally Hide all the induvidual light entities but they are still in the same Area that the group is.

image

image

Watch - Away: Choose at step 1: choice 1: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161> Watch - Away: Error executing script. Unexpected error for choose at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161> While executing automation automation.watch_away Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 451, in _async_step await getattr(self, handler)() File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 684, in _async_call_service_step await service_task File "/usr/src/homeassistant/homeassistant/core.py", line 1755, in async_call task.result() File "/usr/src/homeassistant/homeassistant/core.py", line 1792, in _execute_service await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)( File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service await service.entity_service_call( File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call future.result() # pop exception if have File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 958, in async_request_call await coro File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call await result File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 581, in async_handle_light_off_service await light.async_turn_off(**filter_turn_off_params(light, params)) File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 986, in async_turn_off await super().async_turn_off(**kwargs) File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 417, in async_turn_off result = await self._on_off_channel.off() File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 324, in request return await self._endpoint.request( File "/usr/local/lib/python3.10/site-packages/zigpy/group.py", line 57, in request await self.application.send_packet( File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 782, in send_packet raise zigpy.exceptions.DeliveryError( zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>


Another set of errors that might be related,

`Logger: homeassistant.components.zha.core.channels.base Source: components/zha/core/channels/base.py:486 Integration: Zigbee Home Automation (documentation, issues) First occurred: January 26, 2023 at 16:49:07 (32 occurrences) Last logged: January 26, 2023 at 16:49:10

[0xF4E3:1:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')] [0xB7E9:1:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')] [0xC402:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')] [0x7588:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')] [0xC402:1:0x0008]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]`

Edit: actually forgot to include the logs..

SneakieGargamel commented 1 year ago

I had the same issue and found this post. I am now using helpers to create groups instead of ZHA groups and it seems to work a lot better.

fakethinkpad85 commented 1 year ago

Thanks for the reply @SneakieGargamel I personally might give that a try, have heard from multiple people that this might solve the problem. But still, if it's a generally known issue it should be solved as it does make sense to try and keep the ZHA id requests at a minimum and they obviously created ZHA groups for a reason.

dmulcahey commented 1 year ago

If you are consistently trying to control several groups at once you should create an additional group with the members of the other groups.

multicast / broadcast traffic has limits on EZSP. See the broadcast section here: https://community.silabs.com/s/article/guidelines-for-large-dense-networks-with-emberznet-pro?language=en_US

(most relevant statement is 8 in a 9s window)

keep in mind this means all broadcasts including ones initiated by the stack itself (not just your group messages)

fakethinkpad85 commented 1 year ago

@dmulcahey that is kind of interesting, so do I understand correctly that using the independent lights Id´s will only worsen the result as it will probable more than 8 requests (~50 in my case); shouldn't creating a Helper group containing the same Id's cause the same issue?

dmulcahey commented 1 year ago

Helper groups send individual device messages so they are no different than just addressing lights individually. You can prove this by enabling debug logs and watching the Zigpy / bellows logs. Zigpy groups are meant to cut down Zigbee traffic and they send a single message to the group Id. again this can be seen in the nwk traffic. What you should not do is attempt to address several Zigbee groups at the same time. This will flood the network. If that is something you do consistently you should create an additional zigbee group with the members of the individual groups and use that.

That article / the section I mentioned is specifically talking about broadcast messaging not about messaging individual devices.

fakethinkpad85 commented 1 year ago

@dmulcahey Ah I see, appreciate the explanation. Sorry to bother you but as bit of a nerd I guess I need to know how some things function. You mentioned messages to a Zigpy group is meant to reduce traffic, but dousent the controller need to relay that message to each of the device Id of that group causing the same amount of end-to-end traffic as sending it to each device to begin with?

Or is the group Ids somehow stored on the end-device, surely they can't be?

dmulcahey commented 1 year ago

The coordinator sends a multicast message to the group id so only 1 message is sent by the coordinator. Enable debug logs and you can follow along

dmulcahey commented 1 year ago

Also, you can read the group cluster stuff in here if you want: https://zigbeealliance.org/wp-content/uploads/2019/12/07-5123-06-zigbee-cluster-library-specification.pdf#page126

MattWestb commented 1 year ago

The 8 broadcast / 9 seconds is not one EZSP limit is one limit in the underlying 802.15.4 network and its shall being the same for all Zigbee 3 coordinator then forming on Zigbee 3 network. If using one TI CC-2531 with HA 1.X firmware is one other thing. Some IT coordinator Zigbee 3 firmware have being patched with very high broadcast limit but its useless then all routers that is Zigbee certificated is silent doping the 9 package in 9 seconds window. The limit is made for blocking broadcast storms that can blocking the network and its the same with other protocols like Thread that is using 802.15.4 under there own stack.

TheAlphaLaw commented 1 year ago

I have the same issues with ZHA groups, with SkyConnect. Deconz with the Conbee II works 100% better, regretting the migration of 100+ devices without testing.

smartqasa commented 1 year ago

After trying virtually everything, I finally gave up on ZHA and migrated to Zigbee2MQTT (Z2M). Migrating is painful and Z2M is not perfect but it performs much better than ZHA in environments that include 40+ lighting devices such as mine.

atr00 commented 1 year ago

same issue with ZHA and SkyConnect... my group of 2 lights triggers this problem but neither a help group nor the individual lights will provoke it. Z2M groups (using Sonoff Dongle-E so also eszp) work flawlessly as well.

dmulcahey commented 1 year ago

@TheAlphaLaw @atr00 can you enable debug mode, run the actions that cause the issue and then attach the full logs please?

atr00 commented 1 year ago

@dmulcahey There you go: zha_logs.txt There are both successful and failed actions in the log.

Note that, even then the action is successful, it is very slow (it takes 2 to 3 seconds for the lights to change their color... whereas individually or in a helper group it is almost instantaneous.)

dmulcahey commented 1 year ago

Thanks. This is really helpful.

atr00 commented 1 year ago

Happy to help! If you need more tests, just ask me.

dmulcahey commented 1 year ago

With debug mode on you should also get periodic counter dumps in the logs. Can you attach that too?

Your log cuts off right when the command to read them is sent.

atr00 commented 1 year ago

I am not sure what they are... let me know if there is is what you need here: home-assistant_2023-03-08T12-53-49.370Z.log

puddly commented 1 year ago

@atr00 I suspect you have an automation that needs debouncing, as you're sending a lot of group requests at once:

# To turn on a single light, you'd need to send at most one `on()` command, and maybe one `move_to_color()`
2023-03-08 20:52:47.617 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.460 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:49.176 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:49.433 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.017 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:50.194 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.948 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:51.623 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()

# `NETWORK_BUSY` immediately after sending the 10th request in 5 seconds
2023-03-08 20:52:52.559 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:52.574 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 118]

2023-03-08 20:52:53.023 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.034 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 125]

2023-03-08 20:52:55.077 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.088 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 126]

...

2023-03-08 20:53:04.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:05.604 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:06.809 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:07.841 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:07.884 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.664 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.836 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.631 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.968 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:10.904 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:11.292 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:11.963 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:13.916 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()

Group requests are network-wide broadcasts that are bounced back and forth between all devices to make sure every possible group member has a chance to "hear" the message. If you send this many at once, the network will become too busy, as the error indicates. The limit on the number of group requests/broadcasts is hard-coded in the firmware and can't be raised.

You may find it better to switch to a normal light group where each bulb is individually contacted concurrently, as this won't have the same limitation.

MattWestb commented 1 year ago

The under network layer 802.15.4 have broadcast storm protection so all routers is only handling 9 broadcast in 8 seconds and if its more they ignoring them. Ziogbee groups can being used for individual lights but is the best being used with middle and large groups of lights and its not getting so large problems if having routing problem in the network.

atr00 commented 1 year ago

@puddly I'm actually not even using any automation here. I'm going to the coordinator device where I can see my 2-bulb zigbee light group and I change the color from there. I'm not sure the issue is only related to automation. And when I don't purposely jam the network by sending consecutive fast requests, the response is very slow (it take about 3 seconds between the click and the actual color change). When I change individual lights or a helper group I am not facing any issue. When I use Z2M using a dongle with the same chip (ERF32), it works flawlessly, whether on individual lights or on a Z2M zigbee group.

@MattWestb if you see my answer here, this is not happening with Z2M, it's specific with ZHA and SkyConnect. My group has 2 bulbs in it. There's slowness as well on a single broadcast command.

TheJulianJES commented 1 year ago

the response is very slow (it take about 3 seconds between the click and the actual color change

So this is only when using ZHA groups, right? I also remember a "lag" when changing colors on ZHA groups (using the UI picker) and EZSP coordinators in my testing. There are two packets sent: OnOff -> on and Color -> set_xy_color even when just changing a light color that is already on (this is somewhat because ZHA can't be 100% sure that the light is actually on and the service is called light.turn_on, so there's always either some on message or move_to_level_with_on_off being sent with "just a color change". This also follows the behavior of other integrations). However, TI coordinators do not have this lag. They basically send both packets at almost the exact same time. EZSP coordinators (when used with bellows) have a delay of about a second or so for me (until the color change shows up).

There were some changes to this with https://github.com/zigpy/bellows/pull/402, but there's still a difference in regards to how fast ZHA light groups react with TI coordinators (zigpy-znp) vs with EZSP coordinators (bellows).

atr00 commented 1 year ago

That’s right, only when using ZHA groups and I indeed used the color picker (but the same happens when I use my button automation). So I suspect both issues are actually one.

puddly commented 1 year ago

When I use Z2M using a dongle with the same chip (ERF32), it works flawlessly, whether on individual lights or on a Z2M zigbee group.

Do you use Z2M's default channel of 11, or did you manually change it to 15 to match ZHA? If they're using different channels, that's likely why you're seeing a difference.

atr00 commented 1 year ago

I use 11 on Z2M and 15 on ZHA. However, does it really matter as helper groups or individual lights work well in ZHA? Again, the issue is only when using ZHA groups. Other situations are fine.

TheJulianJES commented 1 year ago

You're not using Z2M with SkyConnect (or an EZSP-based stick), but with a TI stick (CC2652), right?

puddly commented 1 year ago

Again, the issue is only when using ZHA groups. Other situations are fine.

Without identical conditions, you can't rule out environmental factors like channel 15 congestion, coordinator placement, USB extension cable quality, interference, etc.

Try changing the ZHA network's channel to 11 and see what happens (download a coordinator backup JSON, edit the channel, then perform a radio migration but upload that modified JSON file). If you reboot your light bulbs, some will join the network on the new channel. Otherwise, you'll have to reset them.


After checking, Z2M instructs the radio to send the packet in exactly the same way as ZHA does (radius=0, non_member_radius=3), so any differences should theoretically be entirely due to your environment or the coordinator itself.

atr00 commented 1 year ago

@TheJulianJES I am using another EZSP with Z2M (Sonoff zbdongle-e which has the same EFR32 chip as SkyConnect). @puddly ok I will try that tonight.

atr00 commented 1 year ago

bad news: I've switched to channel 11 (I tried 12 as well...) but the slowness is still present and consecutive commands will still trigger the jam. Again, it's working very well when I send command to individual lights or when I send commands to a helper group. With the helper group, I can send fast consecutive commands and it won't jam the network, respond is very fast. The command I use on my helper group is also turn_on and then set_color.

Do zigbee groups work better when there are a lot of lights in the group or it should not matter? In theory, is it better to do a zigbee group or a helper group?

@TheJulianJES what did you end up doing? It's a shame that zigbee grounds don't work well with SkyConnect. I am pretty sure the issue is linked to what you mentioned.

dmulcahey commented 1 year ago

@atr00 can you provide a comprehensive list of Zigbee devices on your mesh please.

atr00 commented 1 year ago

@dmulcahey It’s a new network, I only have 2 Philips hue bulbs (the ones in the group), 1 aqara door sensor, 1 aqara motion sensor and 1 aqara temp+humidity sensor. The coordinator is a new SkyConnect. LQI >= 220.

dmulcahey commented 1 year ago

@atr00 I suspect you have an automation that needs debouncing, as you're sending a lot of group requests at once:

# To turn on a single light, you'd need to send at most one `on()` command, and maybe one `move_to_color()`
2023-03-08 20:52:47.617 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.460 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:49.176 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:49.433 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.017 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:50.194 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.948 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:51.623 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()

# `NETWORK_BUSY` immediately after sending the 10th request in 5 seconds
2023-03-08 20:52:52.559 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:52.574 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 118]

2023-03-08 20:52:53.023 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.034 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 125]

2023-03-08 20:52:55.077 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.088 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 126]

...

2023-03-08 20:53:04.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:05.604 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:06.809 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:07.841 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:07.884 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.664 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.836 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.631 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.968 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:10.904 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:11.292 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:11.963 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:13.916 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()

Group requests are network-wide broadcasts that are bounced back and forth between all devices to make sure every possible group member has a chance to "hear" the message. If you send this many at once, the network will become too busy, as the error indicates. The limit on the number of group requests/broadcasts is hard-coded in the firmware and can't be raised.

You may find it better to switch to a normal light group where each bulb is individually contacted concurrently, as this won't have the same limitation.

@atr00 how did you produce this log? the only thing that makes sense at this point is that you spam toggled something intentionally to illustrate the network busy itself? I'm just trying to understand what caused the immense amount of broadcast traffic in such a short period of time. Are you going into the color picker and bouncing between several colors in rapid succession?

atr00 commented 1 year ago

@dmulcahey yes that’s exactly what I did. The same action does not produce this when I do it on the helper group with the same lights.

dmulcahey commented 1 year ago

as has been stated several times: The helper group sends unicast messages to each member individually and isn't the same as a Zigbee group. Comparing what they do is irrelevant / invalid. Groups use multicast traffic to address multiple devices with a single message but those messages are essentially sprayed out over the network. There are some differences between how ZHA addresses lights and how Z2M does and I think that increases the number of multicasts we send from a single action but that shouldn't be the cause of this (although it will make it easier to happen). Given you have such a small network we can try something interesting to see if things are logically adding up.

zha:
  zigpy_config:
    ezsp_config:
      CONFIG_MAX_HOPS: 2

add the max hops config to your configuration.yaml. Above is an example. This is NOT something we will keep in your configuration long term but I want to see if it affects what you are seeing / doing. Apply this and restart. Try again and let us know what you see.

EDIT

The more I think about it I am not even sure how this could have an impact. There aren't any other devices to relay the messages. Hrm...

atr00 commented 1 year ago

@dmulcahey I tried and indeed it didn't have any impact

atr00 commented 1 year ago

@dmulcahey is there anything else I can try?

jclendineng commented 1 year ago

Watching this as I just moved all my zha devices over and set up lighting automations and am seeing the exact same thing, ~50 lights and ~12 zha groups created in the zigbee config

jclendineng commented 1 year ago

I wanted to update with my findings, I fixed this by doing a couple things, not sure which one fixed it. I was using multi protocol firmware and I flashed stock then flashed the latest firmware. I then switched from channel 15 to 20 and re-paired all devices. I have 0 issues like this now and everything is snappy. Not sure if it was a channel change or if there are some issues in multi protocol but it's all good now.

atr00 commented 1 year ago

@jclendineng which firmware did you flash it with?

jclendineng commented 1 year ago

flash

I flashed the latest 7.2.2.2 for yellow but you would flash the one for sky connect. It's in the beta folder. I think that may have fixed me.

Edit. The default sky connect or yellow firmware is 6.x and very old so the latest firmware would have a lot of qol and bug fixes so you should definitely try it. It was very easy, took me a few minutes to figure out how to flash but it's not too bad.

RoyHP commented 1 year ago

I'm having this same issue. Attached logs. SkyConnect with a ZHA group addressing 9 Hue bulbs. I use the HASS HomeKit integration, and as soon as I drag the dimmer slider in iOS for the ZHA group for the lights, it dims briefly and then it becomes unresponsive and the only way to fix it is to delete the homekit bridge and create a new one. I believe the slider events are not debounced at all in the pipeline of HASS homekit bridge/ZHA/SkyConnect and immediately makes them unresponsive. Asking siri to turn them on and off works fine but that's a single action as opposed to dragging the dimmer slider. home-assistant_homekit_2023-04-23T23-09-00.031Z.log

dieneuser commented 1 year ago

Hello,

I also have the problem of network-busy-errors with ZHA groups. I would flash the Yellow with the latest beta NabuCasa_Yellow_EZSP_v7.3.0.0 as @jclendineng suggests. Is there anywhere I can read the actual firmware version of the Silicon Labs EZSP module in the Yellow? I can't find any info in the ZHA integration.

smartmatic commented 1 year ago

I have the same issue. I have one group with 3 GU10 spots. I use the group to change the color with one message. The group is triggered via a Friends of Hue Zigbee Green Power Switch.

Logger: homeassistant.components.automation.wc_foh_spots Source: components/zha/light.py:336 Integration: Automation (documentation, issues) First occurred: 11:17:32 (3 occurrences) Last logged: 11:17:32

WC: FOH Spots: Choose at step 1: choice 4: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161> WC: FOH Spots: Error executing script. Unexpected error for choose at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161> While executing automation automation.wc_foh_spots Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 452, in _async_step await getattr(self, handler)() File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 685, in _async_call_service_step await service_task File "/usr/src/homeassistant/homeassistant/core.py", line 1910, in async_call task.result() File "/usr/src/homeassistant/homeassistant/core.py", line 1950, in _execute_service await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)( File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 226, in handle_service await service.entity_service_call( File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 811, in entity_service_call future.result() # pop exception if have ^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1034, in async_request_call await coro File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 851, in _handle_entity_call await result File "/usr/src/homeassistant/homeassistant/components/light/init.py", line 582, in async_handle_light_on_service await light.async_turn_on(filter_turn_on_params(light, params)) File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 1197, in async_turn_on await super().async_turn_on(kwargs) File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 336, in async_turn_on result = await self._level_cluster_handler.move_to_level_with_on_off( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/zigpy/zcl/init.py", line 331, in request return await self._endpoint.request( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/zigpy/group.py", line 57, in request await self.application.send_packet( File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 812, in send_packet raise zigpy.exceptions.DeliveryError( zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>

cityeyes commented 1 year ago

Not sure if it's with the multiprotocol plugin, or just the radio, but I'm using a Sonoff ZBDongle-E, which is also based on the EFR32MG21 Multiprotocol SoC and am also running into this issue regularly after switching over from the original Sonoff Dongle.

It all tracks, though, that the problems would start when I moved to multiprotocol, from what other are saying about the way it behaves compared to the original one.

For single-item control I have no issues, but when controlling a number of items in a group a few times in quick succession, I get the message in the same way I did before when the items weren't in a Zigbee group.

I've already tried changing channels a few times, but it's currently definitely on the best for my network. Just flashed the newest firmware on my device today and I was able to get it to trigger by changing the color of a group of bulbs a few times in a ten second period, so it's still going on.

puddly commented 1 year ago

Group commands aren't appropriate if you're going to be sending them frequently like this. With a clear Zigbee channel, you can easily control many dozens of bulbs nearly instantaneously with just a normal light group, which won't suffer from this same "too many requests at once" problem.

This isn't something that can really be fixed either because it's an inherent limitation of the protocol: the network is busy and the firmware is preventing you from transmitting more data until the existing group commands finish bouncing around. That being said, there are some things you can tweak to reduce the various broadcast radiuses, which will potentially make it into this upcoming beta. The original problem will still be there, however: groups aren't made to be used frequently enough to hit these limitations.

cityeyes commented 1 year ago

normal light group

I was always under the impression that Zigbee groups were the way to go, but it looks like this worked, at least as far as I can tell. Previously, I'd been having this same error pop up when I controlled the lights individually and it was only solved when I changed to a Zigbee group, so I wasn't aware this could be a solution.

Unfortunately, this means I'm going to have to re-create all of my Zigbee groups in HA standard. Small price to pay to get everything working properly again. Thanks for cluing me in on the difference between standard groups and Zigbee groups.

Edit: Okay, "solved" is an overstatement, unfortunately. Zigbee Groups solved a lot of problems before I switched to multiprotocol, in that the changes would be simultaneous across devices.

Now that I've got everything on HA groups instead of Zigbee groups, I'm not getting the network busy errors, but full-home control takes a long time. I'll tell HA to turn all of my lights on, or change them to a specific color, and it'll go around my house, one by one, changing and turning on lights, and sometimes it'll end before they all change and the light will throw back an error until I turn them off and on again.

ZHA groups seem to be the best way to deal with fixtures with more than one light and entire-home control, and I'm not entirely sure why it worked before I switched to multiprotocol and why it's having so many issues now.