home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

ZHA problem since last update 2023.4.2 : Zigbee channel 15 utilization is 99.67%! #91160

Closed TeddyLafrite closed 1 year ago

TeddyLafrite commented 1 year ago

The problem

Since the last update, on 2 different HA installations (RPi 4 and SkyConnect):

Logger: zigpy.application Source: components/zha/core/gateway.py:205 First occurred: 11:17:36 (2 occurrences) Last logged: 11:17:36

Zigbee channel 15 utilization is 99.67%! If you are having problems joining new devices, are missing sensor updates, or have issues keeping devices joined, ensure your coordinator is away from interference sources such as USB 3.0 devices, SSDs, WiFi routers, etc.

I never had problems in 6 months with these 2 installations.

What version of Home Assistant Core has the issue?

2023.4.2

What was the last working version of Home Assistant Core?

2023.4.2

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ZHA

Link to integration documentation on our website

No response

Diagnostics information

2023-04-10 11:16:54.884 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration asusrouter which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:16:54.885 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration ble_monitor which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:16:54.885 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration bodymiscale which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:16:54.888 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration tuya_local which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:16:54.891 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration hacs which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:16:54.894 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration livebox which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:16:54.896 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration alarmo which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:16:54.898 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration scheduler which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2023-04-10 11:17:26.329 WARNING (MainThread) [homeassistant.components.binary_sensor] Setup of binary_sensor platform ble_monitor is taking over 10 seconds.
2023-04-10 11:17:26.379 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Home Assistant Versions' for version integration not ready yet: Timeout of 10 seconds was reached while fetching version for supervisor; Retrying in background
2023-04-10 11:17:36.225 WARNING (MainThread) [zigpy.application] Zigbee channel 15 utilization is 99.67%!
2023-04-10 11:17:36.233 WARNING (MainThread) [zigpy.application] If you are having problems joining new devices, are missing sensor updates, or have issues keeping devices joined, ensure your coordinator is away from interference sources such as USB 3.0 devices, SSDs, WiFi routers, etc.
2023-04-10 11:17:47.341 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xBDA4:1:0x0702]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:47.482 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xBDA4:1:0x0b04]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:48.539 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x5D1E:1:0x0102]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:49.308 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x02B6:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:49.327 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x02B6:1:0x0b04]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:49.593 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x49BF:1:0x0102]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:50.185 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x2D49:1:0x0b04]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:50.196 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x019C:1:0x0102]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:50.326 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x02B6:1:0x0702]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:50.784 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x2D49:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:50.944 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x8C51:1:0x0702]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:51.163 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x3570:1:0x0008]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:51.218 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x3570:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:51.287 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x2D49:1:0x0702]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:51.798 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x8C51:1:0x0b04]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:17:51.810 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x8C51:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
2023-04-10 11:23:11.969 DEBUG (MainThread) [bellows.ezsp.protocol] Send command readCounters: ()
2023-04-10 11:23:11.974 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'465121a9a52a53d67e'
2023-04-10 11:23:11.995 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'6551a1a9a52a63b092943c25aa559249d64f24abeece0b8bffc66389897e14a7e3cdde6f85ffc7dbd5d2698c4623a9ec763ba5ea758241984c2613b1e070381c0e07bbe5ca659f479a4d9e4f9ff7c3d9d46a35a25190482495207e'
2023-04-10 11:23:11.995 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'87009f7e'
2023-04-10 11:23:11.998 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received readCounters: [[630, 203, 118, 0, 0, 330, 3, 3, 108, 2, 0, 117, 43, 8, 0, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 533, 0, 0, 0, 0, 0, 0, 0, 0]]
2023-04-10 11:23:12.000 DEBUG (MainThread) [bellows.ezsp.protocol] Send command getValue: (<EzspValueId.VALUE_FREE_BUFFERS: 3>,)
2023-04-10 11:23:12.003 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'575621a9fe2a1627057e'
2023-04-10 11:23:12.011 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'7656a1a9fe2a15b3a5dfb37e'
2023-04-10 11:23:12.011 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'8070787e'
2023-04-10 11:23:12.013 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received getValue: [<EzspStatus.SUCCESS: 0>, b'\xfc']
2023-04-10 11:23:12.014 DEBUG (MainThread) [bellows.zigbee.application] Free buffers status EzspStatus.SUCCESS, value: 252
2023-04-10 11:23:12.014 DEBUG (MainThread) [bellows.zigbee.application] ezsp_counters: [MAC_RX_BROADCAST = 630, MAC_TX_BROADCAST = 203, MAC_RX_UNICAST = 118, MAC_TX_UNICAST_SUCCESS = 0, MAC_TX_UNICAST_RETRY = 0, MAC_TX_UNICAST_FAILED = 330, APS_DATA_RX_BROADCAST = 3, APS_DATA_TX_BROADCAST = 3, APS_DATA_RX_UNICAST = 108, APS_DATA_TX_UNICAST_SUCCESS = 2, APS_DATA_TX_UNICAST_RETRY = 0, APS_DATA_TX_UNICAST_FAILED = 117, ROUTE_DISCOVERY_INITIATED = 43, NEIGHBOR_ADDED = 8, NEIGHBOR_REMOVED = 0, NEIGHBOR_STALE = 10, JOIN_INDICATION = 0, CHILD_REMOVED = 0, ASH_OVERFLOW_ERROR = 0, ASH_FRAMING_ERROR = 0, ASH_OVERRUN_ERROR = 0, NWK_FRAME_COUNTER_FAILURE = 0, APS_FRAME_COUNTER_FAILURE = 0, UTILITY = 0, APS_LINK_KEY_NOT_AUTHORIZED = 0, NWK_DECRYPTION_FAILURE = 0, APS_DECRYPTION_FAILURE = 0, ALLOCATE_PACKET_BUFFER_FAILURE = 0, RELAYED_UNICAST = 0, PHY_TO_MAC_QUEUE_LIMIT_REACHED = 0, PACKET_VALIDATE_LIBRARY_DROPPED_COUNT = 0, TYPE_NWK_RETRY_OVERFLOW = 0, PHY_CCA_FAIL_COUNT = 533, BROADCAST_TABLE_FULL = 0, PTA_LO_PRI_REQUESTED = 0, PTA_HI_PRI_REQUESTED = 0, PTA_LO_PRI_DENIED = 0, PTA_HI_PRI_DENIED = 0, PTA_LO_PRI_TX_ABORTED = 0, PTA_HI_PRI_TX_ABORTED = 0, ADDRESS_CONFLICT_SENT = 0, EZSP_FREE_BUFFERS = 252]

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

home-assistant[bot] commented 1 year ago

Hey there @dmulcahey, @adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `zha` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign zha` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


zha documentation zha source (message by IssueLinks)

puddly commented 1 year ago

As the warning says, you have interference problems. PHY_CCA_FAIL_COUNT = 533 confirms that the radio is dropping packets because there is too much noise.

What devices are near your coordinator? Is it plugged in with the provided USB cable and positioned away from interference sources outlined in the warning?
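For readers who want to check their own diagnostics, here is a minimal sketch of how the counters in the `ezsp_counters` log line can be read. The helper names `parse_counters` and `failure_ratio` are made up for this illustration and are not part of bellows or zigpy; the counter values are copied from the log in the original post.

```python
import re

# Excerpt of the ezsp_counters log line from the original post.
COUNTERS_LINE = (
    "ezsp_counters: [MAC_RX_BROADCAST = 630, MAC_TX_BROADCAST = 203, "
    "MAC_RX_UNICAST = 118, MAC_TX_UNICAST_SUCCESS = 0, "
    "MAC_TX_UNICAST_RETRY = 0, MAC_TX_UNICAST_FAILED = 330, "
    "APS_DATA_TX_UNICAST_SUCCESS = 2, APS_DATA_TX_UNICAST_FAILED = 117, "
    "PHY_CCA_FAIL_COUNT = 533]"
)


def parse_counters(line: str) -> dict[str, int]:
    """Hypothetical helper: turn 'NAME = 123' pairs into a dict."""
    return {name: int(value) for name, value in re.findall(r"(\w+) = (\d+)", line)}


def failure_ratio(counters: dict[str, int], prefix: str) -> float:
    """Hypothetical helper: share of unicast transmissions that failed."""
    ok = counters.get(f"{prefix}_TX_UNICAST_SUCCESS", 0)
    failed = counters.get(f"{prefix}_TX_UNICAST_FAILED", 0)
    total = ok + failed
    return failed / total if total else 0.0


counters = parse_counters(COUNTERS_LINE)
print("PHY_CCA_FAIL_COUNT:", counters["PHY_CCA_FAIL_COUNT"])
print(f"MAC unicast failure ratio: {failure_ratio(counters, 'MAC'):.2%}")
print(f"APS unicast failure ratio: {failure_ratio(counters, 'APS_DATA'):.2%}")
```

In this log, no MAC unicast transmission succeeded and 330 failed, which fits the reading that the coordinator is unable to transmit on a busy channel.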

Adminiuga commented 1 year ago

@ShuaWilson you have the same problem. Because of too much RF noise, the ConBee is refusing to start. This is a common problem and was discussed at length in the past.

TeddyLafrite commented 1 year ago

Hi, thanks for the reply. I haven't made any changes in 6 months, so that's curious. I have 7 router devices out of 17 devices in total. The SkyConnect is on a 1-meter USB cable.

TheJulianJES commented 1 year ago

The warning message was only introduced recently. You had the issues before, but no warning message was printed.

ShuaWilson commented 1 year ago

> @ShuaWilson you have the same problem. Because of too much RF noise, the ConBee is refusing to start. This is a common problem and was discussed at length in the past.

Thank you, you are correct. I am extremely new to HA.

avu3 commented 1 year ago

I have the same issue as @TeddyLafrite: an up-and-running, stable system with no issues at all. Since upgrading to 2023.4 I have sensors dropping out. Power cycling them seemed to fix it for a while, except for the latest one today, which would drop after just a few check-ins of being paired.

I tried moving the RPi my HA is using to different locations to rule out WiFi and USB interference. It's on an RPi 3 with SkyConnect and the supplied USB cable.

I suspect that some combination of circumstance introduced in 2023.4 causing the message to appear.
Personally I'm giving it another few days and then I'll probably roll back to 2023.3, which was stable for me.

TheJulianJES commented 1 year ago

> I suspect that some combination of circumstance introduced in 2023.4 causing the message to appear.

Just to explain this again: this message was only added in 2023.4.0 (https://github.com/zigpy/zigpy/pull/1183/). You likely already had interference issues with 2023.3, but the message didn't exist back then, so regardless of how much interference you had, you would not see the warning in your logs.
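To illustrate why a warning like this cannot cause dropouts by itself, here is a rough sketch of a threshold check. The function name and the threshold value are assumptions made up for this example and are not taken from zigpy; see the linked pull request for the real code.

```python
import logging

_LOGGER = logging.getLogger("example.energy_scan")

# Hypothetical threshold for this sketch; the value zigpy uses may differ.
WARN_THRESHOLD_PERCENT = 75.0


def check_channel(channel: int, utilization_percent: float) -> bool:
    """Log a warning if the current channel looks congested.

    Returns True when a warning was logged. The function only reads a
    measurement and reports it; it does not change the network.
    """
    if utilization_percent <= WARN_THRESHOLD_PERCENT:
        return False
    _LOGGER.warning(
        "Zigbee channel %d utilization is %0.2f%%!", channel, utilization_percent
    )
    return True


check_channel(15, 99.67)  # logs a warning
check_channel(11, 12.24)  # stays silent
```

Removing such a check, for example by downgrading, silences the message but leaves the measured utilization unchanged.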

avu3 commented 1 year ago

> > I suspect that some combination of circumstance introduced in 2023.4 causing the message to appear.
>
> Just to explain this again: this message was only added in 2023.4.0 (zigpy/zigpy#1183). You likely already had interference issues with 2023.3, but the message didn't exist back then, so regardless of how much interference you had, you would not see the warning in your logs.

Understood. I don't think there's anything amiss with the new log entry. I think 2023.4.0 introduced something that's causing the traffic on the Zigbee network. The log entry just notes it's been detected.

Or maybe I always had congestion on the Zigbee network and never knew, and it's working in spite of it. And something unrelated changed in 2023.4.0 that's causing my issue of sensors dropping out and not staying connected.

I rolled back to 2023.3.6 and it resolved my issue. I no longer have sensors dropping out.

I only commented to mention I had the same case as originally posted.

MattWestb commented 1 year ago

Just for info, I was getting the warning on my production system:

2023-04-06 11:20:42.150 WARNING (MainThread) [zigpy.application] Zigbee channel 25 utilization is 92.06%!

I have my 2.4 GHz WiFi on channels 11/6, which is not ideal, but the neighbors are using the lower channels. I have IKEA TF and DG on channel 20 and my Thread test networks on channel 15, but they don't seem to have much trouble.

In the production network there is one IKEA CWS3 in very bad surroundings (shielded off) and one 10W that sits down in the mess with the TV, exposed to all the RF interference. It sometimes stops responding and needs rebooting, and then the automations do not work.

I think this function is great for diagnosing the network when you have strange problems!

Diag without link key info. config_entry-zha-998bc857058111eb90e2ad6d9c6e46a8.json.txt

Dickey01 commented 1 year ago

Had the same problems. I used a USB extension cable. Problems solved.

al13nus commented 1 year ago

I'm also getting "Zigbee channel 15 utilization is 98.22%!" and the Zigbee network is losing it. I'm already using a USB cable, but my area is pretty crowded and all 2.4 GHz channels are at >98% utilization. My WiFi is on channel 6 but it doesn't help. Any ideas how to approach this?

dweston commented 1 year ago

I'm setting up a new Home Assistant Yellow PoE (PCB v1.2) and encountering the same issue: `Logger: zigpy.application Source: components/zha/core/gateway.py:205 First occurred: 12:34:12 (2 occurrences) Last logged: 12:34:12

Zigbee channel 15 utilization is 96.20%! If you are having problems joining new devices, are missing sensor updates, or have issues keeping devices joined, ensure your coordinator is away from interference sources such as USB 3.0 devices, SSDs, WiFi routers, etc.`

As the Yellow is being powered PoE on a cable that has the device some distance away from any other interference sources and the Zigbee coordinator is inbuilt, should I consider this is a hardware bug?

outrun0506 commented 1 year ago

Same here with same message. Everything worked fine before 2023.4.4 Update.

So I think this is not only a matter of the error message having been added to HA.

I'm using a SkyConnect stick with a USB extension.

Here is my log output:

Logger: homeassistant.components.zha.core.channels.base Source: components/zha/core/channels/base.py:490 Integration: Zigbee Home Automation (documentation, issues) First occurred: 14. April 2023 um 23:12:59 (35 occurrences) Last logged: 07:08:47

[0x33AF:11:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
[0xA0AF:11:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
[0xA373:11:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), TimeoutError()]
[0xA0AF:11:0x0008]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
[0xA0AF:11:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]

Edit: Here is the export that @puddly asked for. config_entry-zha-a4b67464e650547b32fe0a2744d7dfdc.json.txt I had a look into the export, and it looks like channel 15 is already full. Additionally, my setup is working again with no changes, and it has worked like a charm for the last few days, as it did before. Really strange.

yaggermr commented 1 year ago

Same issue. I never had a problem until I switched to SkyConnect. I have always used a 20 ft extension cable.

ydogandjiev commented 1 year ago

I'm running into the same issue even after moving my Home Assistant Yellow away from my router and other electronics. I'm using the built-in Zigbee radio so putting it on a USB extension cable is not possible. Just ordered an external Sonoff Zigbee stick to try with a USB extension cord. If that helps, that would unfortunately put into question the hardware design of Home Assistant Yellow (but at least I will stop regretting that I migrated my Zigbee network from SmartThings to Home Assistant). 🤔

puddly commented 1 year ago

To anyone chiming in: please edit your comment and attach a ZHA diagnostics file. It will take a few seconds to download:

image

> As the Yellow is being powered PoE on a cable that has the device some distance away from any other interference sources and the Zigbee coordinator is inbuilt, should I consider this is a hardware bug?

No. There are 16 Zigbee channels for a network to exist on (11 through 26). This message just says that channel 15 for you is congested. Channel migration will be in the next release but for now, make sure that your Yellow is positioned away from WiFi access points, mesh systems, USB 3.0 devices, SSDs, refrigerators, etc.

> Everything worked fine before 2023.4.4 Update.

You can see for yourself that nothing relevant changed between 2023.4.3 and 2023.4.4. Even if there were ZHA changes, none can affect network operation like this: noise is entirely an environmental problem and can appear even if you don't change anything yourself, due to new neighbors, reconfiguring WiFi access points, etc. Plus, this warning was only added in 2023.4.0, so you would have not received it in the past no matter how noisy your environment.

TeddyLafrite commented 1 year ago

For information channel for my WiFi 2.4 is 11

Energy scan "11": 12.244260188723507, "12": 62.257682586134884, "13": 36.830390267097734, "14": 92.95959997754716, "15": 97.39286236923465, "16": 96.19660508390695, "17": 68.14622793558128, "18": 68.14622793558128, "19": 19.00785284282869, "20": 97.97769123383605, "21": 98.93395819824867, "22": 98.21983128611214, "23": 98.21983128611214, "24": 98.62178092672917, "25": 95.69133648577223, "26": 62.257682586134884
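As a small sketch of how to read an energy scan like this one, the snippet below ranks the channels from quietest to busiest. The values are copied from the scan above; the function name `quietest_channels` is made up for this example. Keep in mind that a single scan is only a snapshot of the environment at that moment.

```python
# Energy scan values (percent utilization per Zigbee channel) from the comment above.
ENERGY_SCAN = {
    11: 12.244260188723507,
    12: 62.257682586134884,
    13: 36.830390267097734,
    14: 92.95959997754716,
    15: 97.39286236923465,
    16: 96.19660508390695,
    17: 68.14622793558128,
    18: 68.14622793558128,
    19: 19.00785284282869,
    20: 97.97769123383605,
    21: 98.93395819824867,
    22: 98.21983128611214,
    23: 98.21983128611214,
    24: 98.62178092672917,
    25: 95.69133648577223,
    26: 62.257682586134884,
}


def quietest_channels(scan: dict[int, float], count: int = 3) -> list[int]:
    """Return the `count` channels with the lowest measured utilization."""
    return sorted(scan, key=scan.get)[:count]


for channel in quietest_channels(ENERGY_SCAN):
    print(f"channel {channel}: {ENERGY_SCAN[channel]:.2f}%")
```

For this scan the quietest channels are 11, 19, and 13, while the current channel 15 is among the busiest.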

TeddyLafrite commented 1 year ago

So I can change the Zigbee channel to 11 (12.24%) and re-pair all devices. WiFi channel 11 is far from Zigbee channel 11.
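The channel numbers are easy to mix up because WiFi and Zigbee number their 2.4 GHz channels differently. A minimal sketch of the arithmetic, using the standard center frequencies (Zigbee channel k at 2405 + 5·(k − 11) MHz, WiFi channel n at 2407 + 5·n MHz for channels 1 to 13): the 22 MHz WiFi width and the simple "center inside the band" rule in `overlaps` are simplifying assumptions for illustration, since real interference also depends on signal strength and distance.

```python
def zigbee_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz IEEE 802.15.4 (Zigbee) channel, 11-26."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11 through 26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz WiFi channel, 1-13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers WiFi channels 1 through 13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel: int, wifi_channel: int, wifi_width_mhz: float = 22.0) -> bool:
    """Assumed rule: the Zigbee center lies within the WiFi channel's band."""
    distance = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return distance <= wifi_width_mhz / 2


print(zigbee_center_mhz(11), wifi_center_mhz(11))  # 2405 2462
print(overlaps(11, 11))  # False
print([ch for ch in range(11, 27) if overlaps(ch, 11)])  # [21, 22, 23, 24]
```

Under these assumptions, WiFi channel 11 sits on top of Zigbee channels 21 to 24, which lines up with the high readings for those channels in the energy scan, while Zigbee channel 11 is about 57 MHz away.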

image

Joo01 commented 1 year ago

Hi puddly, You wrote: "Channel migration will be in the next release..." Do I understand you correctly, in the next release it will be possible to change the Zigbee channel in HA without having to 'reappair' (new pairing) the devices? If so, that would be great 👍 !!

georaspi commented 1 year ago

Same issue here since the update. My setup was solid for years and now I have sensors missing updates and ZHA crashes.

Config file from ZHA diagnostics attached as well.

config_entry-zha-6f8f5e9038592462910f8cfe62e8883e.json.txt

joshs85 commented 1 year ago

Same issue here. Never a problem until the last update; now my sensors just stop responding. Same message in the logs.

TJLTM commented 1 year ago

I was having no luck even after turning off my WiFi and putting the SkyConnect on the end of a 1 m extension cable. I used my USB breakout and oscilloscope and still saw a lot of noise on the line. My bright idea was to put a USB ground loop isolation device (one of these guys) between the extension cable and the port, and now I have clean, responsive Zigbee.

MattWestb commented 1 year ago

@TJLTM You need to explain 2 things.

  1. What do you mean by a lot of noise on the line: is it on the 5V / GND or on the balanced data lines in the USB?
  2. What is the power supply (or supplies) of your system?

For number 1, I see one large problem: the 180 mA is a little weak for a very sensitive radio device. For number 2, most power supplies are double isolated, so you can never get a ground loop unless you have 2 PSUs that are not isolated (with grounding in the power connector) or have done the cabling very strangely (which is nearly impossible). Most likely your problem comes from a bad voltage regulator on your mainboard that is not delivering clean power to the devices. In that case it is best to put some more caps on the 5V line or, if it is an external PSU, to replace it with a better one that does not do strange things.

One working setup is the ground isolator plus an active USB hub with a good PSU; then everything should work OK with no limits.

Silabs has made reference designs, and if everything is done according to that blueprint the device should be well protected from interference, including from the power supply. But if the chip gets 5V DC with a 1V HF AC overlay, it can't work properly.

Also, this message is a warning, not an error, but it can be interesting if you have other strange problems in the system.

For @puddly: I have restarted my production system 3 more times (updating for the 2 bug fixes and one updated TRV quirk), 2 times without the warning and once with it. Since I am a normal bad user with 110% overlap with my WiFi, I think it is a little strange that it is not complaining more (I have all possible devices on 5.8 GHz, but many are using 2.4 GHz). I only have 2 devices that sometimes have connection problems, but that is device/environment related, and I can move the network after the next ZHA update (I hope).

bvlaicu commented 1 year ago

@puddly @Adminiuga

> Channel migration will be in the next release

Will that require re-pairing the devices?

And regarding the channel utilization, any plans to add a sensor for it? Or better yet, one for the channel number and one for the utilization. This would help people track the utilization when they make changes to their setup.

Thanks for all the hard work!

alpat59 commented 1 year ago

Same problem here, starting from HA 2023.4.6. Around 20 Zigbee devices, connected to a SkyConnect coordinator on channel 15, have not been working for many hours... very bad. I'm receiving the message "Zigbee channel 15 utilization is 99.18%!" with the percentage continuously increasing! But at the same time, there are very strange behaviours:

joshs85 commented 1 year ago

I'm also having issues with covers since this update. I have a group with two shades in it, and if I send the open or close command to the group, it will only open or close one of them most of the time; sometimes it gets both. This used to work perfectly before the update.

puddly commented 1 year ago

@alpat59

> just one single device (aquara temp&hum sensor) is still working without any reason

So the coordinator can receive.

> if I physically switch on/off an ikea zigbee light bulb, the related entity is changing on/off in sync

Same here, it is able to receive. However, you being unable to control devices means that it cannot send.

This points to the coordinator itself being near an interference source. I suggest you physically move it around, use a longer USB extension cable, and follow the suggestions in the warning.

As mentioned above: this warning was recently added, but it's there to notify you of a problem, it's not itself causing one. Think of it like the "check engine" light in your car. Downgrading will hide the warning (because it isn't present in 2023.3.x) but it won't fix the problem that's causing the warning.

This is entirely an environmental problem and the whole reason behind the warning being logged: so you can be aware of what you can do to fix it. Channel migration will be in the next release (likely available only via a service call) and will let you migrate most of your powered devices and Zigbee 3.0 sensors. However, it is a lot simpler and safer to try to first move WiFi routers, WiFi network channels, USB 3.0 devices, SSDs, and other 2.4GHz devices out of the way.

georaspi commented 1 year ago

It does not

> @alpat59
>
> > just one single device (aquara temp&hum sensor) is still working without any reason
>
> So the coordinator can receive.
>
> > if I physically switch on/off an ikea zigbee light bulb, the related entity is changing on/off in sync
>
> Same here, it is able to receive. However, you being unable to control devices means that it cannot send.
>
> This points to the coordinator itself being near an interference source. I suggest you physically move it around, use a longer USB extension cable, and follow the suggestions in the warning.
>
> As mentioned above: this warning was recently added, but it's there to notify you of a problem, it's not itself causing one. Think of it like the "check engine" light in your car. Downgrading will hide the warning (because it isn't present in 2023.3.x) but it won't fix the problem that's causing the warning.
>
> This is entirely an environmental problem and the whole reason behind the warning being logged: so you can be aware of what you can do to fix it. Channel migration will be in the next release (likely available only via a service call) and will let you migrate most of your powered devices and Zigbee 3.0 sensors. However, it is a lot simpler and safer to try to first move WiFi routers, WiFi network channels, USB 3.0 devices, SSDs, and other 2.4GHz devices out of the way.

It actually has definitely something to do with this update. Tens of people (including me) had exactly the same h/w setup beforehand and started having issues only after the update. I have since reverted to 2023.3.6 and everything works like a charm again, as it has for more than 4 years now. Absolutely no h/w changes, nothing at all. Only the HASS update.

So if someone could help fix these issues, it would be greatly appreciated.

puddly commented 1 year ago

It actually has definitely something to do with this update.

To help figure out why downgrading makes any difference for you, please upload here or email me Core debug logs (https://www.home-assistant.io/integrations/zha/#debug-logging) of both 2023.3.6 and 2023.4.6 starting up and running for about ten minutes. Make sure to try to control devices with both.

The diagnostic info you uploaded (different from a debug log!) is for 2023.3.6, not 2023.4.x, so it doesn't contain information about your environment.

dweston commented 1 year ago

For what it may be worth, here is the debug log https://pastebin.com/dl/tF2a1exA from a recent reboot of a Home Assistant Yellow PoE (HAYPoE) running: Home Assistant 2023.4.6 Supervisor 2023.04.1 Operating System 10.1

The HAYPoE sits isolated on its own, powered via PoE from a Netgear JGS516PE Switch. It is more than a meter away from any other device. The nearest WiFi AP is more than 10 meters away.

michaelhomeassistant commented 1 year ago

Hello, the energy scan is not displayed in my config. How can I change that?

bvlaicu commented 1 year ago

If anyone else is interested in monitoring the channel utilization, here are some sensors based on the ZHA diagnostics API. You can find <your-zha-coordinator-config-entry-id> in the diagnostics JSON: go to the ZHA service -> Download diagnostics to get the JSON and look inside for data -> config_entry -> entry_id. You will also need a long-lived access token to call the HA API. Store it in secrets.yaml to be safe; the longLivedToken entry should look like Bearer <actual-token-here>.

  - platform: rest
    resource: https://<your-ha-host>:<your-ha-port>/api/diagnostics/config_entry/<your-zha-coordinator-config-entry-id>
    # unique_id: zha_diagnostics
    name: zha_diagnostics
    # value_template: '{{ value_json.data.energy_scan[0] }}'
    value_template: 'OK'
    json_attributes_path: "$.data"
    json_attributes:
      - "energy_scan"
      - "application_state"
    method: GET
    scan_interval: 600
    headers:
      Authorization: !secret longLivedToken
      Content-Type: application/json
  - platform: template
    sensors:
      zha_channel:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'application_state')['network_info']['channel'] }}"
        friendly_name: ZHA Channel
      zha_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')[states('sensor.zha_channel')] }}"
        friendly_name: ZHA Channel Utilization
        unit_of_measurement: '%'
      zha_channel_11_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['11'] }}"
        friendly_name: ZHA Channel 11 Utilization
        unit_of_measurement: '%'
      zha_channel_12_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['12'] }}"
        friendly_name: ZHA Channel 12 Utilization
        unit_of_measurement: '%'
      zha_channel_13_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['13'] }}"
        friendly_name: ZHA Channel 13 Utilization
        unit_of_measurement: '%'
      zha_channel_14_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['14'] }}"
        friendly_name: ZHA Channel 14 Utilization
        unit_of_measurement: '%'
      zha_channel_15_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['15'] }}"
        friendly_name: ZHA Channel 15 Utilization
        unit_of_measurement: '%'
      zha_channel_16_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['16'] }}"
        friendly_name: ZHA Channel 16 Utilization
        unit_of_measurement: '%'
      zha_channel_17_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['17'] }}"
        friendly_name: ZHA Channel 17 Utilization
        unit_of_measurement: '%'
      zha_channel_18_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['18'] }}"
        friendly_name: ZHA Channel 18 Utilization
        unit_of_measurement: '%'
      zha_channel_19_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['19'] }}"
        friendly_name: ZHA Channel 19 Utilization
        unit_of_measurement: '%'
      zha_channel_20_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['20'] }}"
        friendly_name: ZHA Channel 20 Utilization
        unit_of_measurement: '%'
      zha_channel_21_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['21'] }}"
        friendly_name: ZHA Channel 21 Utilization
        unit_of_measurement: '%'
      zha_channel_22_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['22'] }}"
        friendly_name: ZHA Channel 22 Utilization
        unit_of_measurement: '%'
      zha_channel_23_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['23'] }}"
        friendly_name: ZHA Channel 23 Utilization
        unit_of_measurement: '%'
      zha_channel_24_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['24'] }}"
        friendly_name: ZHA Channel 24 Utilization
        unit_of_measurement: '%'
      zha_channel_25_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['25'] }}"
        friendly_name: ZHA Channel 25 Utilization
        unit_of_measurement: '%'
      zha_channel_26_utilization:
        value_template: "{{ state_attr('sensor.zha_diagnostics', 'energy_scan')['26'] }}"
        friendly_name: ZHA Channel 26 Utilization
        unit_of_measurement: '%'
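
The same lookups can also be done outside of templates. Below is a minimal Python sketch, assuming the downloaded diagnostics JSON has the layout the sensors above rely on (data -> application_state -> network_info -> channel, and data -> energy_scan keyed by channel number as a string). The function names and the sample numbers are made up for illustration and are not part of ZHA.

```python
from typing import Any


def current_channel_utilization(diagnostics: dict[str, Any]) -> tuple[int, float]:
    """Return (channel, utilization %) for the channel the network is on."""
    data = diagnostics["data"]
    channel = int(data["application_state"]["network_info"]["channel"])
    # energy_scan is assumed to be keyed by the channel number as a string,
    # which is how the template sensors above index it.
    utilization = float(data["energy_scan"][str(channel)])
    return channel, utilization


def quietest_channels(diagnostics: dict[str, Any], count: int = 3) -> list[tuple[int, float]]:
    """Return the `count` channels with the lowest reported utilization."""
    scan = diagnostics["data"]["energy_scan"]
    ranked = sorted(
        ((int(ch), float(value)) for ch, value in scan.items()),
        key=lambda pair: (pair[1], pair[0]),  # lowest utilization first
    )
    return ranked[:count]


# Hypothetical, heavily trimmed diagnostics payload (values are invented).
sample_diagnostics = {
    "data": {
        "application_state": {"network_info": {"channel": 15}},
        "energy_scan": {"11": 80.1, "15": 99.67, "20": 12.5, "25": 4.2},
    }
}

channel, utilization = current_channel_utilization(sample_diagnostics)
print(f"Channel {channel} utilization: {utilization}%")
print("Quietest channels:", quietest_channels(sample_diagnostics, count=2))
```

To run it against a real download, load the file with `json.load` and pass the resulting dictionary instead of `sample_diagnostics`.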

JeffSteinbok commented 1 year ago

I'm not sure if the notifications and the disconnects are related.

I noticed much more frequent disconnects since 2023.4 and it did warn me that I was at 75% or so. I moved the receiver, and it now says I have about 20% capacity used. But I'm still seeing disconnects.

@puddly, any chance something else changed recently unrelated to this message?

puddly commented 1 year ago

I'm not sure if the notifications and the disconnects are related.

They can be, or they are independent. If interference is preventing the coordinator from receiving updates or sending data back to devices, it would cause some to eventually disconnect. If it's just localized around the coordinator but the devices are connected via other routers, you may have a different issue.

What specific devices are these and what routers are they joined through? How far away are they from their parent router?

any chance something else changed recently unrelated to this message?

Not really. Network connectivity, disconnects, etc. is not something that is managed by software. It's handled by the coordinator's firmware, the firmware running on each of your devices, environmental conditions (e.g. noise, temperature swings causing batteries to change their voltage), and the way all these devices talk to one another.

JeffSteinbok commented 1 year ago

What specific devices are these and what routers are they joined through? How far away are they from their parent router?

Aqara temperature & humidity sensors, two of them. One is about 20 ft, and the other maybe 25 ft, away from the only router (a USB Zigbee stick). I moved the stick closer to the floor and the network congestion went away, but I still saw a drop.

Trolann commented 1 year ago

It actually has definitely something to do with this update.

To help figure out why downgrading makes any difference for you, please upload here or email me Core debug logs (https://www.home-assistant.io/integrations/zha/#debug-logging) of both 2023.3.6 and 2023.4.6 starting up and running for about ten minutes. Make sure to try to control devices with both.

The diagnostic info you uploaded (different from a debug log!) is for 2023.3.6, not 2023.4.x, so it doesn't contain information about your environment.

Do you still need this @puddly? I downgraded to 3.6 after issues like others have reported with ZHA integration where everything Zigbee disappeared and it looked as if I had to start over.

I was on HassOS and tried restoring from backup on a 2023.3.6 version and it didn't help at all. I then tried taking my 2023.4.1 (maybe 4.2, can't remember) backup .tar and copying the config to a 2023.3.6 container (NOT HassOS) and bringing it back online. It took about an hour to get everything to populate, but everything came back up and I've left it on 2023.3.6.

I can try either a VM with one of my backups or changing the container version around and grabbing logs. Lmk. I'm ok with linux but not a hass expert; would need to know what logs.

puddly commented 1 year ago

Aqara temperature & humidity sensors, two of them.

Older generation Aqara sensors suffer from two problems: they join via the first parent router they detect, which often times is very far away from them. I'm not sure if your setup contains just a coordinator and two sensors, or also includes a USB-powered router (i.e. a second coordinator running router firmware). In either case, make sure they have chosen an appropriate parent and re-join them to the network if they have not.

The second problem is that their battery reporting is unreliable. The displayed percentage will not meaningfully change over the lifetime of the sensor and older stock occasionally ships with old batteries. When the battery depletes, these sensors often start behaving erratically or becoming unresponsive. Try fresh batteries as well.

would need to know what logs.

Of course. The main /config/home-assistant.log with both versions, after letting HA start up and run for 15-20 minutes each time.
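
Before sharing a log, it can help to confirm that it actually contains the utilization warning. The following is a rough Python sketch, assuming the message text matches the warning quoted at the top of this issue ("Zigbee channel 15 utilization is 99.67%!"). The helper name and the sample log lines are invented for illustration; only the message wording comes from the report.

```python
import re
from typing import Iterable, Iterator

# Matches the warning text quoted in the original report.
UTILIZATION_RE = re.compile(r"Zigbee channel (\d+) utilization is (\d+(?:\.\d+)?)%")


def find_utilization_warnings(lines: Iterable[str]) -> Iterator[tuple[int, float]]:
    """Yield (channel, utilization %) for every matching log line."""
    for line in lines:
        match = UTILIZATION_RE.search(line)
        if match:
            yield int(match.group(1)), float(match.group(2))


# Illustrative lines only; the timestamp/thread prefix is an assumption.
sample_log = [
    "2023-04-10 11:17:36.000 WARNING (MainThread) [zigpy.application] "
    "Zigbee channel 15 utilization is 99.67%! If you are having problems ...",
    "2023-04-10 11:17:40.000 INFO (MainThread) [homeassistant.core] unrelated line",
]

for channel, utilization in find_utilization_warnings(sample_log):
    print(f"channel {channel}: {utilization}%")

# Against a real file, pass the open file object instead of sample_log:
# with open("/config/home-assistant.log", encoding="utf-8") as log:
#     print(list(find_utilization_warnings(log)))
```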

killarema commented 1 year ago

I'm writing in Russian. Once GitHub lets me off the hook for my button-mashing, I'll sort everything out. For now I have restrictions. Thanks, everyone.

JeffSteinbok commented 1 year ago

Older generation Aqara sensors suffer from two problems: they join via the first parent router they detect, which often times is very far away from them. I'm not sure if your setup contains just a coordinator and two sensors, or also includes a USB-powered router (i.e. a second coordinator running router firmware). In either case, make sure they have chosen an appropriate parent and re-join them to the network if they have not.

I only have one router, so this wouldn't be an issue.

The second problem is that their battery reporting is unreliable. The displayed percentage will not meaningfully change over the lifetime of the sensor and older stock occasionally ships with old batteries. When the battery depletes, these sensors often start behaving erratically or becoming unresponsive. Try fresh batteries as well.

ACK - will do that.

Of course. The main /config/home-assistant.log with both versions, after letting HA start up and run for 15-20 minutes each time.

ACK - I'll see when I have time to get that for you. Note, went back to 2023.4.6 and things are super-stable now. Does this log have sensitive data?

puddly commented 1 year ago

Does this log have sensitive data?

The log does contain sensitive data (location data and network keys) so you're welcome to email it to me (puddly3@gmail.com) if you don't want to upload it here. I forgot to mention to enable ZHA debug logging in your config file as well.

JeffSteinbok commented 1 year ago

Ok. Challenge is finding the time to upgrade again, just to downgrade back down. I had to downgrade for another integration causing issues.

If there is anyone still on the newer version seeing this, maybe they could?

ejpenney commented 1 year ago

Personally I had a ton of issues with devices dropping, becoming unavailable after upgrading to 2023.4, found this thread and subscribed. The vast majority of my issues were Aqara sensors, so I pushed forward, re-connecting as needed and the issues have ... more or less gone away. Admittedly, I spent more than an hour each with some of these devices, factory resetting, adding, testing, re-adding. Throwing things...

Something did change, but the network eventually recovered. I'm back down to the same sorts of unreliability/instability I'm used to. Aqara devices fall off the network like they always have; at this point I'm writing this off as 2.4 GHz interference from nearby WiFi routers that automatically switch channels. Unfortunately, channel 15 appears to be inconveniently placed with regard to WiFi interference.

I also thought it was interesting that, according to the ZHA documentation, channel 15 is the default channel because it's the most widely compatible. It sounds like it's possible to change channels, but that it's risky and may require a lot of re-adding? That also assumes all your devices are compatible with alternative channels. I'm a little unclear from the discussion in this thread: is an improvement to this process forthcoming?

There's really only so much interference prevention we can do before badly behaved WiFi routers (our own and our neighbor's) end up winning. I like the new messaging though, it gives us a neat window into our environment and can help make some intelligent decisions about channel assignment.
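
To make the channel placement concrete, here is a small Python sketch that lists which 2.4 GHz WiFi channels overlap a given Zigbee channel. It uses the standard center frequencies (Zigbee channel k at 2405 + 5 × (k − 11) MHz, WiFi channel n at 2407 + 5 × n MHz) and assumes a 22 MHz wide WiFi channel and a 2 MHz wide Zigbee channel. Those widths and the function names are assumptions for illustration; routers using 20 or 40 MHz widths will give different results.

```python
ZIGBEE_WIDTH_MHZ = 2   # assumed occupied bandwidth of a Zigbee channel
WIFI_WIDTH_MHZ = 22    # assumed WiFi channel width (varies by router mode)


def zigbee_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Zigbee channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11-26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz WiFi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("only WiFi channels 1-13 are handled here")
    return 2407 + 5 * channel


def overlapping_wifi_channels(zigbee_channel: int) -> list[int]:
    """WiFi channels whose band overlaps the given Zigbee channel."""
    zigbee_center = zigbee_center_mhz(zigbee_channel)
    # Two bands overlap when their centers are closer than the sum of
    # their half-widths.
    limit = (ZIGBEE_WIDTH_MHZ + WIFI_WIDTH_MHZ) / 2
    return [
        wifi
        for wifi in range(1, 14)
        if abs(zigbee_center - wifi_center_mhz(wifi)) < limit
    ]


for zigbee in (11, 15, 20, 25):
    print(zigbee, zigbee_center_mhz(zigbee), overlapping_wifi_channels(zigbee))
```

Under these assumptions, Zigbee channel 15 (2425 MHz) falls in the gap between WiFi channels 1 and 6 and overlaps only WiFi channels 2 through 5, so a router that automatically hops onto one of those would land on top of it.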

georaspi commented 1 year ago

ACK - I'll see when I have time to get that for you. Note, went back to 2023.4.6 and things are super-stable now. Does this log have sensitive data?

You mean 2023.3.6, right? This is the last stable version I can get to...

JeffSteinbok commented 1 year ago

ACK - I'll see when I have time to get that for you. Note, went back to 2023.4.6 and things are super-stable now. Does this log have sensitive data?

You mean 2023.3.6, right? This is the last stable version I can get to...

No. 2023.4.6 is working fine for me; it has had no disconnects for days. It broke in 2023.5.0 and was very flaky.

JeffSteinbok commented 1 year ago

I suspect this is a duplicate of #92999 - I have a SiliconLabs coordinator.

MattWestb commented 1 year ago

I suspect this is a duplicate of #92999 - I have a SiliconLabs coordinator.

Some of the problems, yes, if you are using old firmware in the coordinator, but it also fixed some lost config parameters for newer firmware, so it's worth testing if you are having problems.

theglus commented 1 year ago

I'm having the same issue with 2023.5.4, but it's saying 95.69% utilization.

puddly commented 1 year ago

Closing because this is not a bug: the warning indicates an environmental problem that was present before the warning message was added. A channel migration button will be in the next major HA release to assist with this.

Note, there was an independent problem affecting the HUSBZB-1 that was recently fixed.