Open Antriel opened 3 weeks ago
Hey there @dmulcahey, @adminiuga, @puddly, @thejulianjes, mind taking a look at this issue as it has been labeled with an integration (zha
) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
zha documentation zha source (message by IssueLinks)
I am getting a very similar issue about every 10 minutes, where after upgrade I get the NCP Entered Failed State Error and another one for Task exception was never retrieved. This last one has different stacktrace inside:
`Logger: homeassistant Source: components/zha/core/cluster_handlers/init.py:82 First occurred: 1:51:15 PM (1 occurrences) Last logged: 1:51:15 PM
Error doing job: Task exception was never retrieved (None) Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 682, in async_device_initialized await self._async_device_rejoined(zha_device) File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 726, in _async_device_rejoined await zha_device.async_configure() File "/usr/src/homeassistant/homeassistant/components/zha/core/device.py", line 604, in async_configure await self.identify_ch.trigger_effect( File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/init.py", line 82, in wrapper return await RETRYABLE_REQUEST_DECORATOR(func)(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/zigpy/util.py", line 131, in retry return await func() ^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/zigpy/zcl/init.py", line 377, in request return await self._endpoint.request( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/zigpy/endpoint.py", line 265, in request return await self.device.request( ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/zigpy/device.py", line 339, in request await send_request() File "/usr/local/lib/python3.12/site-packages/zigpy/application.py", line 841, in request await self.send_packet( File "/usr/local/lib/python3.12/site-packages/bellows/zigbee/application.py", line 857, in send_packet await self._ezsp.setExtendedTimeout(device.ieee, True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'NoneType' object has no attribute 'setExtendedTimeout'`
By the way, I'm using ezsp radio on a Tasmota flashed Sonoff Zigbee Bridge
having same problem - any progress on this one??
After this issue happens, I get:
WARNING (MainThread) [homeassistant.helpers.entity] Update of binary_sensor.rpi_power_status is taking over 10 seconds
, and then every 30 seconds this line repeats:WARNING (MainThread) [homeassistant.components.binary_sensor] Updating rpi_power binary_sensor took longer than the scheduled update interval 0:00:30
.
Both this and #116862 are not issues that can be fixed.
From the log above, it looks like something is stuttering Home Assistant's event loop by 30+ seconds: any delay of more than 6 seconds will cause the Zigbee radio to assume the connection between it and the computer has been broken and the radio will reset.
You likely have some rogue custom integration, addon, or filesystem problem that causes huge delays.
I assume the rpi_power_status
is just a consequence of staying in invalid state, I only included it for completeness, as I don't know what could be important.
The assumption that the zigbee radio connection has been broken, and HA wanting to reset it... I suppose that's fine (if not, that's secondary issue), but it shouldn't be left in invalid state without it being able to properly restart itself. I'm not sure if it's just some unhandled exception, or the actual driver bugs out, or w/e.
Parts of my zigbee network are pretty unstable at times, and maybe that's causing the issue. Some end devices need around 4 dedicated routers to hop through, and they often end up unresponsive after a few days – I'm not sure if that might be the source of this issue. Regardless, it shouldn't break the entire network.
Is there any way to diagnose which integration/addon/filesystem problem could be the cause? I only have 2 unofficial addons that use bluetooth and don't have support for anything zigbee, pretty much everything is else official HA stuff, or some widely used ones like Plex. I don't see anything but ZHA that could interfere with zigbee.
The problem
Since 2024.6.3 I sometimes (around once a day) find my HA unresponsive. I can connect via app/web, but it gets stuck on loading. I had this error before: https://github.com/home-assistant/core/issues/116862, and the last update fixed that issue, but unfortunately I now get another error. When it happens, no zigbee device/automations work, although non-zigbee devices keep being able to talk to HA (I can see data history for them afterwards). The only thing that helps is either disconnecting and reconnecting the zigbee USB (SLZB-06m), or rebooting the server (RPI5) by replugging the power.
What version of Home Assistant Core has the issue?
core-2024.6.3
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
zha
Link to integration documentation on our website
https://www.home-assistant.io/integrations/zha/
Diagnostics information
config_entry-zha-3887042cefb3201b61d2be45b6b9e134.json
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
After this issue happens, I get:
WARNING (MainThread) [homeassistant.helpers.entity] Update of binary_sensor.rpi_power_status is taking over 10 seconds
, and then every 30 seconds this line repeats:WARNING (MainThread) [homeassistant.components.binary_sensor] Updating rpi_power binary_sensor took longer than the scheduled update interval 0:00:30
.