Closed thefunkygibbon closed 2 months ago
Hey there @dmulcahey, @adminiuga, @puddly, @thejulianjes, mind taking a look at this issue as it has been labeled with an integration (zha
) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
zha documentation zha source (message by IssueLinks)
Same problem here
Encountered same issue on 2023.12.4 error_log-3.txt
Tried to reconfigure network, reboot, now integration is not showing "initializing" but no zigbee device is working. Had to re-pair every zigbee device to work again. (possibly due to network reconfig I did when stuck initializing?)
Diagnostics: zha-b1738f4b724427bec34bcc396a7b0ff4-Zigbee Coordinator-e8e28ba646e99c7f83d95fb82f306659.json.txt config_entry-zha-b1738f4b724427bec34bcc396a7b0ff4.json.txt
@cdalexndr you can do a full system reboot Settings > System > Hardware > Advanced options> Reboot system. Or less recommended: unplug it and replug it back in. You do not need to repair (I have not needed to). After rebooting, ZHA should work again. For me, it works anywhere between 3 hours and 24 hours before it needs another reboot.
Here are my logs, I am experiencing the same issue. home-assistant_2024-01-08T14-00-17.880Z.log
I am running Core 2023.1.2, Supervisor 2023.12.0, OS 11.3, on a Raspberry Pi 4. (HUSBZB-1) HubZ Smart Home Controller - Standard Com Port, s/n: 1160046D - Silicon Labs
This issue is probably a duplicate of this: https://github.com/home-assistant/core/issues/105506
I'm experiencing this issue as well (after seeing the same issue start in the 2023.12.x releases as documented in #105445 and related tickets). ZHA was stable for me through 2023.11.3, and has not worked well since. I avoided the 2023.12.x releases all together due to these bugs, but updated to 2024.1.1 earlier this week. ZHA worked for a few days, but as of this afternoon, has started falling into the "Initializing..." As of now, I can't get it to recover, even with a full system reboot and power off. I'm using an HA Yellow with the built-in Zigbee radio.
Here are the logs with debugging enabled since the last reboot. ZHA never comes online and stays stuck in initializing:
home-assistant_2024-01-09T06-09-23.858Z.log
I'll likely need to revert to 2023.11.3 again, but I'm not sure how long I can stay on that old of a version. I don't suppose there's been any discussion of reverting ZHA to the 2023.11.3 code base until these issues can be resolved?
I can confirm the issues of ZHA instability are still present even on the latest. I had a stable 2024.1.2 for a few days but since yesterday ZHA just randomly restarted twice, 6 hours apart, :( the 2nd time the system was unstable for an hour before recovering.
Will try to see if I can manage to capture logs
Same issue here.
Same issue here.
Same thing here:
I run HAOS on an NUC, have the SkyConnect connected via USB extension cable (like you're supposed to), got the 2.4 update to the Silicon Labs Multiprotocol to 2.4.0, things started breaking... hours later I updated to 2.4.1... still broken... another few hours 2.4.2 was pushed and I upgraded.
Since 2.4.2 it's been up a day, then randomly the ZHA integration goes back to "Failed Setup Will Retry"
What's worse, I'm running both Zigbee AND Thread on the SkyConnect... so BOTH type of devices (85 of them) are broken... including lights.
Wife Acceptance Factor is dropping rapidly.
How do I downgrade back to 2.3.2??
Had the same issue, Although my error is specifically:
async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
Downgrading to 2023.12.4 seems to have caused less problems after I restarted the zigbee router based devices.
Wife Acceptance Factor is dropping rapidly.
Know what you mean. Home automations (including things we have come to rely on) being broken for months is not winning me any points. I had to revert a bunch of things to failsafe mode and find workarounds for a bunch of other things. Overall this is causing me a significant amount of work and effort.
@cdalexndr: upgrade to 2024.1.2. There were may bugs fixed between 2023.12.4 and then.
Multi-PAN has issues independent of ZHA, some of which will be addressed in an update scheduled for release very soon. If you're having reloads and using multi-PAN, this isn't a ZHA issue, nor something ZHA can fix. Be aware that multi-PAN is still in the experimental phase (though improving) so if you need stability, I strongly suggest using separate sticks for Zigbee and Thread (or using an external Thread border router).
so was there anything obvious in my logs? do you want/need me to do anything to help get to the bottom of this?
were the libraries which were changed (reverted) in 1.1 changed back again in 1.2 or something?
2024.1.1 to 2024.1.2 was a very tiny change and there would be no difference between how the two behave network-wise.
What exact coordinator are you using?
For me the issues started in the 2023.12.X releases. I am with Sonoff Dongle P
Even now I see random reboots, example:
Traceback (most recent call last): File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 1098, in request_callback_rsp await self.request(request, timeout=timeout, **response_params) File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 1052, in request self._uart.send(frame) ^^^^^^^^^^^^^^^ AttributeError: 'NoneType' object has no attribute 'send'
This is the indicator that re-init has been done
Just a minute before, a lot of timeouts
`Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 1098, in request_callback_rsp
await self.request(request, timeout=timeout, **response_params)
File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 1059, in request
response = await response_future
^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 1097, in request_callback_rsp
async with async_timeout.timeout(timeout):
File "/usr/local/lib/python3.11/site-packages/async_timeout/__init__.py", line 141, in __aexit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.11/site-packages/async_timeout/__init__.py", line 228, in _do_exit
raise asyncio.TimeoutError
TimeoutError`
Can confirm issues since 2023.12.x aswell. The Skyconnect seems to be crashing or loosing the connection, my logs from the multiprotocoll integration are full of some messages like trying to conncet with baudrate X while its trying different baudrates before it stops overall.
Physically reconnecting the Skyconnect and starting the integration again fixes the issue temporarly. It mostly crashes at 2-3AM in the night. With 2024.0 it was stable for a couple of days, now we are back to 24h.
my coordinator as it is currently is a sonoff brigge flashed with tasmota. unlike others here.
Let me set one up to test. I've been running my home network on a Silvercrest gateway without issues for the past day so perhaps it's something specific to the Sonoff.
I'm also seeing these issues since the 2023.12.x update using an HA Yellow, so it's not just Sonoff. The coordinator built into the Yellow hardware seems to trigger the issue as well. I uploaded my debug logs previously in this thread before migrating back to 2023.11.3 since the network was largely unusable, but let me know if you need more logs and I can try to upgrade again.
On Thu, Jan 11, 2024, 15:09 puddly @.***> wrote:
Let me set one up to test. I've been running my home network on a Silvercrest gateway without issues for the past day so perhaps it's something specific to the Sonoff.
— Reply to this email directly, view it on GitHub https://github.com/home-assistant/core/issues/107490#issuecomment-1888047267, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACLHPI4U766AKRVB3IQUHTYOBPILAVCNFSM6AAAAABBQRKL4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBYGA2DOMRWG4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>
There were a lot of changes between 2023.12.4 and 2024.1.0 so please try the latest version. If you still have issues, post a debug log of the integration reload.
Multi-PAN issues are not related to ZHA unless downgrading solves the problem. Keep in mind that ZHA prior to 2023.12.0 did not notify you when your coordinator was offline or unresponsive so it's very possible that you're not actually seeing any new issues that were not present in the past.
I tested 2024.1.2 and that's where I had the most recent issues on my HA Yellow. The logs above are from that version. Just noting that the issues started in 2023.12.x. Prior to that, ZHA was rock solid. Ever since, it's been very flaky.
On Thu, Jan 11, 2024 at 3:20 PM puddly @.***> wrote:
There were a lot of changes between 2023.12.4 and 2024.1.0 so please try the latest version. If you still have issues, post a debug log of the integration reload.
Multi-PAN issues are not related to ZHA unless downgrading solves the problem. Keep in mind that ZHA prior to 2023.12.0 did not notify you when your coordinator was offline or unresponsive so it's very possible that you're not actually seeing any new issues that were not present in the past.
— Reply to this email directly, view it on GitHub https://github.com/home-assistant/core/issues/107490#issuecomment-1888060224, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACLHPPTUJ3JJSRDNTVZWRDYOBQRPAVCNFSM6AAAAABBQRKL4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBYGA3DAMRSGQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>
I tested 2024.1.2 and that's where I had the most recent issues on my HA Yellow. The logs above are from that version. Just noting that the issues started in 2023.12.x. Prior to that, ZHA was rock solid. Ever since, it's been very flaky. … On Thu, Jan 11, 2024 at 3:20 PM puddly @.> wrote: There were a lot of changes between 2023.12.4 and 2024.1.0 so please try the latest version. If you still have issues, post a debug log of the integration reload. Multi-PAN issues are not related to ZHA unless downgrading solves the problem. Keep in mind that ZHA prior to 2023.12.0 did not notify you when your coordinator was offline or unresponsive so it's very possible that you're not actually seeing any new issues that were not present in the past. — Reply to this email directly, view it on GitHub <#107490 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACLHPPTUJ3JJSRDNTVZWRDYOBQRPAVCNFSM6AAAAABBQRKL4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBYGA3DAMRSGQ . You are receiving this because you are subscribed to this thread.Message ID: @.>
If this is the case are you willing to try something drastic to help identify this? Would you be willing to try running the most recent version with all other integrations disabled? Just for a bit to see if the stability issue goes away?
The challenge is that it often takes a day or more for th issue to crop up (but then it tends to stay -- even full system reboots wouldn't bring it back last time -- I had to downgrade to get it working again). Would it work to wait to disable the other integrations until the issue crops up, and then turn the other integrations off? If so, I may be able to do that, but this is also my house, and not a test site, so my ability to have extended downtime is a bit limited (hence why I had to revert to 2023.11.3 where things are at least stable).
On Thu, Jan 11, 2024 at 5:06 PM David F. Mulcahey @.***> wrote:
I tested 2024.1.2 and that's where I had the most recent issues on my HA Yellow. The logs above are from that version. Just noting that the issues started in 2023.12.x. Prior to that, ZHA was rock solid. Ever since, it's been very flaky. … <#m3367827849832089470> On Thu, Jan 11, 2024 at 3:20 PM puddly @.> wrote: There were a lot of changes between 2023.12.4 and 2024.1.0 so please try the latest version. If you still have issues, post a debug log of the integration reload. Multi-PAN issues are not related to ZHA unless downgrading solves the problem. Keep in mind that ZHA prior to 2023.12.0 did not notify you when your coordinator was offline or unresponsive so it's very possible that you're not actually seeing any new issues that were not present in the past. — Reply to this email directly, view it on GitHub <#107490 (comment) https://github.com/home-assistant/core/issues/107490#issuecomment-1888060224>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACLHPPTUJ3JJSRDNTVZWRDYOBQRPAVCNFSM6AAAAABBQRKL4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBYGA3DAMRSGQ https://github.com/notifications/unsubscribe-auth/AACLHPPTUJ3JJSRDNTVZWRDYOBQRPAVCNFSM6AAAAABBQRKL4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBYGA3DAMRSGQ . You are receiving this because you are subscribed to this thread.Message ID: @.>
If this is the case are you willing to try something drastic to help identify this? Would you be willing to try running the most recent version with all other integrations disabled? Just for a bit to see if the stability issue goes away?
— Reply to this email directly, view it on GitHub https://github.com/home-assistant/core/issues/107490#issuecomment-1888164831, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACLHPLMZP5C4WDPMHMSYBTYOB5BZAVCNFSM6AAAAABBQRKL4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBYGE3DIOBTGE . You are receiving this because you are subscribed to this thread.Message ID: @.***>
It’s worth a shot and I completely understand the impact this would have. No worries either way.
I have the same problem, I have to reboot 2-3 times for ZHA to startup correctly. I am using a Sonoff Zigbee 3.0 dongle. I had this occur before updating to 2024.1, but since then the Home Assistant frontend crashes completely after 1-2 minutes whenever ZHA does not start. I am not 100% sure that these two issues are related though.
These are the related log messages:
2024-01-13 20:07:45.709 DEBUG (MainThread) [homeassistant.components.zha] ZHA storage file does not exist or was already removed
2024-01-13 20:07:50.431 ERROR (bellows.thread_0) [bellows.uart] CRC error in frame b'9c2791907e' (b'9190' != b'140c')
2024-01-13 20:07:56.889 DEBUG (MainThread) [homeassistant.components.zha] Failed to set up ZHA
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/protocol.py", line 74, in command
return await future
^^^^^^^^^^^^
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 163, in async_setup_entry
zha_gateway = await ZHAGateway.async_from_config(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 193, in async_from_config
await instance.async_initialize()
File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 211, in async_initialize
await app.startup(auto_form=True)
File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 226, in startup
await self.initialize(auto_form=auto_form)
File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 142, in initialize
await self.load_network_info(load_devices=False)
File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 251, in load_network_info
(nwk,) = await ezsp.getNodeId()
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 215, in _command
return await self._protocol.command(name, *args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/protocol.py", line 73, in command
async with asyncio_timeout(EZSP_CMD_TIMEOUT):
File "/usr/local/lib/python3.11/asyncio/timeouts.py", line 111, in __aexit__
raise TimeoutError from exc_val
TimeoutError
Same problem
2024-01-14 21:28:18.852 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: EzspError('EZSP is not running') 2024-01-14 21:28:18.853 WARNING (MainThread) [zigpy.application] Watchdog failure Traceback (most recent call last): File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 665, in _watchdog_loop await self.watchdog_feed() File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 647, in watchdog_feed await self._watchdog_feed() File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 999, in _watchdog_feed (res,) = await self._ezsp.readCounters() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 212, in _command raise EzspError("EZSP is not running") bellows.exception.EzspError: EZSP is not running 2024-01-14 21:30:38.990 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140501092979264] Error handling message: Unknown error (unknown_error) M from 192.168.x.x (Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36) Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/components/websocket_api/decorators.py", line 26, in _handle_async_response await func(hass, connection, msg) File "/usr/src/homeassistant/homeassistant/components/zha/websocket_api.py", line 1047, in websocket_get_configuration zha_gateway = get_zha_gateway(hass) ^^^^^^^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/components/zha/core/helpers.py", line 459, in get_zha_gateway raise ValueError("No gateway object exists") ValueError: No gateway object exists 2024-01-14 21:30:39.125 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140501092979264] Error handling message: Unknown error (unknown_error) M from 192.168.x.x (Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36) Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/components/websocket_api/decorators.py", line 26, in _handle_async_response await func(hass, connection, msg) File "/usr/src/homeassistant/homeassistant/components/zha/websocket_api.py", line 1140, in websocket_get_network_settings backup = async_get_active_network_settings(hass) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/components/zha/api.py", line 43, in async_get_active_network_settings app = get_zha_gateway(hass).application_controller ^^^^^^^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/components/zha/core/helpers.py", line 459, in get_zha_gateway raise ValueError("No gateway object exists") ValueError: No gateway object exists 2024-01-14 21:30:45.635 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140501092979264] Error handling message: Unknown error (unknown_error) M from 192.168.x.x (Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36) Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/components/websocket_api/decorators.py", line 26, in _handle_async_response await func(hass, connection, msg) File "/usr/src/homeassistant/homeassistant/components/zha/websocket_api.py", line 1047, in websocket_get_configuration zha_gateway = get_zha_gateway(hass) ^^^^^^^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/components/zha/core/helpers.py", line 459, in get_zha_gateway raise ValueError("No gateway object exists") ValueError: No gateway object exists 2024-01-14 21:30:45.641 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140501092979264] Error handling message: Unknown error (unknown_error) M from 192.168.x.x (Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36) Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/components/websocket_api/decorators.py", line 26, in _handle_async_response await func(hass, connection, msg) File "/usr/src/homeassistant/homeassistant/components/zha/websocket_api.py", line 1140, in websocket_get_network_settings backup = async_get_active_network_settings(hass) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/components/zha/api.py", line 43, in async_get_active_network_settings app = get_zha_gateway(hass).application_controller ^^^^^^^^^^^^^^^^^^^^^ File "/usr/src/homeassistant/homeassistant/components/zha/core/helpers.py", line 459, in get_zha_gateway raise ValueError("No gateway object exists") ValueError: No gateway object exists
whatever was changed in 2024.1.3 has made it even worse. what was once a week has happened about 4 times in 2 days
for what its worth, here is what it is saying mostly in the debug logs (other than the other ZHA log entries saying that DELIVERY_FAILED errors (understandably) (it's just occured again!)
2024-01-14 23:20:46.009 ERROR (MainThread) [zigpy.zcl] [0xA01A:1:0x0b04] Traceback (most recent call last): File "/usr/local/lib/python3.11/site-packages/zigpy/zcl/init.py", line 411, in reply return await self._endpoint.reply( ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/zigpy/endpoint.py", line 278, in reply return await self.device.reply( ^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/zigpy/device.py", line 483, in reply return await self.request( ^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/zigpy/device.py", line 317, in request await send_request File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 833, in request await self.send_packet( File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 868, in sendpacket status, = await self._ezsp.sendUnicast( ^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'NoneType' object has no attribute 'sendUnicast'
My network went on a restarting spree today, a lot of errors and timeouts. Devices dropped off but started rejoining when I lauched add devices. Apologies I cant pin point when things started to hit the fan but trying to fix it I had turn on debug a few times. Hopefully something jumps out home-assistant.log home-assistant_zha_2024-01-15T22-01-51.716Z.log home-assistant_zha_2024-01-15T21-27-11.054Z.log
There were no ZHA or library changes between 2024.1.2 and 2024.1.3 so I think the problem you're having is just randomly manifesting. The repeated restart issue will be fixed by https://github.com/home-assistant/core/pull/107963.
ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT
). This is only possible if you are having high CPU usage and/or some other integration/addon is slowing HA down enough to cause this.@harvindhillon I don't believe your issue is related to this one. Your log is littered with MAC_CHANNEL_ACCESS_FAILURE
(the radio refusing to transmit because it's too noisy). Similarly:
2024-01-15 21:53:08.918 WARNING (MainThread) [zigpy.application] Zigbee channel 11 utilization is 88.24%! 2024-01-15 21:53:08.918 WARNING (MainThread) [zigpy.application] If you are having problems joining new devices, are missing sensor updates, or have issues keeping devices joined, ensure your coordinator is away from interference sources such as USB 3.0 devices, SSDs, WiFi routers, etc.
Thanks, @puddly. To your hardware question:
I'm using the built-in Silicon Labs radio on an HA Yellow with a 4GB CM4 RPi driving it. It's in the normal Zigbee-only mode (not multiprotocol). My processor usage hovers around 5%-10%, so it's not like the system is over loaded (although I'm not sure how many things are bound by single core speed that those multi-core usage percentages may not reflect.)
I attached logs above and in the previous iteration of this ticket. I have reverted back to 2023.11.3 which was the last stable version of ZHA prior to this run of issues that started in 2023.12. I can give the latest brain a try again later this week if you need more logs. I did update the radio firmware recently, and haven't tested that against the latest releases yet.
Hmmm... as I said in my description, I am using Multi-PAN. So is my problem NOT a ZHA problem? I'm not using SkyConnect however, but Home Assistant Yellow, so https://github.com/home-assistant/addons/issues/3408 is not an exact match either.
As it happens - I do have a SkyConnect dongle. So I could go one way and set up a separate radio for Matter. As Zigbee is much more important to me right now, I would also be willing to disable Matter/Multi-PAN and just use ZHA if I can figure out how to do it without breaking anything.
so https://github.com/home-assistant/addons/issues/3408 is not an exact match either.
The firmware is identical for both so it's very likely the same issue.
if I can figure out how to do it without breaking anything.
There are documented steps here: https://yellow.home-assistant.io/guides/disable-multiprotocol/. It will migrate your network back to Zigbee-only.
Once that's done, plug in the SkyConnect, install the OpenThread Border Router addon: it'll flash your SkyConnect with Thread firmware. You can then push your preferred dataset to the same border router from the Thread configuration and replicate your multi-PAN setup with two stable radios.
Thanks @puddly, the energy scan warning is a weird one cause it is not always and I have my dongle on a 5m powered USB2 cable in the middle of my house, far from any radios. It fluctuates from as low as 15 to as high as 88. It is usually hovering around 76 though
"energy_scan": {
"11": 72.54901960784314,
That's my dongle, nothing around it. ZHA Channel 11 as wifi channel 1 is free
Apologise for the logs as I dont remember the timestamps but the integration was restarting multiple times on its own and was trying to capture that.
There are errors that I've not seen before
Logger: homeassistant
Source: /usr/src/homeassistant/homeassistant/runner.py:145
First occurred: 5:51:38 PM (77 occurrences)
Last logged: 6:30:23 PM
Error doing job: Task exception was never retrieved
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 1098, in request_callback_rsp
await self.request(request, timeout=timeout, **response_params)
File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 1052, in request
self._uart.send(frame)
^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'send'
so home-assistant/addons#3408 is not an exact match either.
The firmware is identical for both so it's very likely the same issue.
if I can figure out how to do it without breaking anything.
There are documented steps here: https://yellow.home-assistant.io/guides/disable-multiprotocol/. It will migrate your network back to Zigbee-only.
Once that's done, plug in the SkyConnect, install the OpenThread Border Router addon: it'll flash your SkyConnect with Thread firmware. You can then push your preferred dataset to the same border router from the Thread configuration and replicate your multi-PAN setup with two stable radios.
So I disabled Multi-PAN and my system is working now. Yay! Going to try setting up a separate radio for Thread next as suggested. Thanks!
Any solution for non-SkyConnect users in the works? ZHA is now reinitializing 5 minutes after reboot which takes about 10 minutes each time so it's basically unusable. It used to only have this issue once or twice a day.
is there a way of restarting HA as part of an automation? ie if detecting that the integration isn't available (or a device isn't) then restart HA completely? (I don't seem to be able to restart the integration manually even when i have access to the gui because it needs to be "up" for it to be able to restart... so i'm guessing it wouldnt work with restarting the integration itself.... although I dont mind trying). for a way of getting around this issue for now as I'm out of the house for a few days and my wife and family are at home, their life is going to be marred with frustration whilst i'm away when it inevitably craps out multiple times during the day and night and i'm not around to sort it.
thanks
is there a way of restarting HA as part of an automation? ie if detecting that the integration isn't available (or a device isn't) then restart HA completely? (I don't seem to be able to restart the integration manually even when i have access to the gui because it needs to be "up" for it to be able to restart... so i'm guessing it wouldnt work with restarting the integration itself.... although I dont mind trying). for a way of getting around this issue for now as I'm out of the house for a few days and my wife and family are at home, their life is going to be marred with frustration whilst i'm away when it inevitably craps out multiple times during the day and night and i'm not around to sort it.
thanks
I've set up an automation:
Trigger: one of my plugged in zigbee devices become unavailable Condition: check all my plugged in zigbee devices for unavailable state Action: HA reboot core
But I've since disabled it because ZHA keeps reinitializing every 5 minutes and it takes 10 to boot up again.
is there a way of restarting HA as part of an automation? ie if detecting that the integration isn't available (or a device isn't) then restart HA completely? (I don't seem to be able to restart the integration manually even when i have access to the gui because it needs to be "up" for it to be able to restart... so i'm guessing it wouldnt work with restarting the integration itself.... although I dont mind trying). for a way of getting around this issue for now as I'm out of the house for a few days and my wife and family are at home, their life is going to be marred with frustration whilst i'm away when it inevitably craps out multiple times during the day and night and i'm not around to sort it. thanks
I've set up an automation:
Trigger: one of my plugged in zigbee devices become unavailable Condition: check all my plugged in zigbee devices for unavailable state Action: HA reboot core
But I've since disabled it because ZHA keeps reinitializing every 5 minutes and it takes 10 to boot up again.
great. ok i've just created that. mine doesnt take that long (at all) to restart HA (more like a minute max). still a PITA but at least hopefully it shouldn't affect the family too much at home. thanks for the tips
There are documented steps here: https://yellow.home-assistant.io/guides/disable-multiprotocol/. It will migrate your network back to Zigbee-only.
Once that's done, plug in the SkyConnect, install the OpenThread Border Router addon: it'll flash your SkyConnect with Thread firmware. You can then push your preferred dataset to the same border router from the Thread configuration and replicate your multi-PAN setup with two stable radios.
Thank you very much - my Zigbee is working again without too much of a hassle.
Have also a HomeAssistant Yellow, out of the box. Nothing added. Had also to disable the Multiprocotol.
Hi, I'm using a SLB-06N coordinator (ezsp) over ethernet. When ZHA goes into its 'initializing' state, the dongle is still up and running and I can reach its admin console perfectly over HTTPS, so the issue is clearly on the ZHA side. A reboot of HA (without touching the dongle at all) fixes the issue. What logs can I add to help diagnosing this?
@ddeconin-gh Please enable ZHA debug logging, reload the integration, and leave debug logging on until ZHA enters the "initializing" state. After it recovers, disable debug logging, ZIP the log, and upload it.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
I've been experiencing this issue consistently of late.
I'll see if I can get debug logs.
It has been a constant issue for my setup as well. Recently everything got worse and had a non-responsive Zigbee network. I bought a second Skyconnect, disabled multi protocol, and I believe that solved the issues I had for both things, haven’t had a ZHA stuck in initializing since!
I just had it fail again.
I'm not sure the best way to pull the logs off the instance.
At any rate, here are some of the error logs that I'm seeing in the logs
2024-06-21 07:27:38.632 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Connection to the radio was lost: <NcpResetCode.ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT: 81>
DEBUG (MainThread) [homeassistant.components.zha.core.discovery] 'sensor' platform -> 'DeviceCounterSensor' using counter groups[counters] counter group[ezsp_counters] counter[ASH_OVERRUN_ERROR]
2024-06-21 08:26:39.923 DEBUG (MainThread) [homeassistant.components.zha.core.discovery] 'sensor' platform -> 'DeviceCounterSensor' using counter groups[counters] counter group[ezsp_counters] counter[ASH_FRAMING_ERROR]
2024-06-21 08:26:39.923 DEBUG (MainThread) [homeassistant.components.zha.core.discovery] 'sensor' platform -> 'DeviceCounterSensor' using counter groups[counters] counter group[ezsp_counters] counter[ASH_OVERFLOW_ERROR]
# command to get different errors
grep -i error zha.log|grep -v "Connection to the radio was lost" |grep -v "counter\[ASH_OVERRUN_ERROR\]" |grep -v "ASH_FRAMING_ERROR"|grep -v "ASH_OVERFLOW_ERROR"
Some logs with initialization in it
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0001]: Performing cluster handler specific initialization: ['battery_voltage', 'battery_percentage_remaining']
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0001]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0000]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0000]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x1000]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x1000]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: initializing cached cluster handler attributes: ['current_file_version']
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0008]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0008]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0006]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0006]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: initializing cached cluster handler attributes: ['current_file_version']
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0005]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0005]: finished cluster handler initialization
2024-06-21 08:26:39.936 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0300]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0300]: finished cluster handler initialization
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0003]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0001]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0000]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x1000]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0008]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0006]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0019]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0005]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0xBCA0:1:0x0300]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0xBCA0](TS004F): completed initialization
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0x4785](TS0044): started initialization
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:ZDO](TS0044): 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0000]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0000]: finished cluster handler initialization
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0001]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0001]: initializing uncached cluster handler attributes: ['battery_voltage', 'battery_percentage_remaining'] - from cache[True]
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0001]: Performing cluster handler specific initialization: ['battery_voltage', 'battery_percentage_remaining']
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0001]: finished cluster handler initialization
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: initializing cached cluster handler attributes: ['current_file_version']
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: finished cluster handler initialization
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: initializing cached cluster handler attributes: ['current_file_version']
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: finished cluster handler initialization
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0000]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0001]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:1:0x0019]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:2:0x0001]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.937 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:2:0x0001]: initializing uncached cluster handler attributes: ['battery_voltage', 'battery_percentage_remaining'] - from cache[True]
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:2:0x0001]: Performing cluster handler specific initialization: ['battery_voltage', 'battery_percentage_remaining']
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:2:0x0001]: finished cluster handler initialization
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:2:0x0001]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:3:0x0001]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:3:0x0001]: initializing uncached cluster handler attributes: ['battery_voltage', 'battery_percentage_remaining'] - from cache[True]
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:3:0x0001]: Performing cluster handler specific initialization: ['battery_voltage', 'battery_percentage_remaining']
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:3:0x0001]: finished cluster handler initialization
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:3:0x0001]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:4:0x0001]: initializing cluster handler: from_cache: True
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:4:0x0001]: initializing uncached cluster handler attributes: ['battery_voltage', 'battery_percentage_remaining'] - from cache[True]
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:4:0x0001]: Performing cluster handler specific initialization: ['battery_voltage', 'battery_percentage_remaining']
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:4:0x0001]: finished cluster handler initialization
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x4785:4:0x0001]: 'async_initialize' stage succeeded
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0x4785](TS0044): completed initialization
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0x0000](EZSP): started initialization
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.cluster_handlers] [0x0000:ZDO](EZSP): 'async_initialize' stage succeeded
2024-06-21 08:26:39.938 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0x0000](EZSP): completed initialization
Connection to the radio was lost: <NcpResetCode.ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT: 81>
means that something prevented ZHA from responding to your radio for about six seconds and the radio cut off the connection. This is usually a custom integration that blocks the Core event loop.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
The problem
As per previous issue (https://github.com/home-assistant/core/issues/105445) I am experiencing my ZHA randomly becoming completely unresponsive and seeing that the integration is sitting "initialising"
What version of Home Assistant Core has the issue?
core-2024.1.2
What was the last working version of Home Assistant Core?
core-2024.1.1
What type of installation are you running?
Home Assistant Container
Integration causing the issue
ZHA
Link to integration documentation on our website
No response
Diagnostics information
config_entry-zha-5fb366dc2478313fb3cb2b29c52254af.json.txt
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
No response