Closed freemann closed 6 months ago
Hey there @dmulcahey, @adminiuga, @puddly, @thejulianjes, mind taking a look at this issue as it has been labeled with an integration (zha
) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
zha documentation zha source (message by IssueLinks)
The problem After some hours zha stops working and fails to initialise. After restarting HA it works again for some hours.
What version of Home Assistant Core has the issue? Core 2024.1.3 Supervisor 2023.12.1 Operating System 11.4 Frontend 20240104.0
What was the last working version of Home Assistant Core? core-2023.12.1
What type of installation are you running? HA OS on Pi4
Integration causing the issue ZHA, using HA Sky Connect Stick
Debug logs; config_entry-zha-3730e59617a2f4a9c117811e431d639a.json.txt home-assistant_zha_2024-01-18T09-11-29.985Z.log
This just happened again and the log showed the below;
2024-01-18 13:51:14.482 ERROR (MainThread) [bellows.uart] Lost serial connection: ConnectionResetError('Remote server closed connection')
And also
2024-01-18 13:51:14.485 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
Same issue for me. Recently I upgraded from HA Core 2024.1.2 to 2024.1.3. Since then every few hours the Zigbee integration fails. Reloading doesn't help. I have to restart HA Core or reboot the VM (HA OS on a Proxmox virtual machine, x86_64). I downgraded to HA Core 2024.1.2 and the issue is gone.
@jaccolo There was not a single ZHA code or dependency change between Home Assistant Core 2024.1.2 and 2024.1.3 so that is very surprising. If you can, please generate debug logs with both versions so that they can be compared.
@jaccolo There was not a single ZHA code or dependency change between Home Assistant Core 2024.1.2 and 2024.1.3 so that is very surprising. If you can, please generate debug logs with both versions so that they can be compared.
I will do that this weekend (it's not very convenient when HA controlled lights aren't working when it's dark, and when all lights in the home are HA controlled :-) )
@jaccolo I've just downgraded to 2024.1.2 and run debug logs,- see below for both versions of core.
@fredd589 You're using the multiprotocol addon. When you downgrade to 2024.1.2, are you also downgrading the Multiprotocol addon to a lower version along with Core?
ZHA crashed again tonight, with Core 2024.1.2. So downgrading does not solve it.
Downgrading to 2023.12.4 did help here (virtualenv + zha + zb-gw03 (EFR32MG21 with ncp-uart firmware).
I downgraded to 2023.12.3 (because I now that 2023.12.4 also had this issue, when thinking back). We were on vacation from 28-12-2023 and HA was on that moment on the most recent version. On vacation I noticed that ZHA was crashed and the Light Ghost didn't work because of the crash.
Now on 2023.12.3 ZHA is complety ** it's not starting at all. Maybe that's because of the SkyConnect updates which are not backwards compatible??
Attached 2 logfiles for; Core 2023.12.3 Supervisor 2023.12.1 Operating System 11.4 Frontend 20231208.2
home-assistant_zha_2024-01-19T10-20-32.381Z.log home-assistant_zha_2024-01-19T10-34-09.315Z.log
2023.12.4 also not working... From boot the ZHA integration fails to setup; "Failed setup, will retry"
Edit: Upgraded to 2024.1.1, ZHA is up and running... for how long...
2024.1.1 crashed. At 21:24:55 it worked, now it's crashed. Tried to restart... it's not working..
Logfile won't upload... will try tomorrow
updating to 2024.1.2...
@fredd589 You're using the multiprotocol addon. When you downgrade to 2024.1.2, are you also downgrading the Multiprotocol addon to a lower version along with Core?
I did not downgrade the multi protocol, I have also not had a crash since downgrading core.
Try todays release please
Upgrading to 2024.1.4 Thanks for your time and effort!
2024.1.4 is still going strong
Update 2024.1.4 did not solve the problem, after a few hours it loses all devices again unfortunately.
Hello, same issues here. After update, the same error again :( need to restart HA to work again....for few hours...
No crash here after ~7 hours
error_log-5.txt It failed...
[edit 15:30] downgrade to 2024.1.2
Only a compleet reboot will fix this temporally, until it fails again.
I have experience this all week.
But then today, I tried to disable the ZHA integration and Enable it again a few minutes later. Now ZHA works and have been stable for the passed 5 hours. However now my Ikea Integration is failing to start... Not sure if there is an underlying Home Assistant problem which is causing the integrations not to start correctly.
I just disabled the ZHA integration waited a few minutes and enabled it. The strange thing was that first it gave a failed initialized and the second time it started ok. Lets see whats going to happen
2024.1.4 feels promising. No crash since ~8 hours.
Here 2024.1.4 crashes after 7,5 hours. But the crashes seems to occur randomly.... Sometimes a few hours, then a day.
@freemann It looks like you are using the Multiprotocol addon. This isn't an issue with ZHA: https://github.com/home-assistant/addons/issues/3408
Updated this morning to latest Core (2024.1.4). Worked fine for about 10 hours. Now crashed again. Added debug logging of ZHA to this message (gzipped, because it is quite large). home-assistant_zha_2024-01-20T18-40-23.118Z.log.gz
System:
➜ ~ dmesg | grep -i 'usb 2-1'
[ 1.197162] usb 2-1: new full-speed USB device number 2 using xhci_hcd
[ 1.332412] usb 2-1: New USB device found, idVendor=10c4, idProduct=ea60, bcdDevice= 1.00
[ 1.333495] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1.334422] usb 2-1: Product: SkyConnect v1.0
[ 1.335126] usb 2-1: Manufacturer: Nabu Casa
[ 1.335818] usb 2-1: SerialNumber: 581e2249c596ed11820ec698a7669f5d
[ 2.624959] usb 2-1: cp210x converter now attached to ttyUSB0
➜ ~ ha core info
arch: amd64
audio_input: null
audio_output: null
backups_exclude_database: false
boot: true
image: ghcr.io/home-assistant/qemux86-64-homeassistant
ip_address: 172.30.32.1
machine: qemux86-64
port: 8123
ssl: false
update_available: false
version: 2024.1.4
version_latest: 2024.1.4
watchdog: true
What I think might be relevant is this (from the logging):
2024-01-20 19:06:57.276 DEBUG (MainThread) [bellows.uart] Connection lost: ConnectionResetError('Remote server closed connection')
2024-01-20 19:06:57.277 ERROR (MainThread) [bellows.uart] Lost serial connection: ConnectionResetError('Remote server closed connection')
2024-01-20 19:06:57.277 DEBUG (MainThread) [bellows.ezsp] socket://core-silabs-multiprotocol:9999 connection lost unexpectedly: Remote server closed connection
2024-01-20 19:06:57.277 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2024-01-20 19:06:57.278 DEBUG (MainThread) [bellows.zigbee.application] Received _reset_controller_application frame with ("Serial connection loss: ConnectionResetError('Remote server closed connection')",)
2024-01-20 19:06:57.278 DEBUG (MainThread) [zigpy.application] Connection to the radio has been lost: "Serial connection loss: ConnectionResetError('Remote server closed connection')"
2024-01-20 19:06:57.278 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Connection to the radio was lost: "Serial connection loss: ConnectionResetError('Remote server closed connection')"
2024-01-20 19:06:57.278 DEBUG (MainThread) [bellows.uart] Connection lost: None
2024-01-20 19:06:57.278 DEBUG (MainThread) [bellows.uart] Closed serial connection
2024-01-20 19:06:57.279 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Shutting down ZHA ControllerApplication
@jaccolo As mentioned above, it looks like you are using the Multiprotocol addon. This isn't an issue with ZHA, nor will this be fixed by a Core update: https://github.com/home-assistant/addons/issues/3408
@jaccolo As mentioned above, it looks like you are using the Multiprotocol addon. This isn't an issue with ZHA, nor will this be fixed by a Core update: home-assistant/addons#3408
@puddly The logging seems to tell me that ZHA loses the connection with the Skyconnect USB device (UART = serial). When the connection is lost, is seems logical that Multiprotocol isn't working too, because that uses the same device. So I don't think it's multiprotocol related, but USB/serial/device related. Is everyone experiencing this issue, using the Skyconnect USB device?
The logging seems to tell me that ZHA loses the connection with the Skyconnect USB device
ZHA does not communicate with the SkyConnect at all when you're using the multiprotocol addon. The multiprotocol addon communicates with the SkyConnect and then exposes a TCP server for ZHA to communicate with. The "serial" protocol is used with the raw TCP socket so the "UART" disconnect is the multiprotocol addon's TCP server shutting down and resetting.
@puddly Thanks for explaining. Unfortunately the logging of "Silicon Labs Multiprotocol" add-on is gone after restarting Core. I will wait for the next crash and check that logging. Perhaps it isn't related at all to ZHA, and is the Multiprotocol addon causing issues.
I'm going now to disable the Multiprotocol, lets see!
I'm going now to disable the Multiprotocol, lets see!
Me to, lets see!
done!
Success! Options successfully saved.
For others, here's a step by step howto disable MultiProtocol; https://skyconnect.home-assistant.io/procedures/disable-multiprotocol/
Same problem for me and yes, I'm also using the SkyConnect. ZHA crashes every few hours with a zigpy error.
Same problem for me and yes, I'm also using the SkyConnect. ZHA crashes every few hours with a zigpy error.
Are you using multi protocol?
Yes I did. Just deactivated it after I read @freemann s post. As I don't use any thread devices at the moment it's no problem to do so. For the future that won't be a solution, though. I also can't say if this fixes the problem yet, as I just deactivated it like 20 minutes ago, after posting.
Is no one reading the giant warning when enabling multiprotocol? You legit have to agree to to potentially break things to even enable it…
No, sorry. Even dont know how and when I turned it on...
OK, it could break thing... But you wont expect it hangs HA and kills ZHA by enabling this. But I dont need so disabled it.
Thanks!
Is no one reading the giant warning when enabling multiprotocol? You legit have to agree to to potentially break things to even enable it…
But it worked fine, just since one week its causes issues. If you never test, you will never know :P
Same as the others. If there was a giant warning, I don't remember it. I own the SkyConnect since it's original release, tested it and it worked until a few days ago. As the log mentions zha and zigpy errors, I wouldn't have thought about the possibility of multiprotocol being the real error.
2024.1.4 crashed here too after half a day. I'm back on 2023.12.4. I don't use a multiprotocol setup but my zigbee modem is attached using a network connection.
I'm having the same issue as well. Restarting HA fixes it, but it inevitably starts dying within a few hours.
Does anybody have a known stable version to revert to? Or is this seemingly a SkyConnect internal issue?
@tdalbo92 Please check if you are using multi protocol and if so, disable it; https://skyconnect.home-assistant.io/procedures/disable-multiprotocol/
After disabling MultiProtocol and updating to 2024.1.5 everything looks stable again.
I can attempt this for debugging, but this disables Thread, correct?
I can attempt this for debugging, but this disables Thread, correct?
Yes, you should only disable of you don't use Thread. 2024.1.5 should have some fixes and there was a add-on update for OpenThread Border Router. Also see; https://github.com/home-assistant/addons/issues/3408
I'll migrate my Thread devices to my Amazon Echo for the time being, and follow the progress in that thread for updates.
Thank you!
At my side everything is also stable now, thanks @dmulcahey!
Disabled multiprotocol, and running stable for 20 hours now
The problem
Randomly ZHA crashes. After the crash it fails to initialize and only a full reboot fixes the issue for X hours. Here HA itself doesn't crash and keeps running normal I enabled debug log but the logfile is 87MB and can't add it.
Ha version: Core 2024.1.3 Supervisor 2023.12.1 Operating System 11.4
What version of Home Assistant Core has the issue?
2024.1.3
What was the last working version of Home Assistant Core?
2023.1.2
What type of installation are you running?
Home Assistant Supervised
Integration causing the issue
ZHA
Link to integration documentation on our website
No response
Diagnostics information
config_entry-zha-d74ff06558e3c8e562710d33dd0e67f2.json
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response