home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
71.11k stars 29.79k forks source link

ZHA doesn't work - Zigbee devices do not response #83876

Closed patrickschoenherr closed 1 year ago

patrickschoenherr commented 1 year ago

The problem

A few days ago some zigbee devices started to stop responding. Mainly I have Aqara Weather and Door Sensors installed. The problems started to begin after updating to Version 2022.12.0.

Meanwhile I re-installed the whole ZHA integration. But the problems remained. The whole system was working fine for nearly half a year before.

What version of Home Assistant Core has the issue?

2022.12.3

What was the last working version of Home Assistant Core?

2022.11.2

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ZHA

Link to integration documentation on our website

https://www.home-assistant.io/integrations/zha/

Diagnostics information

zha-b3f090497b80dab05d0c6d9a82d5ac2c-Zigbee Coordinator-d61b3b60560acc1d66922f1b3176ec90.json.txt config_entry-zha-b3f090497b80dab05d0c6d9a82d5ac2c.json (1).txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

Logger: zigpy_znp.zigbee.application
Source: /usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py:652
First occurred: 21:45:00 (22 occurrences)
Last logged: 22:41:32

Watchdog check failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 998, in request
    response = await response_future
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py", line 650, in _watchdog_loop
    await self._znp.request(c.SYS.Ping.Req())
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 994, in request
    async with async_timeout.timeout(
  File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
    self._do_exit(exc_type)
  File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

Logger: zigpy.application
Source: runner.py:120
First occurred: 22:06:17 (28 occurrences)
Last logged: 22:40:17

Unknown device AddrModeAddress(addr_mode=<AddrMode.NWK: 2>, address=0x853A)
Unknown device AddrModeAddress(addr_mode=<AddrMode.NWK: 2>, address=0xD8A6)

Logger: homeassistant
Source: /usr/src/homeassistant/homeassistant/runner.py:97
First occurred: 22:27:37 (1 occurrences)
Last logged: 22:27:37

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 786, in _discover_unknown_device
    return await zigpy.zdo.broadcast(
  File "/usr/local/lib/python3.10/site-packages/zigpy/device.py", line 515, in broadcast
    return await app.broadcast(
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 760, in broadcast
    await self.send_packet(
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py", line 1026, in send_packet
    response = await self._send_request_raw(
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py", line 901, in _send_request_raw
    response = await self._znp.request(
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 991, in request
    self._uart.send(frame)
AttributeError: 'NoneType' object has no attribute 'send'

Logger: zigpy_znp.zigbee.application
Source: /usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py:739
First occurred: 22:22:19 (3 occurrences)
Last logged: 22:44:33

Failed to reconnect
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 1037, in request_callback_rsp
    await self.request(request, timeout=timeout, **response_params)
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 998, in request
    response = await response_future
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py", line 734, in _reconnect
    await self.initialize()
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 111, in initialize
    await self.permit(0)
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py", line 302, in permit
    response = await self._znp.request_callback_rsp(
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 1036, in request_callback_rsp
    async with async_timeout.timeout(timeout):
  File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
    self._do_exit(exc_type)
  File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

Additional information

No response

home-assistant[bot] commented 1 year ago

Hey there @dmulcahey, @adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!

Code owner commands Code owners of `zha` can trigger bot actions by commenting: - `@home-assistant close` Closes the issue. - `@home-assistant rename Awesome new title` Change the title of the issue. - `@home-assistant reopen` Reopen the issue. - `@home-assistant unassign zha` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


zha documentation zha source (message by IssueLinks)

markdchurchill commented 1 year ago

I'm getting a lot of this since the update. Most of my zigbee devices have dropped off the network.

All phillips hue and osram bulbs are unavailable.

Ikea Tradfri bulbs seem happy.

The garage door sensor seems ok (uses zhaquirks.xiaomi.aqara.magnet_aq2.MagnetAQ2 if relevant) and a Sonoff motion sensor is ok.

A lot of this in the logs:

2022-12-13T08:33:41.442486963Z 2022-12-13 18:33:41.441 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x2598:3:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Timeout waiting for ACK'), DeliveryError('Timeout waiting for ACK'), DeliveryError('Timeout waiting for ACK'), DeliveryError('TXStatus.ADDRESS_NOT_FOUND')]

also

2022-12-13T09:38:34.462342022Z 2022-12-13 19:38:34.461 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x8172:11:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Timeout waiting for ACK'), DeliveryError('Timeout waiting for ACK'), DeliveryError('Timeout waiting for ACK'), DeliveryError('Timeout waiting for ACK')]

I don't know if this is relevant, but it looked sus, and came up early on: 2022-12-13T07:04:49.740517523Z 2022-12-13 17:04:49.739 WARNING (MainThread) [zigpy_xbee.zigbee.application] Announced IEEE 00:17:88:01:08:63:94:4d is different from originator ff:ff:ff:ff:ff:ff:ff:ff

I have zigbee logs set to debug from dealing with some teething problems a few years back, so can probably run some tests if needed.

devployment commented 1 year ago

Same here. Same issue with most devices "became unavailable" in the logbook. Same error async_initialize: all attempts have failed in the logs.

In my case, coordinator is a ConBee II stick.

TripelJ commented 1 year ago

Yeah same here. I setup new with sky connect stick. Worked initially fine. Then it stopped with the same error messages as above. https://community.home-assistant.io/t/zha-new-setup-on-skyconnect-worked-shortly/502172/3

Grey-Lancaster commented 1 year ago

was working fine Now Error setting up entry HubZ Smart Home Controller, s/n: 61201F3A - Silicon Labs for zha 11:35:25 AM – (ERROR) components/zha/core/gateway.py - message first occurred at 11:24:22 AM and shows up 2 times Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 1 of 3) 11:35:25 AM – (WARNING) Zigbee Home Automation - message first occurred at 11:24:15 AM and shows up 6 times

EDIT: VM in synology. Had to remove and re-add the usb stick and it is working again

melkkoe commented 1 year ago

Same here :(. Installed backup to 2022-12.1 but without success. Remove and re-add stick didnt work either

Logger: homeassistant.config_entries Source: components/zha/core/gateway.py:172 First occurred: 09:50:23 (4 occurrences) Last logged: 17:50:41

Error setting up entry TI CC2531 USB CDC, s/n: __0X00124B0014D99B6F - Texas Instruments for zha Traceback (most recent call last): File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 664, in _skip_bootloader result = await responses.get() File "/usr/local/lib/python3.10/asyncio/queues.py", line 159, in get await getter asyncio.exceptions.CancelledError

Logger: homeassistant.components.zha.core.gateway Source: components/zha/core/gateway.py:172 Integration: Zigbee Home Automation (documentation, issues) First occurred: 09:49:35 (12 occurrences) Last logged: 17:50:41

Couldn't start ZNP = Texas Instruments Z-Stack ZNP protocol: CC253x, CC26x2, CC13x2 coordinator (attempt 1 of 3) Couldn't start ZNP = Texas Instruments Z-Stack ZNP protocol: CC253x, CC26x2, CC13x2 coordinator (attempt 2 of 3) Couldn't start ZNP = Texas Instruments Z-Stack ZNP protocol: CC253x, CC26x2, CC13x2 coordinator (attempt 3 of 3) Traceback (most recent call last): File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 664, in _skip_bootloader result = await responses.get() File "/usr/local/lib/python3.10/asyncio/queues.py", line 159, in get await getter asyncio.exceptions.CancelledError

Grey-Lancaster commented 1 year ago

Same here :(. Installed backup to 2022-12.1 but without success. Remove and re-add stick didnt work either Is it in a Synology VM? If so try removing it again. save it then go back and add it. (Perhaps with HA shut down)

sqeezee commented 1 year ago

Same issue, running on VM, fresh instalation (new sky connect stick) - for me helped to disconect stick for little longer (1min or so..) Its working for an hours without issue...

puddly commented 1 year ago

Nothing changed between 2022.11.x and 2022.12.x related to network operation or stick communication. The only thing that happens is that you restart Home Assistant. The Conbee specifically has issues with this, as do some VMs.

If you're using a CC2531, it is also known to "lock up" and must be physically unplugged and plugged back in.

sqeezee commented 1 year ago

Nothing changed between 2022.11.x and 2022.12.x related to network operation or stick communication. The only thing that happens is that you restart Home Assistant. The Conbee specifically has issues with this, as do some VMs.

If you're using a CC2531, it is also known to "lock up" and must be physically unplugged and plugged back in.

I'm using SkyConnect...

TripelJ commented 1 year ago

Same issue, running on VM, fresh instalation (new sky connect stick) - for me helped to disconect stick for little longer (1min or so..) Its working for an hours without issue...

Good to hear it’s working for you for a while. I don’t that have unfortunately when I unplug my skyconnect from the rpi4. I did get mine to work for few hours one time as well. But don’t know what triggered it. Shutdown / restarts, moving to other usb port etc. but now I have no luck. The devices can be seen, the network map is shown etc. but I can’t control any device and will get the mentioned error: async_initialize: all attempts have failed. frustrated. Tomorrow I’ll try and add an IKEA bulb to see if it works. As I can understand most say it does not work with Philips hue at all, but with other devices like Aqara or IKEA.

melkkoe commented 1 year ago

Nothing changed between 2022.11.x and 2022.12.x related to network operation or stick communication. The only thing that happens is that you restart Home Assistant. The Conbee specifically has issues with this, as do some VMs.

If you're using a CC2531, it is also known to "lock up" and must be physically unplugged and plugged back in.

Thanks for your reply.

I use a RPI 4B with a cc2531 usb stick. I already tried to physically unplug/plug in, but maybe i had to wait an extra minute or so before plugging back in , as i read here in the comments. (Will try that )

robotastic-io commented 1 year ago

I can confirm this. Also get a lot of

2022-12-17 17:56:55.165 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x2ECB:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Request failed after 5 attempts: <Status.MAC_CHANNEL_ACCESS_FAILURE: 225>'), DeliveryError('Request failed after 5 attempts: <Status.MAC_CHANNEL_ACCESS_FAILURE: 225>'), DeliveryError('Request failed after 5 attempts: <Status.MAC_CHANNEL_ACCESS_FAILURE: 225>'), DeliveryError('Request failed after 5 attempts: <Status.MAC_CHANNEL_ACCESS_FAILURE: 225>')]

Additional Info: HASIO: 2022.12.6 x86-64

Zigbee Stick: Texas Instruments CC1352/CC2652

Restarting works for a bit, but then the connection fails again. I also just moved hardware from a pi3 to a x86 thin client. Was working fine for the first two days.

Adminiuga commented 1 year ago

What Status.MAC_CHANNEL_ACCESS_FAILURE means? Does Google say anything about it?

roobre commented 1 year ago

Similar issue here, using the Sonoff Zigbee Dongle 3 Plus, which I've been using without problems for almost a year now config_entry-zha-4a97dded6412306a97f1246b786a0649.json.txt.

In my case this stopped working completely when I upgraded to 2022.12.6. Previously, on 2022.12.0, most devices worked fine. Downgrading to 2022.12.0 or 2022.11.6 did not work either.

Some interesting observations: Receiving works fine, most of my sensors (thermometers and such) work fine and provide updated results. Some lights also accurately report their status, which changes if I physically press the switch. Those lights, however, I cannot control from hass, the operation just times out. Finally, there are some zigbee swtiches for which hass doesn't even read their state. I cannot find any distance pattern in this behavior: Devices pretty close to the coordinator may not work at all, while other further away may work fully.

2022-12-17 16:58:44.211 ERROR (SyncWorker_5) [homeassistant.components.dhcp] Cannot watch for dhcp packets: [Errno 1] Operation not permitted
2022-12-17 16:59:04.809 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x030C:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:04.837 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x030C:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:05.926 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x030C:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:08.557 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x15C9:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:08.776 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xD57E:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:08.856 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x70DF:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:08.862 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xF26B:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:08.924 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xB111:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.060 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xF26B:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.145 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x0EB0:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.360 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x70DF:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.470 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x0EB0:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.540 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x444B:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.615 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xD57E:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.642 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xD57E:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.691 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x444B:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.802 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xF26B:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.832 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x190B:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:09.903 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x7168:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:10.014 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x7168:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:10.047 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xFE2C:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>')]
2022-12-17 16:59:10.084 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xFE2C:1:0x0008]: async_initialize: all attempts have failed: [DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>')]
2022-12-17 16:59:10.171 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x444B:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:10.249 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x6626:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Request failed after 5 attempts: <Status.MAC_NO_ACK: 233>'), DeliveryError('Request failed after 5 attempts: <Status.MAC_NO_ACK: 233>'), DeliveryError('Request failed after 5 attempts: <Status.MAC_NO_ACK: 233>'), DeliveryError('Request failed after 5 attempts: <Status.MAC_NO_ACK: 233>')]
2022-12-17 16:59:10.350 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x70DF:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:11.126 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x7168:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:11.446 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x0EB0:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:13.123 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x8C57:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:13.563 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xD5DA:1:0x0702]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:13.713 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xD5DA:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:14.128 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xD5DA:1:0x0b04]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 16:59:15.378 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x549B:1:0x0006]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 17:00:41.013 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x3308:1:0x0402]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 17:00:41.522 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x3308:1:0x0405]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]
2022-12-17 17:00:41.826 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x3308:1:0x040d]: async_initialize: all attempts have failed: [TimeoutError(), TimeoutError(), TimeoutError(), TimeoutError()]

EDIT: This may probably seem weird, but after a couple hours, my Zigbee devices started to become responsive again. Power cycling some of them seemed to also help.

I don't have a clue why this happened or if this is related to this issue or not. I certainly hadn't seen this situation before.

dumpfheimer commented 1 year ago

I had this too while testing sky connect but switched back to ZNP pretty soon because of this. ZNP is working perfectly fine on 2022.12.

AndroGenom commented 1 year ago

The same her, since update from 2022.11.4 to 2022.12.6, then 2022.12.7 - ZHA integration does not really work. Automation (I have only related to thermostats automation at this stage), which worked before is no longer working.

Executed: December 18, 2022 at 8:34:10 PM
Result:
params:
  domain: climate
  service: set_temperature
  service_data:
    hvac_mode: heat
    temperature: 19.5
    entity_id:
      - climate.lumi_lumi_airrtc_agl001_thermostat
  target:
    entity_id:
      - climate.lumi_lumi_airrtc_agl001_thermostat
running_script: false
limit: 10

Results in Errors in the log:

Logger: homeassistant.core
Source: components/zha/core/channels/hvac.py:301
First occurred: 12:05:29 AM (17 occurrences)
Last logged: 8:34:41 PM

Error executing service: <ServiceCall climate.set_temperature (c:01GMJN8YTT98RBC7JFKBD75X43): hvac_mode=heat, temperature=15.0, area_id=['living_room'], device_id=['*<here the device number>*']>
Error executing service: <ServiceCall climate.set_temperature (c:01GMJNGVZ9B8WEXQNTVQAR8PB8): hvac_mode=heat, temperature=15.0, area_id=['living_room'], device_id=['*<here the device number>*']>
Error executing service: <ServiceCall climate.set_temperature (c:01GMKA9DYBSM2Q0ET8TGE7T4SA): hvac_mode=heat, temperature=20.0, entity_id=['climate.lumi_lumi_airrtc_agl001_thermostat']>
Error executing service: <ServiceCall climate.set_temperature (c:01GMKAQAQPKWQF8MSJV1MPTHAY): hvac_mode=heat, temperature=20.0, entity_id=['climate.lumi_lumi_airrtc_agl001_thermostat']>
Error executing service: <ServiceCall climate.set_temperature (c:01GMKBNF2WKVBBHNECFEP26PYX): hvac_mode=heat, temperature=19.5, entity_id=['climate.lumi_lumi_airrtc_agl001_thermostat']>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 1763, in catch_exceptions
    await coro_or_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1782, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 943, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/climate/__init__.py", line 613, in async_service_temperature_set
    await entity.async_set_temperature(**kwargs)
  File "/usr/src/homeassistant/homeassistant/components/zha/climate.py", line 463, in async_set_temperature
    success = await thrm.async_set_heating_setpoint(
  File "/usr/src/homeassistant/homeassistant/components/zha/core/channels/hvac.py", line 266, in async_set_heating_setpoint
    if not await self.write_attributes(data):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/channels/hvac.py", line 301, in write_attributes
    res = await self.cluster.write_attributes(data, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 567, in write_attributes
    return await self.write_attributes_raw(attrs, manufacturer)
  File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 573, in write_attributes_raw
    result = await self._write_attributes(attrs, manufacturer=manufacturer)
  File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 324, in request
    return await self._endpoint.request(
  File "/usr/local/lib/python3.10/site-packages/zigpy/endpoint.py", line 237, in request
    return await self.device.request(
  File "/usr/local/lib/python3.10/site-packages/zigpy/device.py", line 312, in request
    return await asyncio.wait_for(req.result, timeout)
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

Script ot automation - does not matter. There is no possibility to put thermostats to state "off" - system reports status off, but thermostat does not change its state.

and another error appeared in the log:

Logger: zigpy.application
Source: /usr/local/lib/python3.10/site-packages/zigpy/application.py:134
First occurred: 9:42:33 PM (1 occurrences)
Last logged: 9:42:33 PM

Couldn't start application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 132, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 77, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

Automation where power plug is on and off - still works.

Additional info:

MattWestb commented 1 year ago

Servus @dumpfheimer. Then switching to and from one EZSP coordinator ZHA is restoring the network setting but EZSP adapters is using hashed link key and other adapter cant using it and the other way around EZSP is not using the saved TC-Link keys from "other" adapters. Best is then you is going to or from EZSP adapter and "others" is not migrating the network only just form one new one or you is getting problems.

If changing EZSP to one other EZSP all the network setting is being OK but device binding and reporting setting is pointed to the old adapters long address (IEEE) and may making end device not reporting OK if not re configuring them or burning the new adapter with the old IEEE (only possible doing ones).

dumpfheimer commented 1 year ago

@MattWestb thanks for sharing! Very disappointed to hear that.

I have a few questions now.

  1. Does that mean the migration between other vendors and EZSP is essentially defective?
  2. While migrating ZHA asked me if I want to change the IEEE of the sky connect. Was that the on time IEEE burn you are speaking of?
  3. Is there any way around that issue? My network is at this point too large for me to be willing to re-pair all of the devices

Appreciate your help

AndroGenom commented 1 year ago

Servus @MattWestb,

Thanks for some clarification. I guess I am not completely clear about required steps to get issue being resolved. Could you please add few lines explaining what exactly should to be done (at least to try something to get rid of this issue)?

Appreciate your help Ein großes Dankeschön für Deine Hilfe

MattWestb commented 1 year ago

@AndroGenom I dont knowing what is wrong but its looks like bellows cant starting but i dont knowing way and then ZHA cant starting OK and your automation is failing.

@dumpfheimer Silabs have one great implantation with "hashed TC-Link keys" and they is only saving the hash in the NVM and no keys for devices then its calculated from the device IEEE. The back side is other (TI and deCONZ) cant do the same thing. If merged one EZSP to one no EZSP you is getting the network key and counters for it but is getting problem with routing and other things then the TC-link keys in all devices is wrong (need repairing for getting it updated in the devices) and then going from other to EZSP you is getting the TC-Link keys (standard max 12 but its depending of the firmware i cooking with 127) and can working OK but its not grantee for that.

Yes if you saying "YESSS" is the IEEE changing (burned in protected flash) from the factory one and can only being done ones. If you like resetting it to chips default you need one real J-Link probe and soldering connection on the PCB (i have doing it on my pre series Sky Connect). If you need getting it done i can helping you (Wine 11).

I was migrating 29 device in / from my EZSP / Thread RCP test networks and the best way is putting up one new (Z)HA and resetting nearby routers and adding them in the new ZHA (no importing the network setting must forming one new one !!!). But if you have burning the IEEE in sky Connect its little tricky not getting network conflicts (2 devices with the same IEEE "in the air".

Best practice i adding all routers as back bone before adding end devices but then covering it can being good letting some in the old system so its working in the mean time.

Its one tricky situation and best is making one small test network and see its working OK before migrating all devices and getting problems later (My production system with around 50 devices is still running on original network setting only forced changing channel one time).

Also with EZSP is good blocking the coordinator having end device as parent for not getting problem with some sleepers (IT is having the same problem or worse but cant forcing it).

dumpfheimer commented 1 year ago

Thanks again, @MattWestb So, concerning the IEEE burning. This is the message one is confronted with when migrating a coordinator to EZSP:

Your backup has a different IEEE address than your radio. For your network to function properly, the IEEE address of your radio should also be changed. This is a permanent operation.

Am I the only one who feels like this message should put a stronger emphasis on this being a ONE TIME PER DEVICE operation? For me permanent is the opposite of temporary. E.g. if I restart the device, it stays there. English is not my first language and this may be bad interpretation from me, but I would much prefer a clearer message, stating specifically that this can be done only once per device.

In addition, if it is not possible to successfully migrate from or to EZSP I would love for the migration tool to warn about this circumstance. I would like to believe that there will be quite a few people migrating to Sky Connect in the next weeks and months and this could save them a lot of trouble.

If I am not the only one who feels that way I would open 2 issues, one for the IEEE burn warning and one for the EZSP migration warning. Would like to hear what you think first, though

patrickschoenherr commented 1 year ago

Are there any ideas why the integration isn't working, aside from the problem with ezsp migration etc.? I didn't migrate from or to an other devise/stick. The problems for me occurred after updating to 2022.12.

dumpfheimer commented 1 year ago

Did you change anything else? Like coordinator firmware? Does it work when you revert to 2022.11?

devployment commented 1 year ago

Can second what @patrickschoenherr said. Same for me. No change whatsoever. No device added. No firmware upgrade on the coordinator. Nothing. Just updated HA. Since then, nothing works anymore related to ZHA.

Edit: Reverting to different version does not bring it back either

patrickschoenherr commented 1 year ago

I didn't change anything else. I just realized the problems when I tried to connect some Hue lights. Those didn't react as expected.

I didn't try an downgrade to 2022.11 yet. Maybe I'll give it a try on the weekend. Before I do this I wanted to find out if other users experienced the same problems.

dumpfheimer commented 1 year ago

Could you turn on debug logging and check for log messages in this form?

Detected Z-Stack ... Connection to ... failed, cleaning up We were disconnected from ...

MattWestb commented 1 year ago

Most of the problems if running virtual or docker environment is connecting USB devices that the host system is not doing OK or one USB device is "rebooting" and the host system is losing it and then the upper system cant connecting to it until one hard reboot of the host system. The second problem if is getting radio interference from nearby devices like USB-SSD. Use USB2 ports and extension cable for getting the radio antenna away from all 2.4 GHz sources like WiFi. Bt, USB and other radio signals from the host system.

Im not one code worrier but have knowledge and experience of hardware and radio and have working with it pritty long time with it but debug logging is A&O for our devs can getting information for diagnose what is not working OK in your systems.

MattWestb commented 1 year ago

@dumpfheimer I agree with you and we was having one very long time ZHA user that was starting with one EM3581 stick and have migrating his system more times and was trying using one coordinator (with the burned IEEE that was the same as the original and the current coordinator) as router and in the end the hi must forming one new network and rejoining all devices for getting it back "in the air" and cant using the old coordinator in the same radio distance ad the production system for not killing it (IEEE conflict).

Our @puddly you is knowing the problematic and have making the migrating tool and seen some problems then migrating to and from EZSP. I think as dumpfheimer we need more info to the user and also more warnings so the user is knowing what can happening. If not i suggesting only migrating EZSP to / from EZSP for not getting broken Zigbee networks.

Perhaps checking if the EZSP is having TC-Link keys in the token storage (= OK migrating to "other") or have the seed for hashed ones in it and not migrating if its not using the same methods or throwing more warnings to the user.

By the way im not native Ger or Eng only one "old swede" (Alte Swe....) and is writing as im speaking ;-((

patrickschoenherr commented 1 year ago

@dumpfheimer Here are the logs I found with the keywords you have mentioned:

edit: removed logs

dumpfheimer commented 1 year ago

You might want to remove the logs by editing your comment to remove sensitive information.

Up until here there is nothing standing out to me. There should be a "We were disconnected from ..." Message at some point when it stops working

patrickschoenherr commented 1 year ago

@dumpfheimer

Maybe I found a solution. I had the edl21 integration left in the configuration.yaml heading also to the serial port: /dev/ttyUSB0. I removed the lines and rebooted the system.

ZHA integration is running in the moment without any failures. I will test and report here.

dumpfheimer commented 1 year ago

That could definitely be the problem

MattWestb commented 1 year ago

Great finding and fix @patrickschoenherr !!!

If you like deleting your log permanent click on the edit and delete the first version and its real gone ;-))

patrickschoenherr commented 1 year ago

@MattWestb Thanks for the hint. I'm not quite familiar with github yet.

ZHA is still running without any log entries and after several restarts. I think the problem is solved for me.

Should I close the issue though other topics (ezsp migration) were discussed?

MattWestb commented 1 year ago

@patrickschoenherr You have solving your problem so the best closing the issue then all other things is not direct related to your problem. You can leaving it open but i think better user making new issues with there problems and not making mega issues that is not possible following with all different coordinators / hardware in the mix.

The migrating problematic is better making one new discussion for and its also possible comment here after you have closing it if some user like to do.