home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Couldn't start EZSP = Silicon Labs EmberZNet protocol + Error setting up entry SONOFF Zigbee 3.0 USB Dongle Plus V2 #86435

Closed mgutt closed 1 year ago

mgutt commented 1 year ago

The problem

All devices connected through the Sonoff Zigbee Gateway (EFR32MG21) are unreachable after the HA container restarts.

image

This happens randomly after my backup process automatically stops and starts the container at 02:30 at night.

All devices are working until then, as they are reporting power consumption etc:

image

So it must be related to the restart.

Strangely, it can be solved by restarting the container again.

This has now happened for the fifth time since I installed HA (2022/09).

1.) Any idea how to solve this, or could this be a bug in HA?
2.) Is there a CLI command or API URL that returns the status of the Zigbee gateway or of a Zigbee device, so that I can trigger an additional restart?
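
For question 2, Home Assistant's REST API can return the state of a single entity at `/api/states/<entity_id>`. The following is a minimal sketch, not an official health check. The helper names `extract_state` and `ha_entity_state` are made up for illustration, the URL, token and entity ID in the usage line are placeholders, and the `grep`/`sed` parsing assumes that the top-level `state` key appears before any attribute of the same name.

```bash
# Extract the value of the first "state" key from a JSON state object on stdin.
# Assumption: the top-level "state" key precedes any attribute named "state".
extract_state() {
  grep -o '"state": *"[^"]*"' | head -n 1 | sed 's/^"state": *"//; s/"$//'
}

# Query one entity via the REST API and print its state.
# Arguments: base URL, long-lived access token, entity ID.
ha_entity_state() {
  curl -sS -H "Authorization: Bearer $2" "$1/api/states/$3" | extract_state
}

# Usage (all values are placeholders):
# ha_entity_state "https://ha.example.com" "ABCDEF" "switch.lumi_lumi_plug_maeu01_switch_12"
```

If the response contains no `state` key, `extract_state` prints nothing, so an empty result or the value `unavailable` could both be treated as a reason to restart.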

What version of Home Assistant Core has the issue?

2023.1.4

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Container

Integration causing the issue

No response

Link to integration documentation on our website

No response

Diagnostics information

2023-01-23 02:30:59.039 ERROR (bellows.thread_0) [bellows.uart] CRC error in frame b'e070381c0b26bfe550458a4dbb649f46bef6d6d2f46aaeb2509a69f2f6059c5fc0c77e' (b'c0c7' != b'994b')
2023-01-23 02:30:59.042 ERROR (bellows.thread_0) [bellows.uart] CRC error in frame b'0418b1a9112a15b65854b624ab5593499c589362f1ff9874beda3c9f46bef7ccd2f46aaeb2509a690350059c5fa8f97e' (b'a8f9' != b'0e5d')
2023-01-23 02:31:05.418 ERROR (MainThread) [zigpy.application] Couldn't start application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 146, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2023-01-23 02:31:05.435 WARNING (MainThread) [homeassistant.components.zha.core.gateway] Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 1 of 3)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 174, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 169, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 146, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2023-01-23 02:31:08.570 ERROR (bellows.thread_0) [bellows.uart] CRC error in frame b'c8a7aaf7ba7f8ffcefc34deb698c46233cd557f404aae3bb419840638488e070381c0b26b8e550458a4dbb649f46bef7d7d2f46aaeb2509a690bdf059c5f6e5b7e' (b'6e5b' != b'49d8')
2023-01-23 02:31:14.974 ERROR (MainThread) [zigpy.application] Couldn't start application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 146, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2023-01-23 02:31:14.980 WARNING (MainThread) [homeassistant.components.zha.core.gateway] Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 2 of 3)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 174, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 169, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 146, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2023-01-23 02:31:24.522 ERROR (MainThread) [zigpy.application] Couldn't start application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 146, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2023-01-23 02:31:24.529 WARNING (MainThread) [homeassistant.components.zha.core.gateway] Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 3 of 3)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 174, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 169, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 146, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2023-01-23 02:31:24.533 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry SONOFF Zigbee 3.0 USB Dongle Plus V2 for zha
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 382, in async_setup
    result = await component.async_setup_entry(hass, self)
  File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 111, in async_setup_entry
    await zha_gateway.async_initialize()
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 189, in async_initialize
    raise exc
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 174, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 169, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 146, in startup
    await self.initialize(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in initialize
    await self.load_network_info(load_devices=False)
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 209, in load_network_info
    status, node_type, nwk_params = await ezsp.getNetworkParameters()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2023-01-23 02:31:24.693 ERROR (MainThread) [homeassistant.components.automation.drucker_einschalten] Got error 'Unable to get zha device 1148ae3ac8ca1a4cd5e118a0a524c314' when setting up triggers for Drucker ein (Taster)
2023-01-23 06:00:00.188 WARNING (MainThread) [homeassistant.helpers.service] Unable to find referenced entities switch.lumi_lumi_plug_maeu01_switch_12 or it is/they are currently not available
2023-01-23 06:00:00.202 WARNING (MainThread) [homeassistant.helpers.service] Unable to find referenced entities switch.lumi_lumi_plug_maeu01_switch_17 or it is/they are currently not available
2023-01-23 06:00:00.350 WARNING (MainThread) [homeassistant.helpers.service] Unable to find referenced entities switch.lumi_lumi_plug_maeu01_switch_20 or it is/they are currently not available
2023-01-23 06:00:00.428 WARNING (MainThread) [homeassistant.helpers.service] Unable to find referenced entities switch.lumi_lumi_plug_maeu01_switch_11 or it is/they are currently not available
2023-01-23 07:30:00.073 WARNING (MainThread) [homeassistant.helpers.service] Unable to find referenced entities switch.lumi_lumi_plug_maeu01_switch_20 or it is/they are currently not available
2023-01-23 08:00:00.095 WARNING (MainThread) [homeassistant.helpers.service] Unable to find referenced entities switch.lumi_lumi_plug_maeu01_switch_10 or it is/they are currently not available
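
The failure leaves a recognizable signature in the log: `bellows.uart` CRC errors, three failed start attempts, and finally `Error setting up entry ... for zha`. As a rough sketch, these lines could be detected with `grep`. The function names are hypothetical, and the log path in the usage comment is an assumption that depends on where the container's config directory is mounted.

```bash
# Count bellows CRC errors in a log read from stdin.
# "|| true" keeps the exit status at 0 when grep finds no match (it still prints 0).
count_crc_errors() {
  grep -c 'CRC error in frame' || true
}

# Succeed (exit status 0) if the log on stdin shows that the zha config entry failed to set up.
zha_setup_failed() {
  grep -q 'Error setting up entry .* for zha'
}

# Usage (the path is an assumption):
# if zha_setup_failed < /path/to/config/home-assistant.log; then echo "ZHA did not start"; fi
```

A check like this only sees what was written to the log, so it complements rather than replaces a query of the live entity states.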

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

mgutt commented 1 year ago

To answer question 2 myself: until this is fixed, I run the following script every 5 minutes on the Docker host. It counts occurrences of the string "unavailable" in the response of the API states endpoint and restarts the container if the count exceeds 100:

#!/bin/bash

# #####################################
# SETTINGS
# #####################################

# Long Lived Access Token (https://www.home-assistant.io/docs/authentication/#your-account-profile)
ha_token="ABCDEF"

# URL to your Home-Assistant GUI
ha_hostname="https://ha.example.com"

# Restart Home-Assistant if more than this amount of devices are unavailable
ha_devices_offline_limit=100

# #####################################
# SCRIPT
# #####################################

# make script race condition safe (lock directory named after the script path, slashes replaced by underscores)
if [[ -d "/tmp/${0//\//_}" ]] || ! mkdir "/tmp/${0//\//_}"; then exit 1; fi; trap 'rmdir "/tmp/${0//\//_}"' EXIT;

# obtain container id
ha_containerid=$(docker container ls | grep -F "homeassistant/" | cut -d" " -f1)

# container needs to run at least 60 seconds
# (inspect the container found above instead of a hard-coded container name)
ha_container_started=$(docker inspect --format='{{.State.StartedAt}}' "$ha_containerid")
ha_container_uptime=$(( $(date +%s) - $(date --date="$ha_container_started" +%s) ))
if [[ $ha_container_uptime -lt 60 ]]; then
  echo "Home-Assistant is not running long enough ($ha_container_uptime seconds)."
  exit
fi

# count unavailable devices
ha_devices_offline_count=$(curl -sS -X GET -H "Authorization: Bearer $ha_token" "$ha_hostname/api/states" | grep -o unavailable | wc -l)
echo "Found $ha_devices_offline_count unavailable devices."

# too many unavailable devices cause container restart
if [[ $ha_devices_offline_count -gt $ha_devices_offline_limit ]]; then

  # restart container
  docker container restart "$ha_containerid"

  # notification
  if [[ -f /usr/local/emhttp/webGui/scripts/notify ]]; then
    /usr/local/emhttp/webGui/scripts/notify -i "normal" -s "Home-Assistant restarted" -d "There were too many unavailable devices!"
  fi

fi
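
Counting every occurrence of the bare string `unavailable` can also match attribute values or entity names. A slightly stricter variant, sketched here under the assumption that each entity in the `/api/states` response carries a `"state"` key with a string value, counts only `"state": "unavailable"` pairs. The function name `count_unavailable_states` is hypothetical.

```bash
# Count entities whose state is exactly "unavailable" in a JSON states response on stdin.
# The optional space after the colon covers both compact and spaced JSON output.
# "|| true" keeps the pipeline from failing when grep finds no match.
count_unavailable_states() {
  { grep -o '"state": *"unavailable"' || true; } | wc -l | tr -d ' '
}

# Usage inside the script above (variable names as in the script):
# ha_devices_offline_count=$(curl -sS -H "Authorization: Bearer $ha_token" "$ha_hostname/api/states" | count_unavailable_states)
```

Because this counts entities rather than string occurrences, the threshold in `ha_devices_offline_limit` would likely need to be lowered to match.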

home-assistant[bot] commented 1 year ago

Hey there @dmulcahey, @adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `zha` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Change the title of the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign zha` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


issue-triage-workflows[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.