home-assistant / core

Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

ZHA loses connectivity to some devices after some time #102252

Closed martinw72 closed 4 months ago

martinw72 commented 9 months ago

The problem

After a while, ZHA loses connectivity to around 30% of my devices. Most of them are Aqara and Ikea devices, such as motion sensors and multisensors. An integration restart solves the problem. The problem started with the update to 2023.10.

What version of Home Assistant Core has the issue?

2023.10

What was the last working version of Home Assistant Core?

2023.9.1

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ZHA with Skylink

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

home-assistant[bot] commented 9 months ago

Hey there @dmulcahey, @adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `zha` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopens the issue.
- `@home-assistant unassign zha` Removes the current integration label and assignees on the issue; add the integration domain after the command.

(message by CodeOwnersMention)



aliceapps commented 9 months ago

I have the same issue, but since all my Zigbee devices are either Ikea or Aqara, it's safe to say ZHA loses connectivity to all devices (not always at the same time, though).

martinw72 commented 9 months ago

@aliceapps: With the latest update, 2023.10.5, I can see some improvement. The time between a reload of the integration and the failure has increased. Not all my Aqara devices are affected. In addition, some Beseed light switches and wall plugs are affected, too!

puddly commented 9 months ago

Can both of you upload diagnostics JSON for the ZHA integration?
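
For anyone comparing reports: the counters in a ZHA diagnostics download show up as `repr` strings rather than plain numbers, so a little parsing helps. Below is a minimal sketch, assuming the `Counter(name=..., _raw_value=..., ...)` format seen in the diagnostics posted in this thread. The helper names `raw_value` and `failure_ratio` are made up for illustration and are not part of ZHA or zigpy, and the two counter names used here only appear under `controller_app_counters` in the EZSP (bellows) report.

```python
import re

# Matches the `_raw_value=<int>` field inside a zigpy Counter repr string.
_RAW_VALUE = re.compile(r"_raw_value=(-?\d+)")


def raw_value(counter_entry: dict) -> int:
    """Extract the integer `_raw_value` from one diagnostics counter entry."""
    match = _RAW_VALUE.search(counter_entry["repr"])
    if match is None:
        raise ValueError(f"no _raw_value in {counter_entry['repr']!r}")
    return int(match.group(1))


def failure_ratio(app_counters: dict) -> float:
    """Share of unicast transmissions that failed (0.0 when nothing was sent)."""
    ok = raw_value(app_counters["unicast_tx_success"])
    failed = raw_value(app_counters["unicast_tx_failure"])
    total = ok + failed
    return failed / total if total else 0.0


# Two entries copied from the EZSP diagnostics posted later in this thread.
sample = {
    "unicast_tx_success": {
        "__type": "<class 'zigpy.state.Counter'>",
        "repr": "Counter(name='unicast_tx_success', _raw_value=1447, reset_count=0, _last_reset_value=0)",
    },
    "unicast_tx_failure": {
        "__type": "<class 'zigpy.state.Counter'>",
        "repr": "Counter(name='unicast_tx_failure', _raw_value=1799, reset_count=0, _last_reset_value=0)",
    },
}

print(f"unicast failure ratio: {failure_ratio(sample):.1%}")
```

A ratio like this is only a rough indicator; it says nothing about which devices or routes are failing.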

gkukurin commented 9 months ago

I have the same issue. Here is my diagnostics JSON:

{
  "home_assistant": {
    "installation_type": "Home Assistant OS",
    "version": "2023.10.5",
    "dev": false,
    "hassio": true,
    "virtualenv": false,
    "python_version": "3.11.5",
    "docker": true,
    "arch": "aarch64",
    "timezone": "Europe/Zagreb",
    "os_name": "Linux",
    "os_version": "6.1.21-v8",
    "supervisor": "2023.10.1",
    "host_os": "Home Assistant OS 11.1",
    "docker_version": "24.0.6",
    "chassis": "embedded",
    "run_as_root": true
  },
  "custom_components": {
    "homewhiz": {
      "version": "0.0.6",
      "requirements": [
        "bleak",
        "bleak_retry_connector",
        "dacite",
        "aiohttp",
        "bidict"
      ]
    },
    "hacs": {
      "version": "1.33.0",
      "requirements": [
        "aiogithubapi>=22.10.1"
      ]
    },
    "shelly": {
      "version": "1.0.5",
      "requirements": [
        "pyShelly==1.0.3",
        "paho-mqtt==1.6.1",
        "websocket-client"
      ]
    }
  },
  "integration_manifest": {
    "domain": "zha",
    "name": "Zigbee Home Automation",
    "after_dependencies": [
      "onboarding",
      "usb"
    ],
    "codeowners": [
      "@dmulcahey",
      "@adminiuga",
      "@puddly"
    ],
    "config_flow": true,
    "dependencies": [
      "file_upload"
    ],
    "documentation": "https://www.home-assistant.io/integrations/zha",
    "iot_class": "local_polling",
    "loggers": [
      "aiosqlite",
      "bellows",
      "crccheck",
      "pure_pcapy3",
      "zhaquirks",
      "zigpy",
      "zigpy_deconz",
      "zigpy_xbee",
      "zigpy_zigate",
      "zigpy_znp",
      "universal_silabs_flasher"
    ],
    "requirements": [
      "bellows==0.36.5",
      "pyserial==3.5",
      "pyserial-asyncio==0.6",
      "zha-quirks==0.0.105",
      "zigpy-deconz==0.21.1",
      "zigpy==0.57.2",
      "zigpy-xbee==0.18.3",
      "zigpy-zigate==0.11.0",
      "zigpy-znp==0.11.6",
      "universal-silabs-flasher==0.0.14",
      "pyserial-asyncio-fast==0.11"
    ],
    "usb": [
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*2652*",
        "known_devices": [
          "slae.sh cc2652rb stick"
        ]
      },
      {
        "vid": "1A86",
        "pid": "55D4",
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus v2"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ]
      },
      {
        "vid": "1A86",
        "pid": "7523",
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ]
      },
      {
        "vid": "1A86",
        "pid": "7523",
        "description": "*zigstar*",
        "known_devices": [
          "ZigStar Coordinators"
        ]
      },
      {
        "vid": "1CF1",
        "pid": "0030",
        "description": "*conbee*",
        "known_devices": [
          "Conbee II"
        ]
      },
      {
        "vid": "10C4",
        "pid": "8A2A",
        "description": "*zigbee*",
        "known_devices": [
          "Nortek HUSBZB-1"
        ]
      },
      {
        "vid": "0403",
        "pid": "6015",
        "description": "*zigate*",
        "known_devices": [
          "ZiGate+"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*zigate*",
        "known_devices": [
          "ZiGate"
        ]
      },
      {
        "vid": "10C4",
        "pid": "8B34",
        "description": "*bv 2010/10*",
        "known_devices": [
          "Bitron Video AV2010/10"
        ]
      }
    ],
    "zeroconf": [
      {
        "type": "_esphomelib._tcp.local.",
        "name": "tube*"
      },
      {
        "type": "_zigate-zigbee-gateway._tcp.local.",
        "name": "*zigate*"
      },
      {
        "type": "_zigstar_gw._tcp.local.",
        "name": "*zigstar*"
      },
      {
        "type": "_uzg-01._tcp.local.",
        "name": "uzg-01*"
      },
      {
        "type": "_slzb-06._tcp.local.",
        "name": "slzb-06*"
      }
    ],
    "is_built_in": true
  },
  "data": {
    "config": {
      "zigpy_config": {
        "network": {
          "channel": 15
        },
        "database_path": "/config/zigbee.db",
        "device": {
          "path": "/dev/serial/by-id/usb-Silicon_Labs_Sonoff_Zigbee_3.0_USB_Dongle_Plus_0001-if00-port0",
          "baudrate": 115200,
          "flow_control": null
        }
      },
      "device_config": {},
      "enable_quirks": true
    },
    "config_entry": {
      "entry_id": "04cf3a2a40b00bd45d6e267d3b9c2da2",
      "version": 3,
      "domain": "zha",
      "title": "Sonoff Zigbee 3.0 USB Dongle Plus - Sonoff Zigbee 3.0 USB Dongle Plus",
      "data": {
        "device": {
          "path": "/dev/serial/by-id/usb-Silicon_Labs_Sonoff_Zigbee_3.0_USB_Dongle_Plus_0001-if00-port0",
          "baudrate": 115200,
          "flow_control": null
        },
        "radio_type": "znp"
      },
      "options": {
        "custom_configuration": {
          "zha_options": {
            "enhanced_light_transition": true,
            "default_light_transition": 0,
            "light_transitioning_flag": true,
            "always_prefer_xy_color_mode": true,
            "group_members_assume_state": true,
            "enable_identify_on_join": true,
            "consider_unavailable_mains": 7200,
            "consider_unavailable_battery": 21600
          }
        }
      },
      "pref_disable_new_entities": false,
      "pref_disable_polling": false,
      "source": "usb",
      "unique_id": "**REDACTED**",
      "disabled_by": null
    },
    "application_state": {
      "node_info": {
        "nwk": 0,
        "ieee": "**REDACTED**",
        "logical_type": 0
      },
      "network_info": {
        "extended_pan_id": "**REDACTED**",
        "pan_id": 19044,
        "nwk_update_id": 6,
        "nwk_manager_id": 0,
        "channel": 15,
        "channel_mask": 32768,
        "security_level": 5,
        "network_key": "**REDACTED**",
        "tc_link_key": {
          "key": [
            90,
            105,
            103,
            66,
            101,
            101,
            65,
            108,
            108,
            105,
            97,
            110,
            99,
            101,
            48,
            57
          ],
          "tx_counter": 0,
          "rx_counter": 0,
          "seq": 0,
          "partner_ieee": "**REDACTED**"
        },
        "key_table": [],
        "children": [],
        "nwk_addresses": {},
        "stack_specific": {
          "zstack": {
            "tclk_seed": "33e769d80608d03047e7e482c37c2945"
          }
        },
        "metadata": {
          "zstack": {
            "TransportRev": 2,
            "ProductId": 1,
            "MajorRel": 2,
            "MinorRel": 7,
            "MaintRel": 1,
            "CodeRevision": 20230507,
            "BootloaderBuildType": 0,
            "BootloaderRevision": null
          }
        },
        "source": "zigpy-znp@0.11.6"
      },
      "counters": {
        "Retry_NONE": {
          "0": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name=0, _raw_value=196, reset_count=0, _last_reset_value=0)"
          }
        },
        "Retry_LastGoodRoute": {
          "2": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name=2, _raw_value=5, reset_count=0, _last_reset_value=0)"
          }
        }
      },
      "broadcast_counters": {},
      "device_counters": {},
      "group_counters": {}
    },
    "energy_scan": {
      "11": 0.0,
      "12": 22.745098039215687,
      "13": 16.862745098039216,
      "14": 0.0,
      "15": 67.05882352941177,
      "16": 0.0,
      "17": 7.0588235294117645,
      "18": 7.0588235294117645,
      "19": 9.803921568627452,
      "20": 62.745098039215684,
      "21": 63.92156862745098,
      "22": 54.11764705882353,
      "23": 8.235294117647058,
      "24": 0.0,
      "25": 0.0,
      "26": 78.43137254901961
    },
    "versions": {
      "bellows": "0.36.5",
      "zigpy": "0.57.2",
      "zigpy_deconz": "0.21.1",
      "zigpy_xbee": "0.18.3",
      "zigpy_znp": "0.11.6",
      "zigpy_zigate": "0.11.0",
      "zhaquirks": "0.0.105"
    }
  }
}
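
One thing that can be read off this report is how the `energy_scan` section compares with the configured channel. The sketch below ranks the channels, assuming that a higher scan value means a busier channel. That interpretation and the helper name `quietest_channels` are assumptions, not taken from the ZHA source. The values are copied from the JSON above.

```python
# energy_scan values copied from the diagnostics above (keys are Zigbee channels).
energy_scan = {
    "11": 0.0,
    "12": 22.745098039215687,
    "13": 16.862745098039216,
    "14": 0.0,
    "15": 67.05882352941177,
    "16": 0.0,
    "17": 7.0588235294117645,
    "18": 7.0588235294117645,
    "19": 9.803921568627452,
    "20": 62.745098039215684,
    "21": 63.92156862745098,
    "22": 54.11764705882353,
    "23": 8.235294117647058,
    "24": 0.0,
    "25": 0.0,
    "26": 78.43137254901961,
}
current_channel = 15  # from data.application_state.network_info.channel


def quietest_channels(scan: dict, count: int = 3) -> list:
    """Return the `count` channels with the lowest scan value, lowest first.

    Ties are broken by channel number so the result is deterministic.
    """
    ranked = sorted(scan.items(), key=lambda item: (item[1], int(item[0])))
    return [int(channel) for channel, _ in ranked[:count]]


current_reading = energy_scan[str(current_channel)]
# Count how many channels report a higher value than the one in use.
busier = sum(1 for value in energy_scan.values() if value > current_reading)

print(f"channel {current_channel}: {current_reading:.1f}")
print(f"channels with a higher reading: {busier}")
print(f"lowest readings: {quietest_channels(energy_scan)}")
```

The reading on the active channel may include the network's own traffic, so this is one more data point rather than evidence that a channel change would fix the dropouts.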

gkukurin commented 9 months ago

I also found this error in the logs, hopefully it will give a clue about the issue:

Logger: homeassistant.components.zha.core.cluster_handlers
Source: components/zha/core/cluster_handlers/__init__.py:537
Integration: Zigbee Home Automation (documentation, issues)
First occurred: 1:24:03 PM (1 occurrences)
Last logged: 1:24:03 PM

[0x1F1D:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>'), DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>')]
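
For reading entries like this one: the bracketed prefix appears to follow a `[network address:endpoint:cluster ID]` layout, and cluster `0x0006` is the ZCL On/Off cluster. Here is a small sketch that splits the prefix and counts the failed attempts. The prefix layout is inferred from the log line rather than confirmed against the ZHA source, and `parse_prefix` is a hypothetical helper.

```python
import re

# Rebuild the log line from above: four identical DeliveryError entries.
_ERROR = "DeliveryError('Request failed after 5 attempts: <Status.NWK_INVALID_REQUEST: 194>')"
LOG_LINE = (
    "[0x1F1D:1:0x0006]: async_initialize: all attempts have failed: ["
    + ", ".join([_ERROR] * 4)
    + "]"
)

# Assumed layout of the prefix: [nwk address : endpoint : cluster id].
PREFIX = re.compile(r"^\[(0x[0-9A-Fa-f]+):(\d+):(0x[0-9A-Fa-f]+)\]")
# Status codes as they are printed inside each DeliveryError.
STATUS = re.compile(r"<Status\.(\w+): (\d+)>")


def parse_prefix(line: str) -> dict:
    """Split the bracketed prefix into integer nwk, endpoint and cluster fields."""
    match = PREFIX.match(line)
    if match is None:
        raise ValueError("line does not start with a [nwk:endpoint:cluster] prefix")
    nwk, endpoint, cluster = match.groups()
    return {"nwk": int(nwk, 16), "endpoint": int(endpoint), "cluster": int(cluster, 16)}


info = parse_prefix(LOG_LINE)
statuses = STATUS.findall(LOG_LINE)

print(f"nwk=0x{info['nwk']:04X} endpoint={info['endpoint']} cluster=0x{info['cluster']:04X}")
print(f"{len(statuses)} failed attempts, statuses seen: {sorted(set(statuses))}")
```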

sqrt-1764 commented 9 months ago

Same for me:

Hardware: Home Assistant Yellow (1 week old)
Affected devices: Silvercrest plug, Ikea smart bulbs

While those devices did not react, my Aqara TRV (lumi.airrtc.agl001) did work, and the IKEA dimmer also showed up in the logs. After some time (without restarting or reloading), the Silvercrest plug and the Ikea bulbs reacted again.

Associated log entry:

Logger: homeassistant.components.websocket_api.http.connection
Source: components/websocket_api/commands.py:226
Integration: Home Assistant WebSocket API (documentation, issues)
First occurred: 14:17:42 (9 occurrences)
Last logged: 22:19:39

[547440328000] Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
[547473561280] Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
[548018987072] Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 64, in wrap_zigpy_exceptions
    yield
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 84, in wrapper
    return await RETRYABLE_REQUEST_DECORATOR(func)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/util.py", line 132, in retry
    return await func()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/quirks/__init__.py", line 199, in command
    return await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/zcl/__init__.py", line 377, in request
    return await self._endpoint.request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/endpoint.py", line 253, in request
    return await self.device.request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/device.py", line 293, in request
    await self._application.request(
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 828, in request
    await self.send_packet(
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 870, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 226, in handle_call_service
    await hass.services.async_call(
  File "/usr/src/homeassistant/homeassistant/core.py", line 2012, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2049, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 235, in handle_service
    return await service.entity_service_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 876, in entity_service_call
    response_data = await _handle_entity_call(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 948, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/switch.py", line 88, in async_turn_on
    await self._on_off_cluster_handler.turn_on()
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/general.py", line 388, in turn_on
    result = await self.on()
             ^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 83, in wrapper
    with wrap_zigpy_exceptions():
  File "/usr/local/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 75, in wrap_zigpy_exceptions
    raise HomeAssistantError(message) from exc
homeassistant.exceptions.HomeAssistantError: Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
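
The two tracebacks are joined by "The above exception was the direct cause of the following exception", which is what Python prints when an exception is re-raised with `raise ... from exc`. Below is a self-contained sketch of that pattern. It uses stand-in exception classes rather than the real zigpy and Home Assistant ones, and the names `wrap_delivery_errors` and `turn_on` are illustrative. It mirrors the shape visible in the traceback (a context manager that converts a delivery error) but is not the actual ZHA code.

```python
import contextlib


class DeliveryError(Exception):
    """Stand-in for zigpy.exceptions.DeliveryError."""


class HomeAssistantError(Exception):
    """Stand-in for homeassistant.exceptions.HomeAssistantError."""


@contextlib.contextmanager
def wrap_delivery_errors():
    """Convert a low-level delivery failure into a user-facing error."""
    try:
        yield
    except DeliveryError as exc:
        # `from exc` stores the original error as __cause__, which produces
        # the "direct cause" link between the two tracebacks.
        raise HomeAssistantError(f"Failed to send request: {exc}") from exc


def turn_on():
    """Pretend to send an On command that the radio fails to deliver."""
    with wrap_delivery_errors():
        raise DeliveryError(
            "Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>"
        )


caught = None
try:
    turn_on()
except HomeAssistantError as err:
    caught = err  # keep a reference; `err` is unbound after the except block
    print(err)
    print("caused by:", type(err.__cause__).__name__)
```

In other words, the `HomeAssistantError` in the log is only the wrapper; the failure itself is the `DELIVERY_FAILED` status reported at the bottom of the first traceback.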

Diagnostics:

{
  "home_assistant": {
    "installation_type": "Home Assistant OS",
    "version": "2023.10.5",
    "dev": false,
    "hassio": true,
    "virtualenv": false,
    "python_version": "3.11.5",
    "docker": true,
    "arch": "aarch64",
    "timezone": "Europe/Berlin",
    "os_name": "Linux",
    "os_version": "6.1.21-v8",
    "supervisor": "2023.10.1",
    "host_os": "Home Assistant OS 11.1",
    "docker_version": "24.0.6",
    "chassis": "embedded",
    "run_as_root": true
  },
  "custom_components": {},
  "integration_manifest": {
    "domain": "zha",
    "name": "Zigbee Home Automation",
    "after_dependencies": [
      "onboarding",
      "usb"
    ],
    "codeowners": [
      "@dmulcahey",
      "@adminiuga",
      "@puddly"
    ],
    "config_flow": true,
    "dependencies": [
      "file_upload"
    ],
    "documentation": "https://www.home-assistant.io/integrations/zha",
    "iot_class": "local_polling",
    "loggers": [
      "aiosqlite",
      "bellows",
      "crccheck",
      "pure_pcapy3",
      "zhaquirks",
      "zigpy",
      "zigpy_deconz",
      "zigpy_xbee",
      "zigpy_zigate",
      "zigpy_znp",
      "universal_silabs_flasher"
    ],
    "requirements": [
      "bellows==0.36.5",
      "pyserial==3.5",
      "pyserial-asyncio==0.6",
      "zha-quirks==0.0.105",
      "zigpy-deconz==0.21.1",
      "zigpy==0.57.2",
      "zigpy-xbee==0.18.3",
      "zigpy-zigate==0.11.0",
      "zigpy-znp==0.11.6",
      "universal-silabs-flasher==0.0.14",
      "pyserial-asyncio-fast==0.11"
    ],
    "usb": [
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*2652*",
        "known_devices": [
          "slae.sh cc2652rb stick"
        ]
      },
      {
        "vid": "1A86",
        "pid": "55D4",
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus v2"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ]
      },
      {
        "vid": "1A86",
        "pid": "7523",
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ]
      },
      {
        "vid": "1A86",
        "pid": "7523",
        "description": "*zigstar*",
        "known_devices": [
          "ZigStar Coordinators"
        ]
      },
      {
        "vid": "1CF1",
        "pid": "0030",
        "description": "*conbee*",
        "known_devices": [
          "Conbee II"
        ]
      },
      {
        "vid": "10C4",
        "pid": "8A2A",
        "description": "*zigbee*",
        "known_devices": [
          "Nortek HUSBZB-1"
        ]
      },
      {
        "vid": "0403",
        "pid": "6015",
        "description": "*zigate*",
        "known_devices": [
          "ZiGate+"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*zigate*",
        "known_devices": [
          "ZiGate"
        ]
      },
      {
        "vid": "10C4",
        "pid": "8B34",
        "description": "*bv 2010/10*",
        "known_devices": [
          "Bitron Video AV2010/10"
        ]
      }
    ],
    "zeroconf": [
      {
        "type": "_esphomelib._tcp.local.",
        "name": "tube*"
      },
      {
        "type": "_zigate-zigbee-gateway._tcp.local.",
        "name": "*zigate*"
      },
      {
        "type": "_zigstar_gw._tcp.local.",
        "name": "*zigstar*"
      },
      {
        "type": "_uzg-01._tcp.local.",
        "name": "uzg-01*"
      },
      {
        "type": "_slzb-06._tcp.local.",
        "name": "slzb-06*"
      }
    ],
    "is_built_in": true
  },
  "data": {
    "config": {
      "zigpy_config": {
        "ota": {
          "otau_directory": "/config/zigpy_ota",
          "ikea_provider": true
        },
        "database_path": "/config/zigbee.db",
        "device": {
          "path": "/dev/ttyAMA1",
          "baudrate": 115200,
          "flow_control": "software"
        }
      },
      "device_config": {},
      "enable_quirks": true
    },
    "config_entry": {
      "entry_id": "ff2dfa6334576ce90d15b99477f65b39",
      "version": 3,
      "domain": "zha",
      "title": "Yellow Zigbee module - Nabu Casa",
      "data": {
        "device": {
          "path": "/dev/ttyAMA1",
          "baudrate": 115200,
          "flow_control": "software"
        },
        "radio_type": "ezsp"
      },
      "options": {},
      "pref_disable_new_entities": false,
      "pref_disable_polling": false,
      "source": "user",
      "unique_id": null,
      "disabled_by": null
    },
    "application_state": {
      "node_info": {
        "nwk": 0,
        "ieee": "**REDACTED**",
        "logical_type": 0
      },
      "network_info": {
        "extended_pan_id": "**REDACTED**",
        "pan_id": 41753,
        "nwk_update_id": 0,
        "nwk_manager_id": 0,
        "channel": 20,
        "channel_mask": 134215680,
        "security_level": 5,
        "network_key": "**REDACTED**",
        "tc_link_key": {
          "key": [
            90,
            105,
            103,
            66,
            101,
            101,
            65,
            108,
            108,
            105,
            97,
            110,
            99,
            101,
            48,
            57
          ],
          "tx_counter": 8192,
          "rx_counter": 0,
          "seq": 0,
          "partner_ieee": "**REDACTED**"
        },
        "key_table": [],
        "children": [],
        "nwk_addresses": {},
        "stack_specific": {
          "ezsp": {
            "hashed_tclk": "ef09febcead6bcfdd9ba9d10084bb56e"
          }
        },
        "metadata": {
          "ezsp": {
            "manufacturer": "Nabu Casa",
            "board": "Yellow v1.3",
            "version": "6.10.3.0 build 297",
            "stack_version": 8,
            "can_burn_userdata_custom_eui64": false,
            "can_rewrite_custom_eui64": false
          }
        },
        "source": "bellows@0.36.5"
      },
      "counters": {
        "controller_app_counters": {
          "unicast_rx": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_rx', _raw_value=16024, reset_count=0, _last_reset_value=0)"
          },
          "unicast_tx_success": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_tx_success', _raw_value=1447, reset_count=0, _last_reset_value=0)"
          },
          "broadcast_tx_success_unexpected": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='broadcast_tx_success_unexpected', _raw_value=140, reset_count=0, _last_reset_value=0)"
          },
          "unicast_tx_failure": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_tx_failure', _raw_value=1799, reset_count=0, _last_reset_value=0)"
          },
          "unicast_tx_success_duplicate": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_tx_success_duplicate', _raw_value=1, reset_count=0, _last_reset_value=0)"
          },
          "broadcast_rx": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='broadcast_rx', _raw_value=140, reset_count=0, _last_reset_value=0)"
          },
          "broadcast_tx_failure_unexpected": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='broadcast_tx_failure_unexpected', _raw_value=2, reset_count=0, _last_reset_value=0)"
          }
        },
        "ezsp_counters": {
          "MAC_RX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_RX_BROADCAST', _raw_value=401, reset_count=146, _last_reset_value=65313)"
          },
          "MAC_TX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_BROADCAST', _raw_value=238, reset_count=146, _last_reset_value=31320)"
          },
          "MAC_RX_UNICAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_RX_UNICAST', _raw_value=523, reset_count=146, _last_reset_value=86334)"
          },
          "MAC_TX_UNICAST_SUCCESS": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_UNICAST_SUCCESS', _raw_value=349, reset_count=146, _last_reset_value=37096)"
          },
          "MAC_TX_UNICAST_RETRY": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_UNICAST_RETRY', _raw_value=3, reset_count=146, _last_reset_value=353)"
          },
          "MAC_TX_UNICAST_FAILED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_UNICAST_FAILED', _raw_value=0, reset_count=146, _last_reset_value=3)"
          },
          "APS_DATA_RX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_RX_BROADCAST', _raw_value=4, reset_count=146, _last_reset_value=138)"
          },
          "APS_DATA_TX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_BROADCAST', _raw_value=4, reset_count=146, _last_reset_value=138)"
          },
          "APS_DATA_RX_UNICAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_RX_UNICAST', _raw_value=105, reset_count=146, _last_reset_value=15917)"
          },
          "APS_DATA_TX_UNICAST_SUCCESS": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_UNICAST_SUCCESS', _raw_value=26, reset_count=146, _last_reset_value=1421)"
          },
          "APS_DATA_TX_UNICAST_RETRY": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_UNICAST_RETRY', _raw_value=64, reset_count=146, _last_reset_value=3611)"
          },
          "APS_DATA_TX_UNICAST_FAILED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_UNICAST_FAILED', _raw_value=32, reset_count=146, _last_reset_value=1767)"
          },
          "ROUTE_DISCOVERY_INITIATED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ROUTE_DISCOVERY_INITIATED', _raw_value=40, reset_count=146, _last_reset_value=3656)"
          },
          "NEIGHBOR_ADDED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NEIGHBOR_ADDED', _raw_value=0, reset_count=146, _last_reset_value=2)"
          },
          "NEIGHBOR_REMOVED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NEIGHBOR_REMOVED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "NEIGHBOR_STALE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NEIGHBOR_STALE', _raw_value=2, reset_count=146, _last_reset_value=259)"
          },
          "JOIN_INDICATION": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='JOIN_INDICATION', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "CHILD_REMOVED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='CHILD_REMOVED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "ASH_OVERFLOW_ERROR": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ASH_OVERFLOW_ERROR', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "ASH_FRAMING_ERROR": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ASH_FRAMING_ERROR', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "ASH_OVERRUN_ERROR": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ASH_OVERRUN_ERROR', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "NWK_FRAME_COUNTER_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NWK_FRAME_COUNTER_FAILURE', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "APS_FRAME_COUNTER_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_FRAME_COUNTER_FAILURE', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "UTILITY": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='UTILITY', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "APS_LINK_KEY_NOT_AUTHORIZED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_LINK_KEY_NOT_AUTHORIZED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "NWK_DECRYPTION_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NWK_DECRYPTION_FAILURE', _raw_value=135, reset_count=146, _last_reset_value=19599)"
          },
          "APS_DECRYPTION_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DECRYPTION_FAILURE', _raw_value=0, reset_count=146, _last_reset_value=42)"
          },
          "ALLOCATE_PACKET_BUFFER_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ALLOCATE_PACKET_BUFFER_FAILURE', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "RELAYED_UNICAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='RELAYED_UNICAST', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PHY_TO_MAC_QUEUE_LIMIT_REACHED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PHY_TO_MAC_QUEUE_LIMIT_REACHED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PACKET_VALIDATE_LIBRARY_DROPPED_COUNT": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PACKET_VALIDATE_LIBRARY_DROPPED_COUNT', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "TYPE_NWK_RETRY_OVERFLOW": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='TYPE_NWK_RETRY_OVERFLOW', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PHY_CCA_FAIL_COUNT": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PHY_CCA_FAIL_COUNT', _raw_value=0, reset_count=146, _last_reset_value=47)"
          },
          "BROADCAST_TABLE_FULL": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='BROADCAST_TABLE_FULL', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PTA_LO_PRI_REQUESTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_LO_PRI_REQUESTED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PTA_HI_PRI_REQUESTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_HI_PRI_REQUESTED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PTA_LO_PRI_DENIED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_LO_PRI_DENIED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PTA_HI_PRI_DENIED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_HI_PRI_DENIED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PTA_LO_PRI_TX_ABORTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_LO_PRI_TX_ABORTED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "PTA_HI_PRI_TX_ABORTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_HI_PRI_TX_ABORTED', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "ADDRESS_CONFLICT_SENT": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ADDRESS_CONFLICT_SENT', _raw_value=0, reset_count=146, _last_reset_value=0)"
          },
          "EZSP_FREE_BUFFERS": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='EZSP_FREE_BUFFERS', _raw_value=128, reset_count=146, _last_reset_value=0)"
          }
        }
      },
      "broadcast_counters": {},
      "device_counters": {},
      "group_counters": {}
    },
    "energy_scan": {
      "11": 85.82097888710312,
      "12": 85.82097888710312,
      "13": 25.74050169409602,
      "14": 12.244260188723507,
      "15": 59.15797905332195,
      "16": 68.14622793558128,
      "17": 49.512515447068886,
      "18": 49.512515447068886,
      "19": 8.631361812931262,
      "20": 94.48255331375627,
      "21": 92.0598007161209,
      "22": 91.05606689948522,
      "23": 87.33047519856483,
      "24": 80.38447947821754,
      "25": 8.631361812931262,
      "26": 87.33047519856483
    },
    "versions": {
      "bellows": "0.36.5",
      "zigpy": "0.57.2",
      "zigpy_deconz": "0.21.1",
      "zigpy_xbee": "0.18.3",
      "zigpy_znp": "0.11.6",
      "zigpy_zigate": "0.11.0",
      "zhaquirks": "0.0.105"
    }
  }
}
sqrt-1764 commented 8 months ago

Bug still present in Core 2023.11.0

sqrt-1764 commented 8 months ago

Is there any update on the current status of this issue? I am using Zigbee for heating and lighting, so it is quite an essential component of my smart home.

puddly commented 8 months ago

@sqrt-1764 From your diagnostics, your network is on channel 20. Channel 20 is also 94% congested. You should be receiving a warning every time you start up ZHA that channel 20 is extremely noisy and what steps you can take to reduce the noise.

Make sure your 2.4GHz WiFi doesn't overlap with Zigbee (see here) and if that isn't enough, move your Zigbee network to another channel:

image
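To make the energy scan figures easier to compare, here is a minimal Python sketch that ranks the channels from the `energy_scan` section of a ZHA diagnostics download. The values are copied (rounded) from the diagnostics posted above. The function name `rank_channels` and the 75% cut-off are illustrative assumptions, not something ZHA defines.

```python
# Energy scan values in percent, rounded, from the diagnostics above.
energy_scan = {
    11: 85.82, 12: 85.82, 13: 25.74, 14: 12.24,
    15: 59.16, 16: 68.15, 17: 49.51, 18: 49.51,
    19: 8.63, 20: 94.48, 21: 92.06, 22: 91.06,
    23: 87.33, 24: 80.38, 25: 8.63, 26: 87.33,
}

# Assumption: an arbitrary cut-off chosen for this example only.
NOISY_THRESHOLD = 75.0


def rank_channels(scan):
    """Return (channel, energy) pairs sorted from quietest to noisiest."""
    return sorted(scan.items(), key=lambda item: (item[1], item[0]))


ranked = rank_channels(energy_scan)
quietest = [channel for channel, _ in ranked[:3]]
noisy = sorted(ch for ch, energy in energy_scan.items() if energy > NOISY_THRESHOLD)

print("quietest channels:", quietest)
print("channels above threshold:", noisy)
```

A single scan is only a snapshot, so it is probably worth repeating it at different times of day before picking a channel.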
panhans commented 8 months ago

The Zigbee utilization figure is misleading. After switching channels, the warning appears again because ZHA floods the frequencies by itself. It doesn't matter which channel I switch to; after switching I always get warnings.

puddly commented 8 months ago

@panhans Please upload diagnostics for the ZHA integration, which will contain energy scan results for your environment. If your coordinator is in a bad spot (i.e. near USB 3.0 ports, SSDs, or 2.4GHz wifi), no channel will be free of interference.

panhans commented 8 months ago

For now everything is fine. I fully restarted my host, unplugged my Sonoff Dongle-E (on a 2m extension cable) for a few minutes, and my external drive is now connected to the front of my mini PC. We will see whether message drops appear again. I never had issues with my old RaspBee on my Pi.

{
  "home_assistant": {
    "installation_type": "Home Assistant OS",
    "version": "2023.11.3",
    "dev": false,
    "hassio": true,
    "virtualenv": false,
    "python_version": "3.11.6",
    "docker": true,
    "arch": "x86_64",
    "timezone": "Europe/Berlin",
    "os_name": "Linux",
    "os_version": "6.1.59",
    "supervisor": "2023.11.3",
    "host_os": "Home Assistant OS 11.1",
    "docker_version": "24.0.6",
    "chassis": "vm",
    "run_as_root": true
  },
  "custom_components": {
    "deutschebahn": {
      "version": "2.0.4",
      "requirements": [
        "schiene==0.26"
      ]
    },
    "watchman": {
      "version": "0.5.1",
      "requirements": [
        "prettytable==3.0.0"
      ]
    },
    "alarmo": {
      "version": "v1.9.13",
      "requirements": []
    },
    "proxmoxve": {
      "version": "3.2.1",
      "requirements": [
        "proxmoxer==2.0.1"
      ]
    },
    "continuously_casting_dashboards": {
      "version": "1.2.6",
      "requirements": [
        "catt==0.12.11"
      ]
    },
    "ics": {
      "version": "20211212.01",
      "requirements": [
        "recurring-ical-events",
        "icalendar>=4.0.4",
        "tzlocal",
        "integrationhelper",
        "voluptuous",
        "python-dateutil>2.7.3"
      ]
    },
    "hacs": {
      "version": "1.33.0",
      "requirements": [
        "aiogithubapi>=22.10.1"
      ]
    },
    "xiaomi_cloud_map_extractor": {
      "version": "v2.2.0",
      "requirements": [
        "pillow",
        "pybase64",
        "python-miio",
        "requests",
        "pycryptodome"
      ]
    },
    "teamtracker": {
      "version": "0.1",
      "requirements": [
        "arrow",
        "aiofiles"
      ]
    },
    "dwd_weather": {
      "version": "v2.0.13",
      "requirements": [
        "simple_dwd_weatherforecast==2.0.24",
        "markdownify==0.6.5",
        "suntimes==1.1.2"
      ]
    },
    "presence_simulation": {
      "version": "3.2",
      "requirements": []
    },
    "dwd_pollenflug": {
      "version": "1.0.2",
      "requirements": [
        "pytz"
      ]
    },
    "browser_mod": {
      "version": "2.3.0",
      "requirements": []
    },
    "tgtg": {
      "version": "5.3.0",
      "requirements": [
        "tgtg==0.16.0"
      ]
    }
  },
  "integration_manifest": {
    "domain": "zha",
    "name": "Zigbee Home Automation",
    "after_dependencies": [
      "onboarding",
      "usb"
    ],
    "codeowners": [
      "@dmulcahey",
      "@adminiuga",
      "@puddly"
    ],
    "config_flow": true,
    "dependencies": [
      "file_upload"
    ],
    "documentation": "https://www.home-assistant.io/integrations/zha",
    "iot_class": "local_polling",
    "loggers": [
      "aiosqlite",
      "bellows",
      "crccheck",
      "pure_pcapy3",
      "zhaquirks",
      "zigpy",
      "zigpy_deconz",
      "zigpy_xbee",
      "zigpy_zigate",
      "zigpy_znp",
      "universal_silabs_flasher"
    ],
    "requirements": [
      "bellows==0.36.8",
      "pyserial==3.5",
      "pyserial-asyncio==0.6",
      "zha-quirks==0.0.106",
      "zigpy-deconz==0.21.1",
      "zigpy==0.59.0",
      "zigpy-xbee==0.19.0",
      "zigpy-zigate==0.11.0",
      "zigpy-znp==0.11.6",
      "universal-silabs-flasher==0.0.14",
      "pyserial-asyncio-fast==0.11"
    ],
    "usb": [
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*2652*",
        "known_devices": [
          "slae.sh cc2652rb stick"
        ]
      },
      {
        "vid": "1A86",
        "pid": "55D4",
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus v2"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ]
      },
      {
        "vid": "1A86",
        "pid": "7523",
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ]
      },
      {
        "vid": "1A86",
        "pid": "7523",
        "description": "*zigstar*",
        "known_devices": [
          "ZigStar Coordinators"
        ]
      },
      {
        "vid": "1CF1",
        "pid": "0030",
        "description": "*conbee*",
        "known_devices": [
          "Conbee II"
        ]
      },
      {
        "vid": "10C4",
        "pid": "8A2A",
        "description": "*zigbee*",
        "known_devices": [
          "Nortek HUSBZB-1"
        ]
      },
      {
        "vid": "0403",
        "pid": "6015",
        "description": "*zigate*",
        "known_devices": [
          "ZiGate+"
        ]
      },
      {
        "vid": "10C4",
        "pid": "EA60",
        "description": "*zigate*",
        "known_devices": [
          "ZiGate"
        ]
      },
      {
        "vid": "10C4",
        "pid": "8B34",
        "description": "*bv 2010/10*",
        "known_devices": [
          "Bitron Video AV2010/10"
        ]
      }
    ],
    "zeroconf": [
      {
        "type": "_esphomelib._tcp.local.",
        "name": "tube*"
      },
      {
        "type": "_zigate-zigbee-gateway._tcp.local.",
        "name": "*zigate*"
      },
      {
        "type": "_zigstar_gw._tcp.local.",
        "name": "*zigstar*"
      },
      {
        "type": "_uzg-01._tcp.local.",
        "name": "uzg-01*"
      },
      {
        "type": "_slzb-06._tcp.local.",
        "name": "slzb-06*"
      }
    ],
    "is_built_in": true
  },
  "data": {
    "config": {
      "enable_quirks": true,
      "custom_quirks_path": "/config/custom_zha_quirks/",
      "zigpy_config": {
        "ota": {
          "ikea_provider": true,
          "ledvance_provider": true,
          "salus_provider": true,
          "inovelli_provider": true,
          "thirdreality_provider": true
        },
        "database_path": "/config/zigbee.db",
        "device": {
          "path": "/dev/serial/by-id/usb-ITEAD_SONOFF_Zigbee_3.0_USB_Dongle_Plus_V2_20220714143838-if00",
          "flow_control": "software",
          "baudrate": 115200
        },
        "validate_network_settings": true
      },
      "device_config": {}
    },
    "config_entry": {
      "entry_id": "68cdc68c915ca64cf3f4bb1f851835a5",
      "version": 3,
      "domain": "zha",
      "title": "SONOFF Zigbee 3.0 USB Dongle Plus V2, s/n: 20220714143838 - ITEAD",
      "data": {
        "device": {
          "path": "/dev/serial/by-id/usb-ITEAD_SONOFF_Zigbee_3.0_USB_Dongle_Plus_V2_20220714143838-if00",
          "flow_control": "software",
          "baudrate": 115200
        },
        "radio_type": "ezsp"
      },
      "options": {},
      "pref_disable_new_entities": false,
      "pref_disable_polling": false,
      "source": "user",
      "unique_id": null,
      "disabled_by": null
    },
    "application_state": {
      "node_info": {
        "nwk": 0,
        "ieee": "**REDACTED**",
        "logical_type": 0
      },
      "network_info": {
        "extended_pan_id": "**REDACTED**",
        "pan_id": 42442,
        "nwk_update_id": 4,
        "nwk_manager_id": 0,
        "channel": 25,
        "channel_mask": 134215680,
        "security_level": 5,
        "network_key": "**REDACTED**",
        "tc_link_key": {
          "key": [
            90,
            105,
            103,
            66,
            101,
            101,
            65,
            108,
            108,
            105,
            97,
            110,
            99,
            101,
            48,
            57
          ],
          "tx_counter": 163840,
          "rx_counter": 0,
          "seq": 0,
          "partner_ieee": "**REDACTED**"
        },
        "key_table": [],
        "children": [],
        "nwk_addresses": {},
        "stack_specific": {
          "ezsp": {
            "hashed_tclk": "4c88fa0791544599e63f677c2dce928f"
          }
        },
        "metadata": {
          "ezsp": {
            "manufacturer": "",
            "board": "",
            "version": "6.10.3.0 build 297",
            "stack_version": 8,
            "can_burn_userdata_custom_eui64": true,
            "can_rewrite_custom_eui64": false
          }
        },
        "source": "bellows@0.36.8"
      },
      "counters": {
        "controller_app_counters": {
          "unicast_rx": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_rx', _raw_value=13173, reset_count=0, _last_reset_value=0)"
          },
          "unicast_tx_success": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_tx_success', _raw_value=3972, reset_count=0, _last_reset_value=0)"
          },
          "broadcast_tx_failure_unexpected": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='broadcast_tx_failure_unexpected', _raw_value=1, reset_count=0, _last_reset_value=0)"
          },
          "unicast_tx_failure": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_tx_failure', _raw_value=13, reset_count=0, _last_reset_value=0)"
          },
          "broadcast_tx_success_unexpected": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='broadcast_tx_success_unexpected', _raw_value=27, reset_count=0, _last_reset_value=0)"
          },
          "broadcast_rx": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='broadcast_rx', _raw_value=19, reset_count=0, _last_reset_value=0)"
          },
          "unicast_tx_success_unexpected": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_tx_success_unexpected', _raw_value=3, reset_count=0, _last_reset_value=0)"
          },
          "unicast_tx_failure_unexpected": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='unicast_tx_failure_unexpected', _raw_value=2, reset_count=0, _last_reset_value=0)"
          }
        },
        "ezsp_counters": {
          "MAC_RX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_RX_BROADCAST', _raw_value=83, reset_count=20, _last_reset_value=6131)"
          },
          "MAC_TX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_BROADCAST', _raw_value=45, reset_count=20, _last_reset_value=5194)"
          },
          "MAC_RX_UNICAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_RX_UNICAST', _raw_value=803, reset_count=20, _last_reset_value=44725)"
          },
          "MAC_TX_UNICAST_SUCCESS": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_UNICAST_SUCCESS', _raw_value=248, reset_count=20, _last_reset_value=8273)"
          },
          "MAC_TX_UNICAST_RETRY": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_UNICAST_RETRY', _raw_value=34, reset_count=20, _last_reset_value=2771)"
          },
          "MAC_TX_UNICAST_FAILED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='MAC_TX_UNICAST_FAILED', _raw_value=4, reset_count=20, _last_reset_value=631)"
          },
          "APS_DATA_RX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_RX_BROADCAST', _raw_value=0, reset_count=20, _last_reset_value=28)"
          },
          "APS_DATA_TX_BROADCAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_BROADCAST', _raw_value=0, reset_count=20, _last_reset_value=28)"
          },
          "APS_DATA_RX_UNICAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_RX_UNICAST', _raw_value=188, reset_count=20, _last_reset_value=12985)"
          },
          "APS_DATA_TX_UNICAST_SUCCESS": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_UNICAST_SUCCESS', _raw_value=150, reset_count=20, _last_reset_value=3825)"
          },
          "APS_DATA_TX_UNICAST_RETRY": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_UNICAST_RETRY', _raw_value=12, reset_count=20, _last_reset_value=514)"
          },
          "APS_DATA_TX_UNICAST_FAILED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DATA_TX_UNICAST_FAILED', _raw_value=0, reset_count=20, _last_reset_value=15)"
          },
          "ROUTE_DISCOVERY_INITIATED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ROUTE_DISCOVERY_INITIATED', _raw_value=1, reset_count=20, _last_reset_value=720)"
          },
          "NEIGHBOR_ADDED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NEIGHBOR_ADDED', _raw_value=0, reset_count=20, _last_reset_value=5)"
          },
          "NEIGHBOR_REMOVED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NEIGHBOR_REMOVED', _raw_value=0, reset_count=20, _last_reset_value=3)"
          },
          "NEIGHBOR_STALE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NEIGHBOR_STALE', _raw_value=0, reset_count=20, _last_reset_value=19)"
          },
          "JOIN_INDICATION": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='JOIN_INDICATION', _raw_value=0, reset_count=20, _last_reset_value=5)"
          },
          "CHILD_REMOVED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='CHILD_REMOVED', _raw_value=0, reset_count=20, _last_reset_value=2)"
          },
          "ASH_OVERFLOW_ERROR": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ASH_OVERFLOW_ERROR', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "ASH_FRAMING_ERROR": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ASH_FRAMING_ERROR', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "ASH_OVERRUN_ERROR": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ASH_OVERRUN_ERROR', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "NWK_FRAME_COUNTER_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NWK_FRAME_COUNTER_FAILURE', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "APS_FRAME_COUNTER_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_FRAME_COUNTER_FAILURE', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "UTILITY": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='UTILITY', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "APS_LINK_KEY_NOT_AUTHORIZED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_LINK_KEY_NOT_AUTHORIZED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "NWK_DECRYPTION_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='NWK_DECRYPTION_FAILURE', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "APS_DECRYPTION_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='APS_DECRYPTION_FAILURE', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "ALLOCATE_PACKET_BUFFER_FAILURE": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ALLOCATE_PACKET_BUFFER_FAILURE', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "RELAYED_UNICAST": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='RELAYED_UNICAST', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PHY_TO_MAC_QUEUE_LIMIT_REACHED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PHY_TO_MAC_QUEUE_LIMIT_REACHED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PACKET_VALIDATE_LIBRARY_DROPPED_COUNT": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PACKET_VALIDATE_LIBRARY_DROPPED_COUNT', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "TYPE_NWK_RETRY_OVERFLOW": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='TYPE_NWK_RETRY_OVERFLOW', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PHY_CCA_FAIL_COUNT": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PHY_CCA_FAIL_COUNT', _raw_value=0, reset_count=20, _last_reset_value=50)"
          },
          "BROADCAST_TABLE_FULL": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='BROADCAST_TABLE_FULL', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PTA_LO_PRI_REQUESTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_LO_PRI_REQUESTED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PTA_HI_PRI_REQUESTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_HI_PRI_REQUESTED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PTA_LO_PRI_DENIED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_LO_PRI_DENIED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PTA_HI_PRI_DENIED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_HI_PRI_DENIED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PTA_LO_PRI_TX_ABORTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_LO_PRI_TX_ABORTED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "PTA_HI_PRI_TX_ABORTED": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='PTA_HI_PRI_TX_ABORTED', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "ADDRESS_CONFLICT_SENT": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='ADDRESS_CONFLICT_SENT', _raw_value=0, reset_count=20, _last_reset_value=0)"
          },
          "EZSP_FREE_BUFFERS": {
            "__type": "<class 'zigpy.state.Counter'>",
            "repr": "Counter(name='EZSP_FREE_BUFFERS', _raw_value=213, reset_count=20, _last_reset_value=0)"
          }
        }
      },
      "broadcast_counters": {},
      "device_counters": {},
      "group_counters": {}
    },
    "energy_scan": {
      "11": 89.93931580241996,
      "12": 94.48255331375627,
      "13": 92.95959997754716,
      "14": 91.05606689948522,
      "15": 31.01324838787301,
      "16": 33.860880820104335,
      "17": 31.01324838787301,
      "18": 39.90320178295578,
      "19": 36.830390267097734,
      "20": 10.914542804728702,
      "21": 12.244260188723507,
      "22": 31.01324838787301,
      "23": 43.057636198227904,
      "24": 59.15797905332195,
      "25": 39.90320178295578,
      "26": 36.830390267097734
    },
    "versions": {
      "bellows": "0.36.8",
      "zigpy": "0.59.0",
      "zigpy_deconz": "0.21.1",
      "zigpy_xbee": "0.19.0",
      "zigpy_znp": "0.11.6",
      "zigpy_zigate": "0.11.0",
      "zhaquirks": "0.0.106"
    }
  }
}
sqrt-1764 commented 7 months ago

@sqrt-1764 From your diagnostics, your network is on channel 20. Channel 20 is also 94% congested. You should be receiving a warning every time you start up ZHA that channel 20 is extremely noisy and what steps you can take to reduce the noise.

Thank you for your feedback. I changed the channel to 26, as this is the only one that is not totally congested. Now it is far better. As the 2.4 GHz band is quite congested in urban areas, ZHA needs some better algorithms to cope with these situations. Screenshot_20231103-185008

Now I have some more questions ;-)

As this is off-topic here, I am looking for a proper place to discuss these issues.

puddly commented 7 months ago

As the 2.4 GHz band is quite congested in urban areas, ZHA needs some better algorithms to cope with these situations.

ZHA already picks the best 2.4GHz Zigbee channel in your environment when you form a new network, heavily prioritizing Zigbee channels that fall between the primary 2.4GHz WiFi channel sidebands (15, 20, 25).
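The reason channels 15, 20 and 25 fall between the usual WiFi channels can be checked with a little arithmetic. The sketch below uses the standard center-frequency formulas for 2.4 GHz Zigbee (IEEE 802.15.4) and WiFi channels. The 22 MHz WiFi width and 2 MHz Zigbee width are nominal figures assumed for illustration, and the function names are made up for this example; it does not reflect how ZHA itself scores channels.

```python
ZIGBEE_WIDTH_MHZ = 2   # assumption: nominal occupied bandwidth of one Zigbee channel
WIFI_WIDTH_MHZ = 22    # assumption: classic 22 MHz WiFi channel mask


def zigbee_center_mhz(channel):
    """Center frequency of a 2.4 GHz Zigbee channel (11 to 26)."""
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Center frequency of a 2.4 GHz WiFi channel (1 to 13)."""
    return 2412 + 5 * (channel - 1)


def overlaps(zigbee_channel, wifi_channel):
    """True if the two channels share any spectrum under the assumed widths."""
    z_lo = zigbee_center_mhz(zigbee_channel) - ZIGBEE_WIDTH_MHZ / 2
    z_hi = zigbee_center_mhz(zigbee_channel) + ZIGBEE_WIDTH_MHZ / 2
    w_lo = wifi_center_mhz(wifi_channel) - WIFI_WIDTH_MHZ / 2
    w_hi = wifi_center_mhz(wifi_channel) + WIFI_WIDTH_MHZ / 2
    return z_lo < w_hi and w_lo < z_hi


def free_zigbee_channels(wifi_channels):
    """Zigbee channels that overlap none of the given WiFi channels."""
    return [
        z for z in range(11, 27)
        if not any(overlaps(z, w) for w in wifi_channels)
    ]


# With WiFi on the common non-overlapping channels 1, 6 and 11:
print(free_zigbee_channels([1, 6, 11]))
```

Under these assumptions only Zigbee channels 15, 20, 25 and 26 avoid WiFi channels 1, 6 and 11. Real access points may use wider channels or other channel numbers, so an energy scan remains the better guide.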

The fundamental problem here is RF interference. Zigbee is a very quiet protocol compared to WiFi so it's a physical limitation and not something that can be solved from software, especially at the level ZHA is able to control the radio. If you are experiencing poor reliability, you need to optimize your environment for Zigbee, not the other way around.

It seems that ZHA is missing some algorithm to confirm that the commands are executed properly.

Zigbee commands all require acknowledgements from the device so we know if a command executes properly. They are also retried three times if they fail.
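As a generic illustration of the acknowledge-and-retry pattern described here, the following Python sketch retries a send until it is acknowledged and raises an error once the attempts are used up. This is not ZHA or zigpy code: `send_with_retries`, `DeliveryError` and the `send` callable are hypothetical names, and treating "three times" as three attempts in total is an assumption.

```python
import asyncio


class DeliveryError(Exception):
    """Raised when no acknowledgement arrives after all attempts."""


async def send_with_retries(send, attempts=3, delay=0.0):
    """Await send() until it succeeds; give up after `attempts` timeouts."""
    last_error = None
    for _ in range(attempts):
        try:
            # A successful return stands in for a received acknowledgement.
            return await send()
        except TimeoutError as exc:
            last_error = exc
            await asyncio.sleep(delay)
    raise DeliveryError(f"no acknowledgement after {attempts} attempts") from last_error


async def demo():
    calls = 0

    async def flaky_send():
        # Hypothetical device that only answers on the third attempt.
        nonlocal calls
        calls += 1
        if calls < 3:
            raise TimeoutError
        return "ack"

    return await send_with_retries(flaky_send), calls


print(asyncio.run(demo()))
```

The point of the pattern is that the caller either gets a confirmed result or an exception; a silent failure should not be possible at this layer.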

sqrt-1764 commented 7 months ago

@puddly I am familiar with the physics of RF (licensed ham radio operator). The problem is that the situation when creating the Zigbee network is only a snapshot. Some WiFi networks might be quiet at that moment. Also, the WiFi access points change their channels from time to time to adjust to the current WiFi situation. You can see from my screenshot that there is not really a chance to choose a free channel.

So what does your suggestion to optimize my environment for Zigbee mean? I only have control over my Network (guess which one it is ;-) ) - the other ones I can not control. In my house there are only 2 other networks, the rest is somewhere in the neighbourhood.

Zigbee commands all require acknowledgements from the device so we know if a command executes properly. They are also retried three times if they fail.

Today I had the following situation while testing an update to my heating automation: My 4 thermostats were in the off state. I then activated them and set a target temperature. All thermostats reported being switched on and set to that temperature. Then I sent the command to switch off the thermostats. All 4 thermostats reported being in the off state. But one thermostat continued to heat. Checking the controls on the thermostat itself revealed that it was still active and set to the previous temperature. When I repeated the command, this thermostat was also deactivated.

In the other instance (also changed to channel 26) I control only lights. I grouped the lights into Zigbee groups and control those groups from my automation. I observe that the status of the group often does not correspond to the states of the individual lights. The group reports the lights to be on, but all individual lights report to be off. Checking the real state of the lights (I can only do this occasionally, as I do not live there) reveals that the lights are indeed on. But now I cannot trust the status reports in HA for those lights. And automations cannot react correctly either when the states of the entities are not reported correctly ...

Regarding failing commands: I am missing an option to automatically react to such errors within HA, e.g. to send a notification if a command failed. In the case of a light in the wrong state, everybody can see that the light is still on, but in the case of thermostats the current state of the heating is not so easily observed ...

puddly commented 7 months ago

So what does your suggestion to optimize my environment for Zigbee mean? I only have control over my Network (guess which one it is ;-) ) - the other ones I can not control. In my house there are only 2 other networks, the rest is somewhere in the neighbourhood.

I'm not sure what else there is to suggest 😅. You can either change the Zigbee network channel, your WiFi network channel, or position your Zigbee coordinator away from interference. There is nothing else that can be done, Zigbee can't change channels dynamically to avoid interference, nor is it sensible to retry requests a dozen times. It's possible that in your environment, Zigbee is not the best choice.

All 4 thermostats reported being in the off state. But one thermostat continued to heat. Checking the controls on the thermostat itself revealed that it was still active and set to the previous temperature.

I would gather debug logs for this device, as Zigbee has ACKs and confirmations and we know with certainty if a command succeeds or not. If it fails, an error is raised. If no error is raised but the write does not work, this is either a device firmware bug or a bug with the ZHA HVAC code.

I observe, that the status of the group often does not correspond to the states of the individual lights.

Convert your Zigbee group to a normal HA light group. Zigbee group commands are broadcasts, congest the network, and are only useful in very specific circumstances. 99% of the time a light group is the best choice.

sqrt-1764 commented 7 months ago

I don't know whether this is getting too off-topic and a new issue should be opened.

So what does your suggestion to optimize my environment for Zigbee mean? I only have control over my Network (guess which one it is ;-) ) - the other ones I can not control. In my house there are only 2 other networks, the rest is somewhere in the neighbourhood.

I'm not sure what else there is to suggest 😅. You can either change the Zigbee network channel, your WiFi network channel, or position your Zigbee coordinator away from interference. There is nothing else that can be done, Zigbee can't change channels dynamically to avoid interference, nor is it sensible to retry requests a dozen times. It's possible that in your environment, Zigbee is not the best choice.

The problem then would be that Zigbee would not be usable in urban areas. I don't know how other vendors solve this problem (maybe Zigbee2MQTT behaves better?). I would expect that the result of a command is checked by ZHA. But in the case of my thermostat, the state shown in HA was set to the expected value without the thermostat actually being in that state. Unfortunately this is a sporadic error, so it is hard to reproduce ...

All 4 thermostats reported being in the off state. But one thermostat continued to heat. Checking the controls on the thermostat itself revealed that it was still active and set to the previous temperature.

I already have the best position for the Zigbee coordinator: horizontally in the middle of my apartment, as far away from any neighbour as I can get. The next WiFi access point horizontally is at least 6m away. My own WiFi channel is 1, and the coordinator uses Zigbee channel 26. The distance between my router and the HA Yellow is 1m. The distance from the HA Yellow to the failing thermostat (Aqara E1) was 20 cm.

OK, I checked the network visualization. The thermostat is connected to my Silvercrest switch (which is also located 20 cm away from the HA Yellow) that is connected to the coordinator - one hop. But this is basic Zigbee stuff that should work.

So that error is quite unexpected. The signal strength should have been as good as it could get.

Either the command should have failed - then an error should be logged. Or the command was not executed. In either case the status of the device must not change without a confirmation that the command was executed successfully ...

I would gather debug logs for this device, as Zigbee has ACKs and confirmations and we know with certainty if a command succeeds or not. If it fails, an error is raised. If no error is raised but the write does not work, this is either a device firmware bug or a bug with the ZHA HVAC code.

How do I gather debug logs for only one device?

I already had debug logging enabled for the ZHA integration. Where do I find the output? No errors are reported in the logbook, and nothing shows up in the History. System/logs also shows no errors around the time I tested the thermostats.

I am looking for a log where every action and answer is logged, but I cannot find that at the moment.
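One way to narrow a full debug log down to a single device is to filter it afterwards. The sketch below assumes the complete log is available as a text file (typically `home-assistant.log` in the configuration directory) and that lines about the device contain its IEEE or network address in some textual form. The helper names and the address format are assumptions, not part of ZHA.

```python
from typing import Iterable, Iterator, List


def lines_for_device(lines: Iterable[str], markers: Iterable[str]) -> Iterator[str]:
    """Yield the log lines that mention any of the markers (case-insensitive)."""
    wanted = [marker.lower() for marker in markers if marker]
    for line in lines:
        lowered = line.lower()
        if any(marker in lowered for marker in wanted):
            yield line


def filter_log_file(path: str, markers: Iterable[str]) -> List[str]:
    """Read a log file and return only the lines that mention the device."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(lines_for_device(handle, list(markers)))
```

A call such as `filter_log_file("home-assistant.log", ["00:11:22:33:44:55:66:77"])` would use a placeholder address; the real one is shown on the device page. Multi-line entries such as tracebacks are only partially matched by this line-based filter.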

By the way: I repeatedly disabled debug logging for ZHA, but after some time it is re-enabled without me doing anything there.

And checking the logs, I now see that there is a problem. I currently control my thermostats via scenes. When I disable the thermostats, I select "mode off" for all of them. In the log I now see:

Logger: homeassistant.components.automation.heizung_taglich
Source: components/automation/__init__.py:676
Integration: Automation (documentation, issues)
First occurred: 16:55:08 (4 occurrences)
Last logged: 21:00:42

Error while executing automation automation.heizung_taglich: must contain at least one of temperature, target_temp_high, target_temp_low.

Logger: homeassistant.components.automation.heizung_taglich
Source: helpers/script.py:1783
Integration: Automation (documentation, issues)
First occurred: 16:55:08 (8 occurrences)
Last logged: 21:00:42

Heizung - täglich: If at step 1: Error executing script. Invalid data for call_service at pos 1: must contain at least one of temperature, target_temp_high, target_temp_low.
Heizung - täglich: Error executing script. Invalid data for if at pos 1: must contain at least one of temperature, target_temp_high, target_temp_low.

The problem is that there is no option to enter a target temperature when the state of the thermostat is off. I think that this is a bug.
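The wording of the error reads like a validation failure on the service data rather than a Zigbee problem: a call that sets a target temperature was apparently made without any temperature field. The following is a minimal sketch of that kind of "at least one key" check in plain Python. The function name and payloads are hypothetical, and this is not Home Assistant's actual code.

```python
REQUIRED_ANY = ("temperature", "target_temp_high", "target_temp_low")


def validate_set_temperature(data: dict) -> dict:
    """Reject service data that carries none of the temperature keys."""
    if not any(key in data for key in REQUIRED_ANY):
        raise ValueError(
            "must contain at least one of " + ", ".join(REQUIRED_ANY) + "."
        )
    return data


if __name__ == "__main__":
    # A payload with a temperature passes the check.
    print(validate_set_temperature({"temperature": 21, "hvac_mode": "heat"}))
    # A payload that only switches the mode is rejected by this kind of check.
    try:
        validate_set_temperature({"hvac_mode": "off"})
    except ValueError as err:
        print(err)
```

If that is what happens here, switching the thermostats off through an action that only sets the HVAC mode, instead of a set-temperature action with an empty temperature, might avoid the error.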

But nevertheless, the command is executed with the expected result. I only noticed this error after searching the system log. No feedback in the UI whatsoever ...

I observe that the status of the group often does not correspond to the states of the individual lights.

Convert your Zigbee group to a normal HA light group. Zigbee group commands are broadcasts, congest the network, and are only useful in very specific circumstances. 99% of the time a light group is the best choice.

When creating the Zigbee groups I followed the best practices from the forum posts. So I repeat my question: where do I find a comprehensive guide that addresses such points? At the moment I have to search the internet and then separate the good advice from the outdated advice ... which is not a simple task when you are not a Zigbee expert. ;-)

I am also a software developer and familiar with networks. So I can see that a simple broadcast would be problematic, because a broadcast has no direct feedback on whether a command was executed correctly. When a device misses the broadcast, nobody will detect that. Regarding the congestion of the network by these broadcasts: why should the network be congested? I would expect that one broadcast is sent when the state of the group is to be changed. Without a broadcast there should be more traffic, as every device has to be addressed individually. Or am I missing something?

But I will change to HA groups and test that.

myevit commented 7 months ago

I am observing similar behaviour. ZHA is on channel 25, and my 2.4 GHz WiFi is on channel 1. The Sonos network also sits on channel 1. With the 2023.12 update it feels like it has become more frequent. I used to have disconnects on one Hue motion sensor. Now I have disconnects on a Hue dimmer. I have reset the dimmer and am now watching to see whether it is an isolated incident.

issue-triage-workflows[bot] commented 4 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.