a-d-r-i-a-n-d opened this issue 6 months ago
@frank777777 Can you post debug logs from 2023.12 as well as 2024.1.4?
Popping in here just to say that I've been experiencing the same lags/delays when triggering a Zigbee device (e.g., a ThirdReality outlet acting as a router). This also seems to have started in 2024.1.x. On/off delays of up to one minute.
Other sensors (door, temperature, etc.) appear to be working normally. It seems to affect Zigbee routers specifically.
I also noticed (as shown in the attached screenshot) that the reporting history for the battery-only sensors (not routers) won't render like it normally does. They otherwise work fine, with no delays AFAIK.
@puddly Very good morning — could you please share a link with step-by-step instructions on how to generate the ZHA debug logs? (I presume it's the debug-logging option on the ZHA integration page, and then download... am I right?)
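For anyone else looking: debug logging can be toggled from the ZHA integration's three-dot menu (disabling it again should offer the log for download), or enabled persistently via `configuration.yaml`. A minimal sketch of the logger config — logger names follow the ZHA/zigpy package names; swap `bellows` for your stick's radio library (`zigpy_znp`, `zigpy_deconz`, ...):

```yaml
# configuration.yaml — sketch: persistent ZHA/zigpy debug logging
logger:
  default: info
  logs:
    homeassistant.components.zha: debug
    zigpy: debug
    bellows: debug   # EZSP-based coordinators; use zigpy_znp / zigpy_deconz etc. for others
```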
Correction: it's the reporting history for the routers that won't render like it normally does.
Wondering if the new 2024.1.5 fixes this.
2024.1.5 does not fix this. I am also having the described issue; so far only 2023.12.3 "fixes" it.
Same here again after upgrading from 2023.12.3 to 2024.1.1 and .2. All Zigbee devices respond slowly.
logging:
The async error does not seem to be related to the issue, since it also happens on 2023.12.3 on my side — and 2023.12.3 works like a charm. I will provide my 2024.1.5 debug logs later on. I have to do this when my girlfriend isn't around, since it affects basically every part of our smart home and nothing works flawlessly anymore :(
@thegreatdane6 Totally understand — same here. My wife jokes about it when things stop working, so I can't risk upgrading into errors right now. Thanks, greatdane, for your effort.
Another symptom. Trying to add another Zigbee sensor fails, with a message I don’t typically see.
I reverted to 2023.12.4. Works again, with snappy responses.
Hi all,
I am also affected by the issue. Am currently on 2024.1.5, to which I upgraded last night from 2024.1.2 - without seeing any improvements.
My current state ("the mess") is:
- temp + door/window sensors seem to be working mostly fine (and instantly — tested with a temperature increase / door open)
- remotes work most of the time (as long as they don't trigger Zigbee lights ;))
- 1 of 2 Tuya mmWave sensors is working fine; the other reports constant presence
- light bulbs and power plugs work very seldom, and when they do, with delays of hours rather than seconds
- adding a sensor usually fails or gets stuck in "configuring"
Until early this year the overall state was reliable — one or another sensor dropped from time to time, but apart from that it was solid (including the mmWave sensor that currently has issues). "The mess" did not occur right after the upgrade (from 2023.12.4 to 2024.1.1) but rather gradually (or at least "overnight"), and I've been dealing with it for around 1.5 weeks now, rendering my Zigbee-based lights and plugs basically unusable.
Following the discussion here, I rolled back to 2023.12.1 for a short period for testing purposes. This improved the situation — no more severe delays in switching lights, and no unusual behaviour. However, at least for now, I can't go back to 2023.12.x permanently: since upgrading to 2024.1.x I had made some configuration changes (dashboards, a few additional sensors — just 3 door/window sensors + 2 remotes, same models already active prior to "the mess" — related automations, and another integration), stuff I would need to back up or export somehow first...
I collected the following log files:
a) home-assistant_zha_2024-01-22T19-01-30.973Z.log — log since upgrading to 2024.1.5, containing several remote button presses and at least one (failing) attempt to scan for a new Zigbee device.
b) home-assistant_zha_2024-01-22T19-40-13.664Z.log — log covering the short testing period (a couple of minutes) right after reverting to 2023.12.1.
Hope we can get this sorted out.
Kind regards, Holger
PS: The worst "delaying request (X enqueued). Previously delayed request is now running, delayed by Ys" I found in some older log files created on the 2024.1.2 release was X=109909 (enqueued requests) and Y=16763 (seconds of delay). I still have these files in case they're needed.
Exactly the same
For anyone with Tuya mmWave sensors: please try removing them and unplugging them. Give it some time and see if anything gets better. We are seeing some models of these absolutely hammer meshes and cause significant issues.
Will do so over night and report back.
I've read reports of them flooding the network too. However, I find it a bit hard to believe that the two mmWave sensors on my network should suddenly cause such severe problems basically overnight. I integrated them in Aug/Sept and they had been working stably since then.
Out of curiosity: what's the best way to see how much traffic they are actually causing, especially as they also function as routers? I performed a naive `grep -i` for the NWKs (0x7D29 and 0xcaff — my inner nerd loves that last one, btw :D) of my two sensors across all the log files I collected recently. That shows they account for around 5% of the total log entries. Might not be the most meaningful approach, though.
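The naive grep above can be made a little more robust with a few lines of Python (a hypothetical sketch — the addresses and log path below are just the examples from this thread):

```python
import re

def nwk_share(lines, nwks):
    """Fraction of log lines that mention any of the given NWK addresses.

    NWKs show up in logs in mixed case (0x7D29 vs 0xcaff), so the
    comparison is case-insensitive.
    """
    patterns = [re.compile(re.escape(n), re.IGNORECASE) for n in nwks]
    hits = sum(1 for line in lines if any(p.search(line) for p in patterns))
    return hits / len(lines) if lines else 0.0

# Usage (file name and NWKs are placeholders):
# with open("home-assistant_zha.log") as f:
#     print(f"{nwk_share(f.readlines(), ['0x7D29', '0xCAFF']):.1%}")
```

Like the grep, this only measures log-line share, not airtime, so it's a rough proxy at best.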
As promised the 12+h later update:
- Tuya mmwave sensors are unplugged for about 13h now
- no change in "the mess" — light bulbs and plugs not responding (gave up monitoring them after several minutes); sensors reporting current values with a delay of a few seconds.
Will now perform a reboot of the HAOS system as the mmwave sensors were not shown as unavailable and check a couple of hours later if this had some impact.
Please enable debug logging after the reboot and post the log after your testing.
Performed yet another reboot to enable debug logging early on.
For now (5min after the reboot) all working so far. Tested turning on a bunch of bulbs via remote and HA webinterface. home-assistant_zha_2024-01-23T12-58-40.276Z.log
Will test again in a couple of hours and see if it changed.
Thanks for the work! I presume we found the bad guys... wonder why — they worked before...
You're welcome :) But let's wait and see if this really did the trick.
1.5h after the reboot it's still working stably, with "Previously delayed request is now running, delayed by" between 0.0Xs and 5s, the latter mostly occurring when turning on the lights in a room with 5+ Zigbee bulbs.
However, I personally can't work out why the mmWave sensors did not cause any problems on the 2023.12.x releases as opposed to 2024.1.x — if these are indeed the cause. Is there a more sensible procedure for determining their load on the network than the one I described above?
We had other bugs that impacted devices that behave. The bug(s) caused messages to not be handled correctly. We fixed those issues and this issue became visible.
I do not use an mmwave device. So from my point of view it is not related to them.
I use PIR, Temp/humidity, plugs, bulbs and switches
The afternoon and early-evening tests with the mmWave sensors off were rather reliable. However, in the evening I saw an increase to ~550 enqueued requests, with delays up to 90s.
This then returned to rather instant behaviour, but peaked again after some time. home-assistant_zha_2024-01-23T20-14-16.889Z.zip
So I sort of have the feeling that the mmWave sensors might have amplified another, still-existing problem.
@marc4s Please upload debug logs for the ZHA integration in addition to diagnostics JSON.
Similar issues here. The network is stable initially but starts degrading over time. Not sure if it's the same issue — I've seen multiple ZHA issues opened here.
Multiprotocol is not enabled. No mmwave sensor AFAIK
I think this is the issue:
2024-01-25 21:48:58.965 DEBUG (MainThread) [zigpy.application] Max concurrency (8) reached, delaying request (13761 enqueued)
Funny thing, only 2 minutes later:
2024-01-25 21:50:56.115 DEBUG (MainThread) [zigpy.application] Max concurrency (8) reached, delaying request (13916 enqueued)
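For context on what that message means: zigpy caps the number of in-flight radio requests and queues the rest, so anything that enqueues work faster than the radio can drain it makes the "N enqueued" figure grow without bound. A minimal sketch of that pattern — this is a hypothetical illustration, not zigpy's actual code:

```python
import asyncio

MAX_CONCURRENCY = 8  # matches the "(8)" in the log lines above

async def send(sem, delayed, payload):
    # Sketch of a bounded-concurrency request queue (illustrative only).
    if sem.locked():
        delayed.append(payload)  # zigpy logs "delaying request (N enqueued)" here
    async with sem:              # wait for one of the MAX_CONCURRENCY slots
        await asyncio.sleep(0.05)  # stand-in for the actual radio round-trip

async def main():
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    delayed = []
    await asyncio.gather(*(send(sem, delayed, i) for i in range(20)))
    return len(delayed)  # everything past the first 8 was queued

# asyncio.run(main()) -> 12
```

With a 13,000-deep queue, even fast individual requests translate into minutes of end-to-end delay, which matches the symptoms reported here.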
config_entry-zha
{
"home_assistant": {
"installation_type": "Home Assistant OS",
"version": "2024.1.5",
"dev": false,
"hassio": true,
"virtualenv": false,
"python_version": "3.11.6",
"docker": true,
"arch": "aarch64",
"timezone": "Europe/Amsterdam",
"os_name": "Linux",
"os_version": "6.1.63-haos-raspi",
"supervisor": "2023.12.1",
"host_os": "Home Assistant OS 11.4",
"docker_version": "24.0.7",
"chassis": "embedded",
"run_as_root": true
},
"custom_components": {},
"integration_manifest": {
"domain": "zha",
"name": "Zigbee Home Automation",
"after_dependencies": [
"onboarding",
"usb"
],
"codeowners": [
"@dmulcahey",
"@adminiuga",
"@puddly",
"@TheJulianJES"
],
"config_flow": true,
"dependencies": [
"file_upload"
],
"documentation": "https://www.home-assistant.io/integrations/zha",
"iot_class": "local_polling",
"loggers": [
"aiosqlite",
"bellows",
"crccheck",
"pure_pcapy3",
"zhaquirks",
"zigpy",
"zigpy_deconz",
"zigpy_xbee",
"zigpy_zigate",
"zigpy_znp",
"universal_silabs_flasher"
],
"requirements": [
"bellows==0.37.6",
"pyserial==3.5",
"pyserial-asyncio==0.6",
"zha-quirks==0.0.109",
"zigpy-deconz==0.22.4",
"zigpy==0.60.6",
"zigpy-xbee==0.20.1",
"zigpy-zigate==0.12.0",
"zigpy-znp==0.12.1",
"universal-silabs-flasher==0.0.15",
"pyserial-asyncio-fast==0.11"
],
"usb": [
{
"vid": "10C4",
"pid": "EA60",
"description": "*2652*",
"known_devices": [
"slae.sh cc2652rb stick"
]
},
{
"vid": "1A86",
"pid": "55D4",
"description": "*sonoff*plus*",
"known_devices": [
"sonoff zigbee dongle plus v2"
]
},
{
"vid": "10C4",
"pid": "EA60",
"description": "*sonoff*plus*",
"known_devices": [
"sonoff zigbee dongle plus"
]
},
{
"vid": "10C4",
"pid": "EA60",
"description": "*tubeszb*",
"known_devices": [
"TubesZB Coordinator"
]
},
{
"vid": "1A86",
"pid": "7523",
"description": "*tubeszb*",
"known_devices": [
"TubesZB Coordinator"
]
},
{
"vid": "1A86",
"pid": "7523",
"description": "*zigstar*",
"known_devices": [
"ZigStar Coordinators"
]
},
{
"vid": "1CF1",
"pid": "0030",
"description": "*conbee*",
"known_devices": [
"Conbee II"
]
},
{
"vid": "0403",
"pid": "6015",
"description": "*conbee*",
"known_devices": [
"Conbee III"
]
},
{
"vid": "10C4",
"pid": "8A2A",
"description": "*zigbee*",
"known_devices": [
"Nortek HUSBZB-1"
]
},
{
"vid": "0403",
"pid": "6015",
"description": "*zigate*",
"known_devices": [
"ZiGate+"
]
},
{
"vid": "10C4",
"pid": "EA60",
"description": "*zigate*",
"known_devices": [
"ZiGate"
]
},
{
"vid": "10C4",
"pid": "8B34",
"description": "*bv 2010/10*",
"known_devices": [
"Bitron Video AV2010/10"
]
}
],
"zeroconf": [
{
"type": "_esphomelib._tcp.local.",
"name": "tube*"
},
{
"type": "_zigate-zigbee-gateway._tcp.local.",
"name": "*zigate*"
},
{
"type": "_zigstar_gw._tcp.local.",
"name": "*zigstar*"
},
{
"type": "_uzg-01._tcp.local.",
"name": "uzg-01*"
},
{
"type": "_slzb-06._tcp.local.",
"name": "slzb-06*"
}
],
"is_built_in": true
},
"data": {
"config": {},
"config_entry": {
"entry_id": "cbe1f5d79a7cd3aa470251ee6ea6a0ce",
"version": 4,
"minor_version": 1,
"domain": "zha",
"title": "/dev/ttyAMA1",
"data": {
"device": {
"path": "/dev/ttyAMA1",
"baudrate": 115200,
"flow_control": "software"
},
"radio_type": "ezsp"
},
"options": {
"custom_configuration": {
"zha_options": {
"enhanced_light_transition": true,
"enable_identify_on_join": false,
"default_light_transition": 0,
"light_transitioning_flag": true,
"always_prefer_xy_color_mode": true,
"group_members_assume_state": true,
"consider_unavailable_mains": 7200,
"consider_unavailable_battery": 21600
}
}
},
"pref_disable_new_entities": false,
"pref_disable_polling": false,
"source": "user",
"unique_id": null,
"disabled_by": null
},
"application_state": {
"node_info": {
"nwk": 0,
"ieee": "**REDACTED**",
"logical_type": 0,
"model": "Yellow v1.2",
"manufacturer": "Nabu Casa",
"version": "6.10.3.0 build 297"
},
"network_info": {
"extended_pan_id": "**REDACTED**",
"pan_id": 12339,
"nwk_update_id": 0,
"nwk_manager_id": 0,
"channel": 15,
"channel_mask": 134215680,
"security_level": 5,
"network_key": "**REDACTED**",
"tc_link_key": {
"key": [
90,
105,
103,
66,
101,
101,
65,
108,
108,
105,
97,
110,
99,
101,
48,
57
],
"tx_counter": 0,
"rx_counter": 0,
"seq": 0,
"partner_ieee": "**REDACTED**"
},
"key_table": [],
"children": [],
"nwk_addresses": {},
"stack_specific": {
"ezsp": {
"hashed_tclk": "c79dd5f9adf3f747f4ac7960d3613354"
}
},
"metadata": {
"ezsp": {
"stack_version": 8,
"can_burn_userdata_custom_eui64": true,
"can_rewrite_custom_eui64": false
}
},
"source": "bellows@0.37.6"
},
"counters": {
"ezsp_counters": {
"MAC_RX_BROADCAST": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='MAC_RX_BROADCAST', _raw_value=65535, reset_count=35, _last_reset_value=2228190)"
},
"MAC_TX_BROADCAST": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='MAC_TX_BROADCAST', _raw_value=17462, reset_count=35, _last_reset_value=1382531)"
},
"MAC_RX_UNICAST": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='MAC_RX_UNICAST', _raw_value=19681, reset_count=35, _last_reset_value=1503648)"
},
"MAC_TX_UNICAST_SUCCESS": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='MAC_TX_UNICAST_SUCCESS', _raw_value=8694, reset_count=35, _last_reset_value=584458)"
},
"MAC_TX_UNICAST_RETRY": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='MAC_TX_UNICAST_RETRY', _raw_value=4413, reset_count=35, _last_reset_value=352196)"
},
"MAC_TX_UNICAST_FAILED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='MAC_TX_UNICAST_FAILED', _raw_value=923, reset_count=35, _last_reset_value=77868)"
},
"APS_DATA_RX_BROADCAST": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_DATA_RX_BROADCAST', _raw_value=444, reset_count=35, _last_reset_value=36917)"
},
"APS_DATA_TX_BROADCAST": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_DATA_TX_BROADCAST', _raw_value=389, reset_count=35, _last_reset_value=33434)"
},
"APS_DATA_RX_UNICAST": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_DATA_RX_UNICAST', _raw_value=6361, reset_count=35, _last_reset_value=478426)"
},
"APS_DATA_TX_UNICAST_SUCCESS": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_DATA_TX_UNICAST_SUCCESS', _raw_value=3155, reset_count=35, _last_reset_value=276114)"
},
"APS_DATA_TX_UNICAST_RETRY": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_DATA_TX_UNICAST_RETRY', _raw_value=2307, reset_count=35, _last_reset_value=159654)"
},
"APS_DATA_TX_UNICAST_FAILED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_DATA_TX_UNICAST_FAILED', _raw_value=762, reset_count=35, _last_reset_value=29894)"
},
"ROUTE_DISCOVERY_INITIATED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='ROUTE_DISCOVERY_INITIATED', _raw_value=600, reset_count=35, _last_reset_value=41659)"
},
"NEIGHBOR_ADDED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='NEIGHBOR_ADDED', _raw_value=0, reset_count=35, _last_reset_value=27)"
},
"NEIGHBOR_REMOVED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='NEIGHBOR_REMOVED', _raw_value=0, reset_count=35, _last_reset_value=2)"
},
"NEIGHBOR_STALE": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='NEIGHBOR_STALE', _raw_value=176, reset_count=35, _last_reset_value=11916)"
},
"JOIN_INDICATION": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='JOIN_INDICATION', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"CHILD_REMOVED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='CHILD_REMOVED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"ASH_OVERFLOW_ERROR": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='ASH_OVERFLOW_ERROR', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"ASH_FRAMING_ERROR": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='ASH_FRAMING_ERROR', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"ASH_OVERRUN_ERROR": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='ASH_OVERRUN_ERROR', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"NWK_FRAME_COUNTER_FAILURE": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='NWK_FRAME_COUNTER_FAILURE', _raw_value=240, reset_count=35, _last_reset_value=21038)"
},
"APS_FRAME_COUNTER_FAILURE": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_FRAME_COUNTER_FAILURE', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"UTILITY": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='UTILITY', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"APS_LINK_KEY_NOT_AUTHORIZED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_LINK_KEY_NOT_AUTHORIZED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"NWK_DECRYPTION_FAILURE": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='NWK_DECRYPTION_FAILURE', _raw_value=0, reset_count=35, _last_reset_value=9)"
},
"APS_DECRYPTION_FAILURE": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='APS_DECRYPTION_FAILURE', _raw_value=8, reset_count=35, _last_reset_value=286)"
},
"ALLOCATE_PACKET_BUFFER_FAILURE": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='ALLOCATE_PACKET_BUFFER_FAILURE', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"RELAYED_UNICAST": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='RELAYED_UNICAST', _raw_value=1, reset_count=35, _last_reset_value=168)"
},
"PHY_TO_MAC_QUEUE_LIMIT_REACHED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PHY_TO_MAC_QUEUE_LIMIT_REACHED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"PACKET_VALIDATE_LIBRARY_DROPPED_COUNT": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PACKET_VALIDATE_LIBRARY_DROPPED_COUNT', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"TYPE_NWK_RETRY_OVERFLOW": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='TYPE_NWK_RETRY_OVERFLOW', _raw_value=2015, reset_count=35, _last_reset_value=199696)"
},
"PHY_CCA_FAIL_COUNT": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PHY_CCA_FAIL_COUNT', _raw_value=5955, reset_count=35, _last_reset_value=384090)"
},
"BROADCAST_TABLE_FULL": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='BROADCAST_TABLE_FULL', _raw_value=65535, reset_count=35, _last_reset_value=2228190)"
},
"PTA_LO_PRI_REQUESTED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PTA_LO_PRI_REQUESTED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"PTA_HI_PRI_REQUESTED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PTA_HI_PRI_REQUESTED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"PTA_LO_PRI_DENIED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PTA_LO_PRI_DENIED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"PTA_HI_PRI_DENIED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PTA_HI_PRI_DENIED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"PTA_LO_PRI_TX_ABORTED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PTA_LO_PRI_TX_ABORTED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"PTA_HI_PRI_TX_ABORTED": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='PTA_HI_PRI_TX_ABORTED', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"ADDRESS_CONFLICT_SENT": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='ADDRESS_CONFLICT_SENT', _raw_value=0, reset_count=35, _last_reset_value=0)"
},
"EZSP_FREE_BUFFERS": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='EZSP_FREE_BUFFERS', _raw_value=98, reset_count=35, _last_reset_value=0)"
}
},
"controller_app_counters": {
"watchdog_reset_requests": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='watchdog_reset_requests', _raw_value=6194, reset_count=1, _last_reset_value=1)"
},
"unicast_rx": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='unicast_rx', _raw_value=485040, reset_count=0, _last_reset_value=0)"
},
"broadcast_rx": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='broadcast_rx', _raw_value=37334, reset_count=0, _last_reset_value=0)"
},
"unicast_tx_success": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='unicast_tx_success', _raw_value=279276, reset_count=0, _last_reset_value=0)"
},
"unicast_tx_failure": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='unicast_tx_failure', _raw_value=30656, reset_count=0, _last_reset_value=0)"
},
"broadcast_tx_failure_unexpected": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='broadcast_tx_failure_unexpected', _raw_value=294, reset_count=0, _last_reset_value=0)"
},
"broadcast_tx_success_unexpected": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='broadcast_tx_success_unexpected', _raw_value=33541, reset_count=0, _last_reset_value=0)"
},
"unicast_tx_success_duplicate": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='unicast_tx_success_duplicate', _raw_value=31, reset_count=0, _last_reset_value=0)"
},
"unicast_tx_success_unexpected": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='unicast_tx_success_unexpected', _raw_value=52, reset_count=0, _last_reset_value=0)"
},
"unicast_tx_failure_unexpected": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='unicast_tx_failure_unexpected', _raw_value=14, reset_count=0, _last_reset_value=0)"
},
"nwk_conflicts": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='nwk_conflicts', _raw_value=2, reset_count=0, _last_reset_value=0)"
},
"unicast_tx_failure_duplicate": {
"__type": "<class 'zigpy.state.Counter'>",
"repr": "Counter(name='unicast_tx_failure_duplicate', _raw_value=9, reset_count=0, _last_reset_value=0)"
}
}
},
"broadcast_counters": {},
"device_counters": {},
"group_counters": {}
},
"energy_scan": {
"11": 80.38447947821754,
"12": 88.70042934643088,
"13": 89.93931580241996,
"14": 73.50699819621309,
"15": 97.39286236923465,
"16": 97.97769123383605,
"17": 97.39286236923465,
"18": 98.62178092672917,
"19": 65.26028270288712,
"20": 23.33483723001185,
"21": 31.01324838787301,
"22": 28.30261646762903,
"23": 13.711043742539033,
"24": 15.32285793082191,
"25": 21.09014924761344,
"26": 97.97769123383605
},
"versions": {
"bellows": "0.37.6",
"zigpy": "0.60.6",
"zigpy_deconz": "0.22.4",
"zigpy_xbee": "0.20.1",
"zigpy_znp": "0.12.1",
"zigpy_zigate": "0.12.0",
"zhaquirks": "0.0.109"
}
}
}
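Incidentally, the `energy_scan` block in the diagnostics above is worth a look: channels 15–18 report ~97% utilization, and this network sits on channel 15. The values are percent-busy, so lower is quieter; ranking them takes one line (sketch, with the dict abbreviated from the dump above — note an energy scan can't distinguish Zigbee traffic from WiFi interference, and changing channel may require re-pairing some devices):

```python
# Rank Zigbee channels by energy-scan utilization (percent-busy; lower = quieter).
# Values abbreviated from the diagnostics JSON above.
energy_scan = {"11": 80.4, "15": 97.4, "20": 23.3, "23": 13.7, "25": 21.1}

def quietest(scan, n=3):
    """Return the n channels with the lowest utilization."""
    return sorted(scan, key=scan.get)[:n]

print(quietest(energy_scan))  # -> ['23', '25', '20']
```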
@tjerkw Can you include a startup debug log? Leave it running for about half an hour.
Update: as I run HA in a VM, I reverted to a working snapshot (OS 11.2, Core 2023.12.4). All Zigbee ran like normal, no delays. I let it run a few days and confirmed things were stable, took another snapshot for good measure, and then upgraded to OS 11.4 and Core 2024.1.5. The notable difference is that I did not incrementally update from 2023.12.4 -> 2024.1.1–4 as before, but went straight from 2023.12.4 to 2024.1.5.
Interestingly, I am not encountering any delays this time. It’s been a day since I did this. This isn’t the first time this technique “fixed” a Zigbee problem for me.
Hope this helps someone else.
I currently see the same behaviour as nerdyninny. 2024.1.5 seems to have fixed it when updating directly from 2023.12.4 (though my previous tests also went directly to the latest stable; my last test, from 2023.12.4 to 2024.1.3, did not resolve the issue).
Directly after start there was still a small delay of around 3–4 seconds, but after around 5 minutes there was no delay anymore.
Thanks to @marc4s and @nerdyninny for reporting this — I'll do my upgrade now to test too. If it works, it would be bizarre, as the changelog lists no ZHA or Zigbee-related changes. Let's see.
RAM usage has increased from around 2 GB to around 4.5 GB since then (debug already disabled), but this is the only disadvantage so far.
Rebooted. RAM usage is back at 2 GB — and the delay is also back. Debug log will follow soon.
200 MB was too big for GitHub, so I used WeTransfer.
2024-01-26 09:06:35.538 DEBUG (MainThread) [zigpy.application] Max concurrency (16) reached, delaying request (1978 enqueued)
2024-01-26 09:25:40.793 DEBUG (MainThread) [zigpy.application] Max concurrency (16) reached, delaying request (7796 enqueued)
This is increasing all the time.
Hi all. I too noticed lag and delays in the latest HA versions (I don't remember which was the first). I have Zigbee radiator valves that reported their status regularly, but I'm no longer able to change the target temperature. I re-paired them, but the issue persists. So I tried re-pairing them directly to the coordinator, bypassing the router in between, and at the moment the valves seem to work properly as before. I will continue to check over the next few days, but in the meantime perhaps someone else could test the same change of bypassing the router.
What was it paired to previously? Manufacturer and model please
Radiator valves: Danfoss eTRV0103, paired with an XBee S2C PRO configured as a router; coordinator: USB ConBee II.
Reporting in. ZHA still working since upgrade from 2023.12.4 to 2024.1.5.
@nerdyninny — which device are you using? I am using a Sonoff USB Dongle-P with the pre-installed firmware. I am thinking about updating the firmware.
@nerdyninny Hi, did you reboot? @marc4s had the same results, but after rebooting he got the same delay again.
Some more updates from my side — I'm quite confused about what's going on...
a) The setup was more or less stable last week, with the delay sometimes growing throughout the day.
b) I re-paired 2–3 sensors that had become unresponsive during the week AND added a single new Sonoff SNZB-06P yesterday (30.01.2024, around 2 pm) for testing, as a replacement for the Tuya mmWave sensors. I had stability issues with the sensor (it quickly became unavailable), but left it plugged in anyway. Overnight (after around 3 am) the whole setup went fubar again, leaving me unable to turn the Zigbee bulbs on/off.
-> This is covered in home-assistant_zha_2024-01-31T10-30-40.442Z.zip
Fearing this might be related to the new Sonoff mmWave sensor, I unplugged it (and left it unplugged) and rebooted — without improvement.
Seeing there was an update to 2024.1.6, I tried to upgrade, as the changelog promised a bump of the zigpy dependency. However, the update did not seem to work out: it got stuck starting ZHA and then automatically rolled back to the backup. (That's at least the impression it made on me — I could only follow it during some very short breaks.)
At the moment I am again on 2024.1.5 — I have rebooted twice and am still experiencing very high, constantly increasing delays, according to a simple
grep -i "Previously delayed request is now running, delayed by" home-assistant_zha_2024-01-31T12-41-13.192Z.log
home-assistant_zha_2024-01-31T12-41-13.192Z.zip
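To quantify that growth, both debug messages can be parsed out of such a log. A hedged sketch — the message formats are copied from the log lines quoted in this thread:

```python
import re

# Message formats as they appear in the ZHA/zigpy debug logs quoted above.
DELAY_RE = re.compile(r"delayed by ([\d.]+)s")
QUEUE_RE = re.compile(r"delaying request \((\d+) enqueued\)")

def summarize(lines):
    """Return (max enqueued requests, max delay in seconds) seen in a log."""
    delays = [float(m.group(1)) for m in map(DELAY_RE.search, lines) if m]
    queues = [int(m.group(1)) for m in map(QUEUE_RE.search, lines) if m]
    return (max(queues, default=0), max(delays, default=0.0))

# Usage (path is a placeholder):
# with open("home-assistant_zha.log") as f:
#     print(summarize(f.readlines()))
```

A healthy network should stay near zero on both numbers; figures like the X=109909 / Y=16763s quoted earlier in this thread are clearly pathological.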
Hello,
I'm experiencing the same problem: a very unresponsive Zigbee network for about two weeks — but only in one direction, i.e. when I give instructions, for example to turn lights on/off. Status updates continue to work instantly.
That is: if I turn on a light at the physical switch, I immediately see the status in HA, and motion sensors also report status instantly. But if I want to turn on a light via HA, it takes a few minutes to execute.
I have about 57 zigbee devices, 12 of them are mmWave.
I have a couple of these in my LOG:
2024-01-31 14:21:59.867 DEBUG (MainThread) [zigpy.application] Max concurrency (16) reached, delaying request (419 enqueued)
...
2024-01-31 14:21:59.815 DEBUG (MainThread) [zigpy.application] Previously delayed request is now running, delayed by 15.73s
...
2024-01-31 14:21:59.926 DEBUG (MainThread) [zigpy_znp.zigbee.application] Request failed (Unsuccessful request status code: <Status.MAC_CHANNEL_ACCESS_FAILURE: 225>), retry attempt 4 of 5 (RouteDiscovery|LastGoodRoute)
After some additional research, I realized this has happened in the past: https://github.com/home-assistant/core/issues/78163 (I hope it helps to solve the problem)
MAC_CHANNEL_ACCESS_FAILURE means that the radio is refusing to send requests because you're experiencing extreme amounts of interference. Make sure your coordinator is on a USB extension cable and positioned away from interference sources such as USB 3.0 ports, WiFi APs, routers, SSDs, etc.
I have about 57 zigbee devices, 12 of them are mmWave.
Did they work in the past? If so, what version? Can you downgrade HA Core and create a debug log with that version as well?
We’ve already tested the extension cable .. nothing changed ..
@frank777777 You and @netsoft-ruidias are different people 😄. MAC_CHANNEL_ACCESS_FAILURE is a hard failure reported by the radio firmware. No way around it.
As mentioned above, upload debug logs from both versions! Until there's enough information to track this issue down, nothing will be fixed. I can't buy a dozen Tuya mmWave devices just to test this one issue.
@puddly
MAC_CHANNEL_ACCESS_FAILURE means that the radio is refusing to send requests because you're experiencing extreme amounts of interference. Make sure your coordinator is on a USB extension cable and positioned away from interference sources such as USB 3.0 ports, WiFi APs, routers, SSDs, etc.
I can do this, but be aware that nothing has changed: the coordinator has been in the same place for a year (since day one), and nothing has been added or removed.
Did they work in the past? If so, what version?
Yes, of course — it has always worked until now. I update to the latest version every month; we can say it worked perfectly up to version 2023.12 and stopped working in version 2024.1.
Can you downgrade HA Core and create a debug log with that version as well?
It will be more complicated, but I'll try to do it tonight, I'll keep you posted.
BTW: do you know anything about the issue from 2022 that I referenced here: https://github.com/home-assistant/core/issues/78163?
Thanks
Hello,
I'm experiencing the same problem: a very unresponsive Zigbee network for about two weeks, but only in one direction, that is, when I send commands to turn lights on/off, for example. Status updates continue to work instantly.
i.e.: if I turn on a light at the physical switch, I immediately see the status in HA, and motion sensors also report status instantly. But if I want to turn on the light via HA, it takes a few minutes to execute.
I have about 57 zigbee devices, 12 of them are mmWave.
I have a couple of these in my LOG:
2024-01-31 14:21:59.867 DEBUG (MainThread) [zigpy.application] Max concurrency (16) reached, delaying request (419 enqueued)
...
2024-01-31 14:21:59.815 DEBUG (MainThread) [zigpy.application] Previously delayed request is now running, delayed by 15.73s
...
2024-01-31 14:21:59.926 DEBUG (MainThread) [zigpy_znp.zigbee.application] Request failed (Unsuccessful request status code: <Status.MAC_CHANNEL_ACCESS_FAILURE: 225>), retry attempt 4 of 5 (RouteDiscovery|LastGoodRoute)
After some additional research, I realized this has happened in the past: #78163 (I hope it helps to solve the problem)
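The `Max concurrency (16) reached` lines above come from zigpy throttling outgoing requests. A minimal sketch of that throttling pattern (an `asyncio.Semaphore` illustration, not zigpy's actual code) shows how a burst of commands ends up queued and delayed:

```python
import asyncio
import time

MAX_CONCURRENCY = 16  # the request limit reported in the log above


async def send_request(sem: asyncio.Semaphore) -> float:
    """Wait for a free slot, then simulate one Zigbee request."""
    enqueued = time.monotonic()
    async with sem:
        queued_for = time.monotonic() - enqueued  # time spent "delayed"
        await asyncio.sleep(0.01)                 # pretend the radio round-trip takes 10 ms
        return queued_for


async def main() -> list[float]:
    # 64 commands but only 16 slots: the rest sit in the queue, which is
    # what "Max concurrency (16) reached, delaying request" reports.
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    return await asyncio.gather(*(send_request(sem) for _ in range(64)))


delays = asyncio.run(main())
print(f"max queueing delay: {max(delays) * 1000:.0f} ms")
```

In this issue the queue reportedly grew to 419 enqueued requests, so individual commands sat in that queue for many seconds (15.73 s in the log) before they even reached the radio.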
I confirm that the issue is only in one direction.... did you test if bypassing a router there is some improvement?
My ZHA is in failed state again (all Zigbee related devices disabled).. I found the initial log lines causing it.. Only a restart helps:
2024-01-31 19:43:57.130 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2024-01-31 19:43:57.251 WARNING (MainThread) [homeassistant.helpers.dispatcher] Unable to remove unknown dispatcher <bound method GroupProbe._reprobe_group of <homeassistant.components.zha.core.discovery.GroupProbe object at 0x7f92abb510>>
(the "Unable to remove unknown dispatcher" warning above repeats 16 more times)
2024-01-31 19:44:01.035 ERROR (MainThread) [zigpy.zcl] [0xCB6A:1:0x0b04] Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/zigpy/zcl/__init__.py", line 411, in reply
return await self._endpoint.reply(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/zigpy/endpoint.py", line 278, in reply
return await self.device.reply(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/zigpy/device.py", line 484, in reply
return await self.request(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/zigpy/device.py", line 318, in request
await send_request()
File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 833, in request
await self.send_packet(
File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 868, in send_packet
status, _ = await self._ezsp.sendUnicast(
^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'sendUnicast'
And then the `'NoneType' object has no attribute 'sendUnicast'` error repeats forever.
What does "NCP entered failed state." mean?
I'm using a Home Assistant Yellow with multiprotocol disabled, on HA Core 2024.1.5.
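The `AttributeError: 'NoneType' object has no attribute 'sendUnicast'` traceback is consistent with a teardown pattern where the radio driver reference is set to `None` once the NCP fails, while queued requests still try to use it. A hypothetical minimal sketch of that pattern (not bellows' actual code; `Ezsp`, `ncp_failed`, and `send_packet` here are illustrative names):

```python
import asyncio


class Ezsp:
    """Stand-in for the radio driver object."""

    async def sendUnicast(self, packet: bytes) -> str:
        return "ok"


class Application:
    """Sketch of the failure pattern seen in the traceback above."""

    def __init__(self) -> None:
        self._ezsp: Ezsp | None = Ezsp()

    def ncp_failed(self) -> None:
        # On "NCP entered failed state" the connection is torn down
        # and the driver reference is dropped...
        self._ezsp = None

    async def send_packet(self, packet: bytes) -> str:
        # ...but without a guard, every request still in flight would hit
        # AttributeError: 'NoneType' object has no attribute 'sendUnicast'.
        if self._ezsp is None:
            raise RuntimeError("radio is down; reload the ZHA integration")
        return await self._ezsp.sendUnicast(packet)


app = Application()
print(asyncio.run(app.send_packet(b"\x00")))  # prints "ok" while connected
app.ncp_failed()
```

Until the failed state is recovered (here, only by a restart), every queued request keeps hitting the dead reference, which matches the error repeating "forever".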
@frank777777 You and @netsoft-ruidias are different people 😄. `MAC_CHANNEL_ACCESS_FAILURE` is a hard failure reported by the radio firmware. No way around it. As mentioned above, upload debug logs from both versions! Until there's enough information to track this issue down, nothing will be fixed. I can't buy a dozen Tuya mmWave devices just to test this one issue.
Hey, I have uploaded logs on this a few times (I think), so can you send along a link or something that tells me exactly what you might need? I think I reported this around a month ago. I have mmWave devices, but the issue was there whether they were paired or removed. I have the Sky dongle, USB cable, etc., and things work PERFECTLY on versions from back in Dec. I'll send you a couple of mmWave devices if that helps too :P hahaha. But seriously, the logs on HA are so absurdly verbose... what do I do to help hone in on this ZHA issue for us?
I see the same type of issue. I updated at the beginning of the month and retried with the current latest version and a new machine. In both cases it seems to work fine at first, then the delays start and then the controller goes offline. Reloading the stick fixes the issue for some minutes, and then it starts all over again. I have some logs if they would help.
home-assistant_zha_2024-02-01T00-53-42.272Z.log
Downgrading to 2023.12.3 resolves the issue.
Hi all. I too noticed lag and delays in the latest HA versions (I don't remember which one was the first). I have Zigbee radiator valves; they reported their status regularly, but I'm no longer able to change the target temperature. I re-paired them but the issue persists. So I tried re-pairing them directly to the coordinator, bypassing the router in between, and at the moment the valves seem to work properly, as before. I will continue to check in the next few days, but in the meantime perhaps someone else could test the same change, bypassing the router.
Hi all. In my case, I confirm the issue is present only for end devices served by routers, not those connected to the coordinator directly. Communication with them works in only one direction: from the end devices. The "uplink" direction doesn't work, failing with a timeout error: "Failed to send request: device did not respond". I also confirm that this problem was not present in December. Something seems to have changed in the internal data handling.
@puddly
Yesterday, to debug the interference hypothesis, I ran two experiments: I moved the coordinator to a different location and I changed the Zigbee channel from 15 (auto) to 25.
After a few minutes the network began to "heal" itself.
It worked fine for a couple of hours before gradually becoming congested and unresponsive again. This morning it is as it was yesterday, not responding to commands (status updates continue to work, apparently).
This is my log from today: home-assistant_zha_2024-02-01T11-39-46.127Z.log
Please note the number of `Max concurrency (16) reached, delaying request (xxxx enqueued)` entries...
Where can we see which requests are enqueued? Is there a way to reset this?
Any help would be greatly appreciated, thank you
BTW: today I also downgraded HA to v2023.12.3. For now everything seems to be OK and responsive, however we need to wait a few hours to be sure (the system degrades over time) As soon as I have more information, I will let you know.
Status Update: It's been seven hours since I went back to version 2023.12 and everything seems to work spectacularly well, without any service degradation.
I believe once again it has been proven that the problem is in version 2024.01.
cc: @dmulcahey, @Adminiuga, @puddly, @TheJulianJES, mind taking a look at this issue as you are listed as a code owner? Thanks!
The problem
Since I upgraded to 2024.1.0, my Zigbee devices (lights, switches, fans, etc.) have had a delay of a few seconds to minutes in responding to commands. I have tried restoring previous versions but the issue still persists, no matter which version I restore to. I have also tried restarting in safe mode, but it didn't help. Any help is much appreciated!
What version of Home Assistant Core has the issue?
core-2024.1.0
What was the last working version of Home Assistant Core?
core-2023.12.3
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ZHA
Link to integration documentation on our website
https://www.home-assistant.io/integrations/zha/
Diagnostics information
config_entry-zha-1122b1901b65684e9e9a05c89bd2a54d.json.txt
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
System Information
Home Assistant Community Store
GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 4955
Installed Version | 1.33.0
Stage | running
Available Repositories | 1371
Downloaded Repositories | 7
HACS Data | ok

Home Assistant Cloud

logged_in | true
-- | --
subscription_expiration | 22 January 2024 at 00:00
relayer_connected | true
relayer_region | eu-central-1
remote_enabled | true
remote_connected | true
alexa_enabled | true
google_enabled | false
remote_server | eu-central-1-9.ui.nabu.casa
certificate_status | ready
instance_id | 8bf80b83acdc431081381385ed2692e2
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor

host_os | Home Assistant OS 11.2
-- | --
update_channel | stable
supervisor_version | supervisor-2023.12.1
agent_version | 1.6.0
docker_version | 24.0.7
disk_total | 58.0 GB
disk_used | 12.2 GB
healthy | true
supported | true
board | generic-x86-64
supervisor_api | ok
version_api | ok
installed_addons | EMQX (0.3.0), Samba share (12.2.0), Studio Code Server (5.14.2), openWakeWord (1.8.2), Advanced SSH & Web Terminal (17.0.1)

Dashboards

dashboards | 2
-- | --
resources | 1
views | 12
mode | storage

MercedesME 2020

websocket_connection_state | connected
-- | --
api_endpoint_reachable | ok

Recorder

oldest_recorder_run | 21 December 2023 at 17:55
-- | --
current_recorder_run | 4 January 2024 at 23:22
estimated_db_size | 308.77 MiB
database_engine | sqlite
database_version | 3.41.2