Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

all my powered devices (routers) down after HA update #24387

Open silkyclouds opened 3 hours ago

silkyclouds commented 3 hours ago

What happened?

Hi,

I noticed some slowness when using my Zigbee devices this morning and took the opportunity to update HA to the latest version, knowing it would need a restart; sometimes a restart fixes such things.

I now run HA version 2024.10.2 and Z2M version 1.40.2-1. My controller is a Sonoff Zigbee dongle (the P model), and it is up to date as well, with the latest firmware from @Koenkk flashed on it.

The weird part: I can see all my "non-router" devices listed, but all the non-battery-powered ones are gone. Here is what I see in the Z2M logs at startup:

[08:10:24] INFO: Preparing to start...
[08:10:25] INFO: Socat not enabled
[08:10:28] INFO: Starting Zigbee2MQTT...
Starting Zigbee2MQTT without watchdog.
[2024-10-18 08:10:44] error: zh:zstack:znp: Failed to determine if path is valid: 'Error: spawn udevadm ENOENT'
[2024-10-18 08:12:12] error: z2m: Failed to call 'Groups' 'start' (AssertionError [ERR_ASSERTION]: GroupID must be at least 1
    at Function.create (/app/node_modules/zigbee-herdsman/src/controller/model/group.ts:129:15)
    at Controller.createGroup (/app/node_modules/zigbee-herdsman/src/controller/controller.ts:505:22)
    at Zigbee.createGroup (/app/lib/zigbee.ts:409:23)
    at Groups.syncGroupsWithSettings (/app/lib/extension/groups.ts:81:118)
    at Groups.start (/app/lib/extension/groups.ts:52:20)
    at Controller.callExtensions (/app/lib/controller.ts:399:42)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Controller.start (/app/lib/controller.ts:218:9)
    at start (/app/index.js:154:5))

The logs then keep looping on ping failure messages. Restarting Z2M produces the same error, and I can't get my devices back up.

It looks like udevadm can't be found anymore. Did something change in the latest HASS?

zh:zstack:znp: Failed to determine if path is valid: 'Error: spawn udevadm ENOENT'

After that, there is this group ID problem, which I'm unsure how to fix:

Failed to call 'Groups' 'start' (AssertionError [ERR_ASSERTION]: GroupID must be at least 1
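
The stack trace shows the assertion being raised while groups from the settings are synced (`Groups.syncGroupsWithSettings` calling `createGroup`), so one thing worth checking is whether any configured group has an ID below 1. A minimal sketch of such a check, assuming the `groups` section of the configuration is a mapping keyed by group ID; the helper name and the example data are hypothetical and not taken from this setup:

```python
def find_invalid_group_ids(groups):
    """Return the keys of `groups` that are not integers >= 1."""
    invalid = []
    for key in groups:
        try:
            group_id = int(key)
        except (TypeError, ValueError):
            # Key cannot be read as an integer at all.
            invalid.append(key)
            continue
        if group_id < 1:
            invalid.append(key)
    return invalid


# Hypothetical example data, NOT the reporter's configuration.
example_groups = {
    "0": {"friendly_name": "all_lights"},
    "1": {"friendly_name": "living_room"},
    "kitchen": {"friendly_name": "kitchen"},
}

print(find_invalid_group_ids(example_groups))
```

Any key this reports would be a candidate for the "GroupID must be at least 1" failure, although the actual cause here may lie elsewhere.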

here is how it looks in the GUI :

image

As you might imagine, this is really, really problematic. Does anyone else encounter the same issue? Any clue about that error message?

What did you expect to happen?

My line-powered / router devices should show up!

How to reproduce it (minimal and precise)

No idea, it started this morning after the latest HA update.

Zigbee2MQTT version

1.40.2-1

Adapter firmware version

CC1352P2_CC2652P_launchpad_coordinator_20240710

Adapter

Sonoff dongle (P model)

Setup

HASS in a VM with the coordinator passed through, Z2M as an add-on.

Debug log

[08:24:14] INFO: Preparing to start...
[08:24:14] INFO: Socat not enabled
[08:24:15] INFO: Starting Zigbee2MQTT...
Starting Zigbee2MQTT without watchdog.
[2024-10-18 08:24:18] error: zh:zstack:znp: Failed to determine if path is valid: 'Error: spawn udevadm ENOENT'
[2024-10-18 08:24:18] error: z2m: Failed to set permit join to false (--> 'SREQ: AF - dataRequestExt - {"dstaddrmode":2,"dstaddr":"0x000000000000fffd","destendpoint":242,"dstpanid":0,"srcendpoint":242,"clusterid":33,"transid":1,"options":0,"radius":30,"len":6,"data":{"type":"Buffer","data":[25,2,2,10,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)'))
[2024-10-18 08:24:19] error: z2m: Failed to call 'Groups' 'start' (AssertionError [ERR_ASSERTION]: GroupID must be at least 1
    at Function.create (/app/node_modules/zigbee-herdsman/src/controller/model/group.ts:129:15)
    at Controller.createGroup (/app/node_modules/zigbee-herdsman/src/controller/controller.ts:505:22)
    at Zigbee.createGroup (/app/lib/zigbee.ts:409:23)
    at Groups.syncGroupsWithSettings (/app/lib/extension/groups.ts:81:118)
    at Groups.start (/app/lib/extension/groups.ts:52:20)
    at Controller.callExtensions (/app/lib/controller.ts:399:42)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Controller.start (/app/lib/controller.ts:218:9)
    at start (/app/index.js:154:5))
[2024-10-18 08:24:25] warning: z2m: Failed to ping 'Chambre Raph et Cel - luminaire - tradfri led light' (attempt 1/1, ZCL command 0x90fd9ffffed4466f/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (--> 'SREQ: AF - dataRequest - {"dstaddr":51290,"destendpoint":1,"srcendpoint":1,"clusterid":0,"transid":9,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[16,10,0,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')))
[2024-10-18 08:24:27] warning: z2m: Failed to ping 'Chambre d'amis - luminaire (led centre) - tradfri led light' (attempt 1/1, ZCL command 0x90fd9ffffe175760/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (--> 'SREQ: AF - dataRequest - {"dstaddr":47180,"destendpoint":1,"srcendpoint":1,"clusterid":0,"transid":10,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[16,11,0,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')))
[2024-10-18 08:24:29] warning: z2m: Failed to ping 'Chambre d'amis - luminaire (led gauche) - tradfri led light' (attempt 1/1, ZCL command 0x000b57fffeb89236/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (--> 'SREQ: AF - dataRequest - {"dstaddr":25054,"destendpoint":1,"srcendpoint":1,"clusterid":0,"transid":11,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[16,12,0,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')))
[2024-10-18 08:24:31] warning: z2m: Failed to ping 'Salon - étagère de droite - tradfri led light' (attempt 1/1, ZCL command 0x040d84fffed2e6cd/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (--> 'SREQ: AF - dataRequest - {"dstaddr":56423,"destendpoint":1,"srcendpoint":1,"clusterid":0,"transid":12,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[16,13,0,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')))
[2024-10-18 08:24:33] warning: z2m: Failed to ping 'Salon - barre son / etc. - prise tuya' (attempt 1/1, ZCL command 0xa4c13801c3d05d22/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (--> 'SREQ: AF - dataRequest - {"dstaddr":17038,"destendpoint":1,"srcendpoint":1,"clusterid":0,"transid":13,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[16,14,0,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')))
[2024-10-18 08:24:35] warning: z2m: Failed to ping 'Salon - étagère de gauche - tradfri led light' (attempt 1/1, ZCL command 0x84b4dbfffe770a09/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (--> 'SREQ: AF - dataRequest - {"dstaddr":1987,"destendpoint":1,"srcendpoint":1,"clusterid":0,"transid":14,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[16,15,0,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')))
[2024-10-18 08:24:37]

silkyclouds commented 3 hours ago

Once again, I could not love Proxmox more:

image

I will let you know if things are working again after restoring my nightly backup...

But I think there is something wrong with the latest HASS / Z2M combo.

silkyclouds commented 3 hours ago

I've rolled back to the previous version of HA: same issue. Do you think my coordinator died? Is that possible?

I've checked the Z2M changelog: it was last updated 2 weeks ago, meaning it worked without any issue for approximately 2 weeks and then suddenly stopped working.

As a reminder, this morning I COULD control my bulbs and other devices, but it was crazy "slow", and after a reboot they were all gone.

Since Z2M does start, I believe it can communicate with the coordinator, though. My home is KAPUT, please help me :D

silkyclouds commented 1 hour ago

A little more investigation shows that every single message I can see in the log indicates my coordinator's buffer is full:

[2024-10-18 10:58:58] warning: z2m: Failed to ping 'Cave - prise seche linge - prise tuya' (attempt 1/1, ZCL command 0xa4c138001c0367c7/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (--> 'SREQ: AF - dataRequest - {"dstaddr":39604,"destendpoint":1,"srcendpoint":1,"clusterid":0,"transid":147,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[16,148,0,0,0]}}' failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')))
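
To check whether BUFFER_FULL really accounts for every failure, the status names can be tallied across the whole log. A rough sketch, assuming the log has been saved as text and that failures always follow the `failed with status '(0x..: NAME)'` pattern seen above; the function name is hypothetical, and the sample lines are abbreviated from the entries in this issue:

```python
import re
from collections import Counter

# Matches the reported status only; the "(expected '(0x00: SUCCESS)')"
# part is not preceded by "failed with status" and is therefore skipped.
STATUS_RE = re.compile(r"failed with status '\((0x[0-9a-fA-F]+): ([A-Z_]+)\)'")


def count_statuses(log_text):
    """Count how often each failure status name appears in the log text."""
    return Counter(name for _code, name in STATUS_RE.findall(log_text))


# Abbreviated sample lines (payloads omitted) based on the log in this issue.
sample = (
    "[2024-10-18 08:24:18] error: z2m: Failed to set permit join to false "
    "failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')\n"
    "[2024-10-18 10:58:58] warning: z2m: Failed to ping "
    "failed with status '(0x11: BUFFER_FULL)' (expected '(0x00: SUCCESS)')\n"
)

print(count_statuses(sample))
```

If a status other than BUFFER_FULL shows up in the tally, the full buffer may not be the whole story.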

silkyclouds commented 40 minutes ago

OK, it turned out that the BUFFER_FULL was due to interference.

Here is my full story: https://community.home-assistant.io/t/troubleshooting-zigbee2mqtt-issues-with-tp-link-deco-x50-routers-my-experience/783892