Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0
11.7k stars 1.64k forks

Moved from HA-Add on to Docker compose. Problems started with zigbee2mqtt #21906

Open AussiSG opened 5 months ago

AussiSG commented 5 months ago

What happened?

I moved from the HA add-on to a Docker Compose environment to slim down the HA build and depend on it less. Since the move, the Zigbee network is interrupted every now and then, or I lose the connection to devices.

What did you expect to happen?

I hoped to get a stable, independent network outside of HA.

How to reproduce it (minimal and precise)

The problem occurs almost every day.

Zigbee2MQTT version

1.36.0 commit: 86ed71c

Adapter firmware version

20190608

Adapter

CC2531

Setup

Proxmox and docker compose.

Debug log

This is what happened just now: zigbee2mqtt seems to have lost the connection to the Zigbee USB controller. I have also had cases where only some of the Zigbee devices could be pinged.

Zigbee2MQTT:warn 2024-03-22 10:02:32: Failed to ping 'WCD Schuurverlichting' (attempt 1/2, Read 0x0c4314fffe20ca51/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
Zigbee2MQTT:warn 2024-03-22 10:02:41: Failed to ping 'WCD Schuurverlichting' (attempt 2/2, Read 0x0c4314fffe20ca51/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
Zigbee2MQTT:info 2024-03-22 10:02:44: MQTT publish: topic 'zigbee2mqtt/WCD Schuurverlichting/availability', payload '{"state":"offline"}'
Error: SRSP - AF - dataRequestExt after 6000ms
    at Object.start (/app/node_modules/zigbee-herdsman/src/utils/waitress.ts:63:23)
    at /app/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:312:45
    at Queue.execute (/app/node_modules/zigbee-herdsman/src/utils/queue.ts:35:26)
    at Znp.request (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:300:27)
    at ZStackAdapter.dataRequestExtended (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/zStackAdapter.ts:997:24)
    at /app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/zStackAdapter.ts:557:24
    at Queue.execute (/app/node_modules/zigbee-herdsman/src/utils/queue.ts:35:26)
    at ZStackAdapter.sendZclFrameToAll (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/zStackAdapter.ts:555:27)
    at GreenPower.permitJoin (/app/node_modules/zigbee-herdsman/src/controller/greenPower.ts:237:32)
    at Controller.permitJoinInternal (/app/node_modules/zigbee-herdsman/src/controller/controller.ts:275:35)
Using '/app/data' as data directory
Zigbee2MQTT:info 2024-03-22 10:03:43: Logging to console and directory: '/app/data/log/2024-03-22.10-03-42' filename: log.txt
Zigbee2MQTT:info 2024-03-22 10:03:43: Starting Zigbee2MQTT version 1.36.0 (commit #86ed71c)
Zigbee2MQTT:info 2024-03-22 10:03:43: Starting zigbee-herdsman (0.35.1)
Zigbee2MQTT:error 2024-03-22 10:04:24: Error while starting zigbee-herdsman
Zigbee2MQTT:error 2024-03-22 10:04:24: Failed to start zigbee
Zigbee2MQTT:error 2024-03-22 10:04:24: Check https://www.zigbee2mqtt.io/guide/installation/20_zigbee2mqtt-fails-to-start.html for possible solutions
Zigbee2MQTT:error 2024-03-22 10:04:24: Exiting...
Zigbee2MQTT:error 2024-03-22 10:04:24: Error: Failed to connect to the adapter (Error: SRSP - SYS - ping after 6000ms)
    at ZStackAdapter.start (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/zStackAdapter.ts:103:27)
    at Controller.start (/app/node_modules/zigbee-herdsman/src/controller/controller.ts:132:29)
    at Zigbee.start (/app/lib/zigbee.ts:62:27)
    at Controller.start (/app/lib/controller.ts:109:27)
    at start (/app/index.js:107:5)
Using '/app/data' as data directory

Info from the Docker host:

root@docker:~# ls -l /dev/serial
total 0
drwxr-xr-x 2 nobody nogroup 60 Mar 21 05:16 by-id

root@docker:~# ls -l /dev/tty*
crw-rw-rw- 1 nobody nogroup 5, 0 Mar 20 11:07 /dev/tty
crw------- 1 root tty 136, 1 Mar 22 2024 /dev/tty1
crw--w---- 1 root tty 136, 2 Mar 22 09:21 /dev/tty2
crw-rw---- 1 root dialout 166, 0 Mar 22 09:18 /dev/ttyACM0
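
The listing above shows /dev/ttyACM0 owned by root:dialout with mode crw-rw----, so only root and members of dialout can open it. The nobody/nogroup ownership on other entries can appear when IDs are not mapped into a container's user namespace, though whether that applies to this Proxmox setup is an assumption. A minimal Python sketch that could be run as the same user as Zigbee2MQTT to confirm the adapter is actually openable (the function name `describe_device` is hypothetical, the device path is taken from the listing):

```python
import os
import stat


def describe_device(path: str) -> dict:
    """Report whether `path` is a character device the current user can open."""
    info = {"path": path, "exists": os.path.exists(path)}
    if not info["exists"]:
        return info
    st = os.stat(path)
    info["is_char_device"] = stat.S_ISCHR(st.st_mode)
    info["mode"] = stat.filemode(st.st_mode)  # e.g. 'crw-rw----'
    info["uid"] = st.st_uid
    info["gid"] = st.st_gid
    # True only if the current user may both read and write the device.
    info["read_write"] = os.access(path, os.R_OK | os.W_OK)
    return info


if __name__ == "__main__":
    # Path taken from the `ls -l /dev/tty*` output in this thread.
    print(describe_device("/dev/ttyACM0"))
```

If `exists` is false or `read_write` is false inside the container, the device mapping or group membership is worth checking before blaming the adapter itself.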

shaohme commented 5 months ago

Looks to me like your system hit the same error as mine, #21904.

AussiSG commented 5 months ago

I think so... currently it seems very unstable.

I just lost my devices and the logs show the following:

warning 2024-03-22 19:52:12 Failed to ping 'Lamp TV' (attempt 1/1, Read 0x842e14fffe0d9e2f/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))
warning 2024-03-22 19:52:18 Failed to ping 'WCD Schuurverlichting' (attempt 1/1, Read 0x0c4314fffe20ca51/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'MAC no ack' (233)))
warning 2024-03-22 19:52:33 Failed to ping 'WCD TV Audio' (attempt 1/1, Read 0xa4c13800e643aedc/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Timeout - 28413 - 1 - 6 - 0 - 1 after 10000ms))
warning 2024-03-22 19:52:38 Failed to ping 'WCD Wasmachine' (attempt 1/1, Read 0xa4c138e52efa51d6/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))

shaohme commented 5 months ago

From what I can tell, those warnings could show up because devices are simply turned off.

chris-1243 commented 5 months ago

Error: Failed to connect to the adapter (Error: SRSP - SYS - ping after 6000ms)

Your adapter crashed. You may find this line several times in your log.
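
To see how often the adapter stops answering, as opposed to individual devices being unreachable, the log can be tallied. The following Python sketch counts SRSP timeouts (the serial request to the adapter got no response) and ping failures per device. The regular expressions are derived only from the log excerpts in this thread, and the name `summarize` is hypothetical:

```python
import re
import sys
from collections import Counter

# Patterns derived from the log excerpts in this thread (an assumption
# about the general log format, not an official specification).
ADAPTER_TIMEOUT = re.compile(r"SRSP - \w+ - \w+ after \d+ms")
PING_FAILURE = re.compile(r"Failed to ping '([^']+)'")


def summarize(lines):
    """Count adapter (SRSP) timeouts and ping failures per device."""
    srsp = Counter()
    pings = Counter()
    for line in lines:
        for match in ADAPTER_TIMEOUT.findall(line):
            srsp[match] += 1
        device = PING_FAILURE.search(line)
        if device:
            pings[device.group(1)] += 1
    return srsp, pings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_log.py /path/to/log.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        srsp_counts, ping_counts = summarize(handle)
    print("Adapter timeouts:", dict(srsp_counts))
    print("Ping failures:", dict(ping_counts))
```

Many SRSP entries point at the coordinator or its USB link, while failures such as 'No network route' or 'MAC no ack' without SRSP timeouts are more consistent with single devices being unreachable.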

You might consider the advice given in the link below:

https://www.zigbee2mqtt.io/guide/faq/#i-read-that-zigbee2mqtt-has-a-limit-of-20-devices-when-using-a-cc2530-cc2531-adapter-is-this-true

AussiSG commented 5 months ago

I only have 17 devices connected, and many of them are AC-powered devices that serve as routers, so I am not sure that is the problem. I do see that I am running older firmware on the CC2531.

The problems really started when moving from the HA add-on to Docker (Compose).

AussiSG commented 5 months ago

And again overnight my Zigbee network died, and the logs are flooded with these messages :(

Zigbee2MQTT:warn  2024-03-23 07:38:36: Failed to ping 'WCD Kerstboom' (attempt 1/1, Read 0x0c4314fffe10b5b1/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
Zigbee2MQTT:warn  2024-03-23 07:38:56: Failed to ping 'Lamp Bank' (attempt 1/1, Read 0x588e81fffe56e592/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
Zigbee2MQTT:warn  2024-03-23 07:39:33: Failed to ping 'WCD Vaatwasser' (attempt 1/1, Read 0xa4c138237a16949e/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
Zigbee2MQTT:warn  2024-03-23 07:42:33: Failed to ping 'WCD TV Audio' (attempt 1/1, Read 0xa4c13800e643aedc/1 genBasic(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))

chris-1243 commented 5 months ago

Once again, your adapter is crashing. Device availability checks might be heavy work for a CC2531 coordinator. Increase the timeout (30 or 60 minutes).

https://www.zigbee2mqtt.io/guide/configuration/device-availability.html

My entire installation runs on Docker without issues.

Check the USB autosuspend feature as well.
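
One way to check autosuspend on the Docker host is to read the kernel-wide default delay that the usbcore module exposes under /sys. The Python sketch below does that. The function names are hypothetical, and it only reads the global default, so per-device settings under /sys/bus/usb/devices/ may still differ:

```python
from pathlib import Path

# Kernel-wide default autosuspend delay in seconds; a negative value
# (typically -1) means USB autosuspend is disabled by default.
USBCORE_AUTOSUSPEND = Path("/sys/module/usbcore/parameters/autosuspend")


def read_autosuspend(path: Path = USBCORE_AUTOSUSPEND):
    """Return the autosuspend delay as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def autosuspend_enabled(value) -> bool:
    """A negative delay means autosuspend is disabled."""
    return value is not None and value >= 0


if __name__ == "__main__":
    delay = read_autosuspend()
    print("autosuspend delay:", delay, "enabled:", autosuspend_enabled(delay))
```

If this reports autosuspend as enabled, the adapter may be powered down when the host considers it idle, which could explain an adapter that stops responding after some hours.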


The CC2530/CC2531 is considered legacy hardware and runs into memory corruption easily.

AussiSG commented 5 months ago

Alright alright 😁 I have ordered a more compatible version.

I still find it strange that this behaviour began when moving from the HA add-on to Docker.

But I will report back after moving over to a Zig-a-zig-ah stick.

chris-1243 commented 5 months ago

You may want to consult this thread for the firmware update:

https://github.com/Koenkk/Z-Stack-firmware/discussions/496

The latest stable firmware (20230507) is unfortunately known to have some issues.

I hope you will have a more robust network.