Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Z2M seems to be losing routes/bind to router/coordinator. #24401

Open gcs8 opened 1 week ago

gcs8 commented 1 week ago

What happened?

My Z2M was running fine until maybe a month ago. I tried all kinds of things to fix it with no luck, so I burnt my Z2M config to the ground: purged it from Home Assistant, purged its data, its config, everything I could find. I started from scratch, put the SLZB-06 by SMLIGHT back to v2.3.6 / zigbee: 20221226 (where it's not flagged as dev firmware), and re-paired all the devices. The map worked great for a couple of days, then things stopped working: devices could no longer be paired, the device bindings are empty or odd, and the map no longer shows the routers talking to anything other than the coordinator.

What did you expect to happen?

routes to not break in a mesh

How to reproduce it (minimal and precise)

implement z2m using a SMLIGHT SLZB-06?

Zigbee2MQTT version

1.40.2 commit: unknown

Adapter firmware version

20221226

Adapter

SMLIGHT SLZB-06

Setup

Home Assistant OS in an ESXi VM

Debug log

log.zip

habitats-tech commented 1 week ago

I would suggest you look into repositioning either the coordinator or the router. Based on my experience the SLZB-06 antenna is not as efficient as SLZB-06M or SLZB-06P7. Try to have the antenna at an angle to the SLZB-06 or change its location away from metallic objects.

gcs8 commented 6 days ago

@habitats-tech 3 router class devices and 6 EndDevice class sensors are in the same room as it and are having issues, so, I don't think that's it.

gcs8 commented 6 days ago

it looks like there are two errors that are consistent.

1.> z2m: Failed to execute LQI for

2.> when restarting z2m

[2024-10-19 09:50:47] info: z2m: Disconnecting from MQTT server
[2024-10-19 09:50:47] info: z2m: Stopping zigbee-herdsman...
[2024-10-19 09:50:58] info: zh:controller: Wrote coordinator backup to '/config/zigbee2mqtt/coordinator_backup.json'
[2024-10-19 09:50:58] info: zh:zstack:znp: closing
[2024-10-19 09:50:58] info: z2m: Stopped zigbee-herdsman
[2024-10-19 09:50:58] info: z2m: Stopped Zigbee2MQTT

/app/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:264
var er = new ERR_STREAM_WRITE_AFTER_END();
^
Error: write after end
at writeAfterEnd (/app/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:264:12)
at DerivedLogger.Writable.write (/app/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:300:21)
at DerivedLogger.log (/app/node_modules/winston/lib/winston/logger.js:231:12)
at Logger.log (/app/lib/util/logger.ts:198:25)
at Logger.info (/app/lib/util/logger.ts:211:14)
at Znp.onPortClose (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:96:16)
at Object.onceWrapper (node:events:632:26)
at Socket.emit (node:events:529:35)
at TCP. (node:net:350:12)
[09:50:58] INFO: Preparing to start...
[09:50:59] INFO: Socat not enabled
[09:51:00] INFO: Starting Zigbee2MQTT...

but I am not sure 2 would eat the routes/binding of the devices on the mesh.

danilovvs commented 5 days ago

it looks like there are two errors that are consistent.

1.> z2m: Failed to execute LQI for

2.> when restarting z2m

[2024-10-19 09:50:47] info: z2m: Disconnecting from MQTT server
[2024-10-19 09:50:47] info: z2m: Stopping zigbee-herdsman...
[2024-10-19 09:50:58] info: zh:controller: Wrote coordinator backup to '/config/zigbee2mqtt/coordinator_backup.json'
[2024-10-19 09:50:58] info: zh:zstack:znp: closing
[2024-10-19 09:50:58] info: z2m: Stopped zigbee-herdsman
[2024-10-19 09:50:58] info: z2m: Stopped Zigbee2MQTT

/app/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:264
var er = new ERR_STREAM_WRITE_AFTER_END();
^
Error: write after end
at writeAfterEnd (/app/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:264:12)
at DerivedLogger.Writable.write (/app/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:300:21)
at DerivedLogger.log (/app/node_modules/winston/lib/winston/logger.js:231:12)
at Logger.log (/app/lib/util/logger.ts:198:25)
at Logger.info (/app/lib/util/logger.ts:211:14)
at Znp.onPortClose (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:96:16)
at Object.onceWrapper (node:events:632:26)
at Socket.emit (node:events:529:35)
at TCP. (node:net:350:12)
[09:50:58] INFO: Preparing to start...
[09:50:59] INFO: Socat not enabled
[09:51:00] INFO: Starting Zigbee2MQTT...

but I am not sure 2 would eat the routes/binding of the devices on the mesh.

I have the same problem. The whole network has literally been crumbling for the last week

Nerivec commented 5 days ago

That second error shouldn't be a problem for the network (it's just a logging fail on stop).

The first one, though, would definitely mess up the reported "state of the network" in the network map, so you can't really rely on what it is showing. The LQI requests seem to be mostly timing out. Any chance something changed in your environment since you started having issues (before the burn)? A WiFi router changed its channel, or some other kind of interference was introduced? If you have a spare ember adapter around, you can use Ember ZLI to scan for Zigbee channel usage.

You are getting TABLE_FULL errors when Z2M tries to set up binds on 3RSB015BZ, so that could be part of the problem. See if there is a hard reset procedure for these, to put them back to factory defaults.

I know SMLight released a lot of fixes in recent releases, so maybe the version (2022) you went back to is a bit too far (though admittedly the very latest could have some undiscovered issues). Maybe see if it behaves better with something more recent (not sure exactly what versions are available for that specific adapter)?
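To get a rough idea of how often the `Failed to execute LQI for` and `TABLE_FULL` errors mentioned above occur, the log can be scanned for those strings. This is only a minimal Python sketch, assuming a plain-text log with one entry per line. The function name `count_markers` and the sample lines are hypothetical (they are not taken from the attached log.zip); only the two marker strings come from this thread.

```python
# Marker strings quoted in this thread; everything else here is illustrative.
MARKERS = ("Failed to execute LQI for", "TABLE_FULL")


def count_markers(log_text, markers=MARKERS):
    """Return {marker: number of log lines containing that marker}."""
    counts = {marker: 0 for marker in markers}
    for line in log_text.splitlines():
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


# Hypothetical sample lines (assumed format), NOT copied from the real log.
SAMPLE_LOG = """\
[2024-10-19 09:40:01] error: z2m: Failed to execute LQI for 'example_router'
[2024-10-19 09:40:05] info: z2m: some unrelated line
[2024-10-19 09:41:12] error: z2m: example bind failure (TABLE_FULL)
[2024-10-19 09:42:30] error: z2m: Failed to execute LQI for 'example_switch'
"""

if __name__ == "__main__":
    for marker, count in count_markers(SAMPLE_LOG).items():
        print(f"{marker}: {count}")
```

Running the same function over the real log before and after a change (a firmware update, a device reset) gives a crude way to compare whether the errors became more or less frequent.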

gcs8 commented 5 days ago

@Nerivec I have tried a bunch of adapter Zigbee and FW combos, and they all kind of fall over. I want to say it was fine a few Z2M versions ago, then one day while rolling through updates it cropped up. I have sort of noticed that after burning it down (nuking the MQTT data, the config, everything) it was fine until, I want to say, a little after I did an update/reboot of Home Assistant. Then it started having issues such as router-to-router links no longer showing up, and at some point it stops allowing you to add new devices.

For all the routers I did a full 10 sec button press to hard reset them; for the Inovelli switches I did the 20 sec up and seen button reset. I have also tried only allowing joins from the coordinator, or from a specific router, and no joy there either.

gcs8 commented 4 days ago

also, @Nerivec I am on channel 20, so I should be safe from the rogue WiFis ™️
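As a sanity check on the channel 20 reasoning, the overlap between a Zigbee channel and 2.4 GHz WiFi channels can be computed from the channel centre frequencies. This is a rough Python sketch under stated assumptions: Zigbee channels 11 to 26 are centred at 2405 + 5 × (k − 11) MHz and treated as 2 MHz wide, WiFi channels 1 to 13 are centred at 2407 + 5 × n MHz and treated as 20 MHz wide. The function names are hypothetical, and real-world interference also depends on signal strength, 40 MHz WiFi modes and non-WiFi sources, none of which is modelled here.

```python
def zigbee_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Zigbee (IEEE 802.15.4) channel, 11..26."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11..26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Centre frequency of a 2.4 GHz WiFi channel, 1..13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers WiFi channels 1..13")
    return 2407 + 5 * channel


def overlapping_wifi_channels(zigbee_channel, wifi_width_mhz=20.0, zigbee_width_mhz=2.0):
    """List WiFi channels whose assumed band overlaps the Zigbee channel's band."""
    z_center = zigbee_center_mhz(zigbee_channel)
    z_lo = z_center - zigbee_width_mhz / 2
    z_hi = z_center + zigbee_width_mhz / 2
    overlapping = []
    for wifi_channel in range(1, 14):
        w_center = wifi_center_mhz(wifi_channel)
        w_lo = w_center - wifi_width_mhz / 2
        w_hi = w_center + wifi_width_mhz / 2
        # Two open intervals overlap when each one starts before the other ends.
        if w_lo < z_hi and z_lo < w_hi:
            overlapping.append(wifi_channel)
    return overlapping


if __name__ == "__main__":
    print(zigbee_center_mhz(20))          # 2450
    print(overlapping_wifi_channels(20))  # [7, 8, 9, 10]
```

Under these assumptions, Zigbee channel 20 sits clear of 20 MHz WiFi on channels 1, 6 and 11, but a nearby access point on channels 7 to 10 would still overlap it.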

Nerivec commented 4 days ago

Are all the adapters you tried TCP-based? All zStack-based? Anything specific in your configuration that you think would be different from most setups?

habitats-tech commented 4 days ago

@habitats-tech 3 router class devices and 6 EndDevice class sensors are in the same room as it and are having issues, so, I don't think that's it.

I would suggest you take a stepped approach to resolving the issue. Pair one router-device and one end-device and see if this setup remains stable. Add end-devices slowly, and only once you are confident everything works add the rest of the routers. From my experience, if your devices are Tuya you are in for lots of surprises over the long term. Tuya devices work well using Tuya coordinators. Some Tuya router-devices do not route properly and some end-devices do not report as expected (e.g. thermometers taking random hours to report, with no ability to configure reporting). This is why, when you report issues, it is always best to also state which devices you are using.

gcs8 commented 4 days ago

By model:
lumi.weather: 11
3RSP02028BZ: 3
VZM31-SN: 2
lumi.motion.ac02: 2
lumi.sensor_wleak.aq1: 2
3RSB015BZ: 2
lumi.sensor_occupy.agl1: 1