Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0
12.08k stars 1.68k forks source link

zigbee2mqtt crashes after a while #2280

Closed Bruceforce closed 4 years ago

Bruceforce commented 5 years ago

Bug Report

What happened

zigbee2mqtt crashes after a while after upgrade from docker tag latest to latest-dev

What did you expect to happen

zigbee2mqtt shouldn't crash

How to reproduce it (minimal and precise)

Just let it run for a few hours / days.

Debug Info

zigbee2mqtt version: via docker tag latest-dev CC253X firmware version: 20190608

Sorry I can't provide debug logs since the crash happens only after a few days. Do these logs help? docker-compose log: https://gist.github.com/Bruceforce/7475b61be16e2c0a6772d145662ad056 npm debug log: https://gist.github.com/Bruceforce/fbbf4c475c9912ce69b7089a7c112e17

Koenkk commented 5 years ago

These logs don't help, I need the logs at the moment it crashes.

Bruceforce commented 5 years ago

Anything special to debug this or should I just set log level to debug in zigbee2mqtt and wait till it crashes?

Koenkk commented 5 years ago

Run the container with -e DEBUG=zigbee-herdsman:*.

Bruceforce commented 4 years ago

I started it with just loglevel set to debug without the environment variable yesterday, before you posted this. I got this log: https://gist.githubusercontent.com/Bruceforce/23818fd9a557041f3e681896b82a0262/raw/fae3b64764b48c2e5c19bd1ee4c7b0cd13e549ec/log.txt

Does this already help or do you still need me to set the environment variable? I'll do so anyways, but I can't get back to you until it crashes again.

Edit: Right after a rebuild of the container I also got this, but I assume this doesn't help since the error is gone after resetting the CC2531 again...

zigbee2mqtt       | Using '/app/data' as data directory
zigbee2mqtt       |
zigbee2mqtt       | > zigbee2mqtt@1.6.0 start /app
zigbee2mqtt       | > node index.js
zigbee2mqtt       |
zigbee2mqtt       | 2019-11-06T16:15:19.682Z zigbee-herdsman:controller:log Starting with options '{"network":{"networkKeyDistribute":false,"networkKey":[XXX],"panID":6754,"extenedPanID":[221,221,221,221,221,221,221,221],"channelList":[25]},"serialPort":{"baudRate":115200,"rtscts":true,"path":"/dev/ttyACM0"},"databasePath":"/app/data/database.db","backupPath":"/app/data/coordinator_backup.json"}'
zigbee2mqtt       | 2019-11-06T16:15:19.684Z zigbee-herdsman:adapter:zStack:znp:log Opening with /dev/ttyACM0 and {"baudRate":115200,"rtscts":true,"autoOpen":false}
zigbee2mqtt       | 2019-11-06T16:15:24.841Z zigbee-herdsman:adapter:zStack:znp:log Serialport opened
zigbee2mqtt       | 2019-11-06T16:15:24.841Z zigbee-herdsman:adapter:zStack:znp:log Writing skip bootloader payload
zigbee2mqtt       | 2019-11-06T16:15:24.842Z zigbee-herdsman:adapter:zStack:unpi:writer --> buffer [239]
zigbee2mqtt       | 2019-11-06T16:15:25.844Z zigbee-herdsman:adapter:zStack:znp:SREQ --> SYS - version - {}
zigbee2mqtt       | 2019-11-06T16:15:25.845Z zigbee-herdsman:adapter:zStack:unpi:writer --> frame [254,0,33,2,35]
zigbee2mqtt       | zigbee2mqtt:error 2019-11-06T16:15:31: Error while starting zigbee-herdsman
zigbee2mqtt       | zigbee2mqtt:error 2019-11-06T16:15:31: Failed to start zigbee
zigbee2mqtt       | zigbee2mqtt:error 2019-11-06T16:15:31: Exiting...
zigbee2mqtt       | zigbee2mqtt:error 2019-11-06T16:15:31: Error: SRSP - SYS - version after 6000ms
zigbee2mqtt       |     at Timeout.object.timer.setTimeout [as _onTimeout] (/app/node_modules/zigbee-herdsman/dist/utils/waitress.js:44:24)
zigbee2mqtt       |     at ontimeout (timers.js:436:11)
zigbee2mqtt       |     at tryOnTimeout (timers.js:300:5)
zigbee2mqtt       |     at listOnTimeout (timers.js:263:5)
zigbee2mqtt       |     at Timer.processTimers (timers.js:223:10)
zigbee2mqtt       | npm ERR! code ELIFECYCLE
zigbee2mqtt       | npm ERR! errno 1
zigbee2mqtt       | npm ERR! zigbee2mqtt@1.6.0 start: `node index.js`
zigbee2mqtt       | npm ERR! Exit status 1
zigbee2mqtt       | npm ERR!
zigbee2mqtt       | npm ERR! Failed at the zigbee2mqtt@1.6.0 start script.
zigbee2mqtt       | npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
zigbee2mqtt       |
zigbee2mqtt       | npm ERR! A complete log of this run can be found in:
zigbee2mqtt       | npm ERR!     /root/.npm/_logs/2019-11-06T16_16_08_112Z-debug.log
zigbee2mqtt exited with code 1
Koenkk commented 4 years ago

Can you disable the report and availability feature?

Bruceforce commented 4 years ago

Should I first wait until it crashes again with the environment variable set or disable these features right away?

I want to emphasize that these features didn't cause a crash on the old zigbee2mqtt versions and it would be great if they would work with zigbee herdsman as well.

Koenkk commented 4 years ago

I only expect that they can provide some troubles on startup, but I don't expect it to have anything to do with the crash after some uptime. So if zigbee2mqtt starts fine don't disable them.

Bruceforce commented 4 years ago

Yes they indeed cause some trouble at the start (devices show unavailable for a few minutes) but this settles after around 5 minutes max.

So I let it run with the environment set and post the new log when it crashes again. After that I'll try to disable report and availability just to make sure.

Bruceforce commented 4 years ago

Today it crashed again. Unfortunately zigbee2mqtt somehow didn't create a logfile by itself, so all I've got for you is the compose log. But I think it should contain the same as the zigbee2mqtt log. I included only a few minutes before the crash into the gist. If you want the full log, see the here compose_logs.txt.gz.

FYI: I noticed that it crashed right after I turned on power of three GU10 bulbs (ceiling spots).

https://gist.github.com/Bruceforce/3604d48e12afc9ce61657b7159d40d8d

Does this help or should I try something else?

Koenkk commented 4 years ago

The stick crashes, please disable the report and availability feature and check if things improve.

Bruceforce commented 4 years ago

Seems to be working - at least for now. But will this be the long time solution? That would be really bad, because without reporting enabled I very often get the wrong state of the lights. For example if I toggle them via Google Assistant.

Koenkk commented 4 years ago

If this works you can try enabling reporting again (without availability, as that will put the most stress on the CC2531). If you really want to keep using both, you might want to take a look at the CC2652R (#1429), which will be supported by the next zigbee2mqtt release.

Bruceforce commented 4 years ago

Okay. Nevermind. It crashed again with reporting and availability disabled. But I forget to collect debug logs. I'll enable them again and send you the logs.

Thanks for the CC2652R hint. I'll check it out.

Koenkk commented 4 years ago

How many devices do you have in your network?

Bruceforce commented 4 years ago

6 Ikea gu 10 bulbs 2 Ikea smart plugs 2 Osram smart plugs 1 Ikea E14 bulb 2 Xiaomi human body sensors 1 Xiaomi Switch 1 Ikea Dimmer

Edit: I think this should be it graphviz

Koenkk commented 4 years ago

That's not a very big network, I guess this should also work with the CC2531, did you already try the source routing firmware? This should also improve stability: https://github.com/Koenkk/Z-Stack-firmware/tree/master/coordinator/Z-Stack_Home_1.2/bin/source_routing

Bruceforce commented 4 years ago

Today it crashed again with all reporting disabled.

Here are the logs: Compose log: https://gist.githubusercontent.com/Bruceforce/42c380e9fcadd5405a011ccfec804b38/raw/b56b0995b7c32afe088a95c5ddbabe4d7d13310c/compose.log

zigbee2mqtt log in logs folder does not contain the extra logs from DEBUG=zigbee-herdsman:* that's why I provided the compose log.

I will now flash the source routing firmware and try again.

I have one question about this firmware: Is it true that I can't control which devices are paired to the stick directly and if somehow 5 end devices decide to pair directly to the stick than I would be unable to pair a router since there are already 5 directly connected devices?

Koenkk commented 4 years ago

You cannot control this but in practice I have never seen this go wrong. My assumption is that it prefers connections to routers instead of end devices.

Bruceforce commented 4 years ago

I tried the source routing firmware and it's working until now (with reporting and availability disabled. Do you use these features in your z2m network or did you disable them?

If it keeps running with source routing firmware for another week I'll close here.

Koenkk commented 4 years ago

I dont have them enabled but I dont use the CC2531 but the CC2652R (which is powerful enough to handle this)

Bruceforce commented 4 years ago

Closing this since it's working with the source routing firmware. So for conclusion: CC2531 isn't powerful enough for all zigbee2mqtt features like reporting and availability if the network gets bigger.