Closed pauloon closed 1 year ago
I'm having this issue with Zigbee2MQTT. Here is the last part of the logs:
'{"battery":100,"linkquality":212,"occupancy":true,"power_outage_count":26,"voltage":3035}'
Zigbee2MQTT:info 2023-02-19 20:58:49: MQTT publish: topic 'zigbee2mqtt/Motion sensor', payload '{"battery":100,"linkquality":212,"occupancy":false,"power_outage_count":26,"voltage":3035}'
Zigbee2MQTT:info 2023-02-19 21:08:48: MQTT publish: topic 'zigbee2mqtt/Button', payload '{"action":null,"battery":100,"click":null,"linkquality":200,"power_outage_count":637,"voltage":3042}'
Zigbee2MQTT:info 2023-02-19 21:32:56: MQTT publish: topic 'zigbee2mqtt/Motion sensor', payload '{"battery":100,"linkquality":212,"occupancy":false,"power_outage_count":26,"voltage":3035}'
Zigbee2MQTT:error 2023-02-19 22:24:14: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:24:41: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:25:40: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:26:52: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:28:08: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:29:33: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:31:35: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:33:37: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:35:17: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-19 22:38:38: Not connected to MQTT server!
<--- Last few GCs --->
[8:0x7fb8e1fe33c0] 46065449 ms: Mark-sweep 1851.4 (2008.8) -> 1835.4 (2002.0) MB, 3565.4 / 0.7 ms (average mu = 0.174, current mu = 0.143) allocation failure scavenge might not succeed
[8:0x7fb8e1fe33c0] 46068097 ms: Mark-sweep 1851.1 (2002.0) -> 1835.3 (1999.5) MB, 2564.2 / 0.8 ms (average mu = 0.116, current mu = 0.032) allocation failure scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
I have only two Zigbee devices, a Xiaomi motion sensor and a button, connected to a Sonoff hub with ZHA ZBBridge, Tasmota version 12.4.0(zbbridge).
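For readers unfamiliar with the "Last few GCs" block in the log above: each Mark-sweep line reports the heap size before and after a full garbage collection. The following is a minimal Python sketch, assuming the line layout shown in that log; the function name parse_mark_sweep is hypothetical and not part of Zigbee2MQTT or Node.js. It extracts the two sizes so it is easy to see how little each collection reclaimed.

```python
import re

# Matches the "Mark-sweep <before> (<reserved>) -> <after> (<reserved>) MB" part
# of a V8 GC trace line, as it appears in the log above.
MARK_SWEEP = re.compile(
    r"Mark-sweep\s+([\d.]+)\s+\(([\d.]+)\)\s+->\s+([\d.]+)\s+\(([\d.]+)\)\s+MB"
)

def parse_mark_sweep(line):
    """Return (before_mb, after_mb, freed_mb), or None if the line does not match."""
    m = MARK_SWEEP.search(line)
    if m is None:
        return None
    before, after = float(m.group(1)), float(m.group(3))
    return before, after, round(before - after, 1)

# Sample line copied from the log above.
sample = (
    "[8:0x7fb8e1fe33c0] 46065449 ms: Mark-sweep 1851.4 (2008.8) -> 1835.4 (2002.0) MB, "
    "3565.4 / 0.7 ms (average mu = 0.174, current mu = 0.143) "
    "allocation failure scavenge might not succeed"
)
print(parse_mark_sweep(sample))
```

For the sample line, a collection that took about 3.5 s freed roughly 16 MB out of about 1850 MB, which is consistent with the "Ineffective mark-compacts near heap limit" message that follows.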
An issue identical to eschar's happened to me recently. After a week of normal operation, the MQTT connection dropped and the process ran out of memory. I am running Zigbee2MQTT on very limited resources (RPi B), installed in Docker via docker-compose. Should we open a new ticket for this?
It seems I am having a similar issue which could be related to https://github.com/Koenkk/zigbee2mqtt/issues/12732
I am running zigbee2mqtt on a RPi Zero W which worked flawlessly for weeks. Just yesterday, after I updated the MQTT server on a different machine this started to happen (seemingly).
I am running with herdsman debug enabled but see no obvious issues. Before this happens, the node process runs at 100% CPU and is extremely laggy and unresponsive. After ~30 min it is killed with this message:
zigbee-herdsman:controller:endpoint Request Queue (0x94deb8fffe7bc6f1/1): send checkinRsp request immediately (sendWhen=immediate) +3ms
zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,23,68,129,0,0,32,0,241,41,1,1,0,21,0,170,0,124,0,0,3,9,58,0,65,229,28,97,254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,51,48,124,0,0,3,9,12,0,198,44,28,49,254,23,68,129,0,0,32,0,249,239,1,1,0,91,0,128,62,124,0,0,3,9,98,0,108,155,28,254,254,5,69,196,161,26,1,62,232,232,254,23,68,129,0,0,32,0,161,26,1,1,0,54,0,30,128,124,0,0,3,9,78,0,62,232,28,19,254,23,68,129,0,0,32,0,178,110,1,1,0,51,0,137,163,124,0,0,3,9,45,0,178,110,29,173,254,5,69,196,37,6,1,108,155,81,254,5,69,196,241,41,1,65,229,249,254,23,68,129,0,0,32,0,37,6,1,1,0,87,0,100,243,124,0,0,3,9,89,0,108,155,28,213,254,23,68,129,0,0,32,0,241,41,1,1,0,21,0,178,245,124,0,0,3,9,59,0,65,229,28,141,254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,219,38,125,0,0,3,9,13,0,198,44,28,207,254,23,68,129,0,0,32,0,249,239,1,1,0,91,0,40,53,125,0,0,3,9,99,0,108,155,28,93,254,5,69,196,161,26,1,62,232,232,254,23,68,129,0,0,32,0,161,26,1,1,0,54,0,117,118,125,0,0,3,9,79,0,62,232,28,142,254,23,68,129,0,0,32,0,178,110,1,1,0,54,0,32,153,125,0,0,3,9,46,0,178,110,29,57,254,5,69,196,37,6,1,108,155,81,254,5,69,196,241,41,1,65,229,249,254,23,68,129,0,0,32,0,37,6,1,1,0,87,0,180,233,125,0,0,3,9,90,0,108,155,28,29,254,23,68,129,0,0,32,0,241,41,1,1,0,21,0,122,235,125,0,0,3,9,60,0,65,229,28,93,254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,34,30,126,0,0,3,9,14,0,198,44,28,14] +183ms
zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 23 - 2 - 4 - 129 - [0,0,32,0,241,41,1,1,0,21,0,170,0,124,0,0,3,9,58,0,65,229,28] - 97 +2ms
zigbee-herdsman:adapter:zStack:znp:AREQ <-- AF - incomingMsg - {"groupid":0,"clusterid":32,"srcaddr":10737,"srcendpoint":1,"dstendpoint":1,"wasbroadcast":0,"linkquality":21,"securityuse":0,"timestamp":8126634,"transseqnumber":0,"len":3,"data":{"type":"Buffer","data":[9,58,0]}} +176ms
zigbee-herdsman:controller:log Received 'zcl' data '{"frame":{"Header":{"frameControl":{"frameType":1,"manufacturerSpecific":false,"direction":1,"disableDefaultResponse":false,"reservedBits":0},"transactionSequenceNumber":58,"manufacturerCode":null,"commandIdentifier":0},"Payload":{},"Command":{"ID":0,"parameters":[],"name":"checkin"}},"address":10737,"endpoint":1,"linkquality":21,"groupID":0,"wasBroadcast":false,"destinationEndpoint":1}' +136ms
zigbee-herdsman:controller:device:log check-in from 0x8cf681fffe2a1662: declining fast-poll +135ms
zigbee-herdsman:controller:endpoint Command 0x8cf681fffe2a1662/1 genPollCtrl.checkinRsp({"startFastPolling":false,"fastPollTimeout":0}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) +123ms
zigbee-herdsman:controller:endpoint Request Queue (0x8cf681fffe2a1662/1): send checkinRsp request immediately (sendWhen=immediate) +2ms
zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,51,48,124,0,0,3,9,12,0,198,44,28,49,254,23,68,129,0,0,32,0,249,239,1,1,0,91,0,128,62,124,0,0,3,9,98,0,108,155,28,254,254,5,69,196,161,26,1,62,232,232,254,23,68,129,0,0,32,0,161,26,1,1,0,54,0,30,128,124,0,0,3,9,78,0,62,232,28,19,254,23,68,129,0,0,32,0,178,110,1,1,0,51,0,137,163,124,0,0,3,9,45,0,178,110,29,173,254,5,69,196,37,6,1,108,155,81,254,5,69,196,241,41,1,65,229,249,254,23,68,129,0,0,32,0,37,6,1,1,0,87,0,100,243,124,0,0,3,9,89,0,108,155,28,213,254,23,68,129,0,0,32,0,241,41,1,1,0,21,0,178,245,124,0,0,3,9,59,0,65,229,28,141,254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,219,38,125,0,0,3,9,13,0,198,44,28,207,254,23,68,129,0,0,32,0,249,239,1,1,0,91,0,40,53,125,0,0,3,9,99,0,108,155,28,93,254,5,69,196,161,26,1,62,232,232,254,23,68,129,0,0,32,0,161,26,1,1,0,54,0,117,118,125,0,0,3,9,79,0,62,232,28,142,254,23,68,129,0,0,32,0,178,110,1,1,0,54,0,32,153,125,0,0,3,9,46,0,178,110,29,57,254,5,69,196,37,6,1,108,155,81,254,5,69,196,241,41,1,65,229,249,254,23,68,129,0,0,32,0,37,6,1,1,0,87,0,180,233,125,0,0,3,9,90,0,108,155,28,29,254,23,68,129,0,0,32,0,241,41,1,1,0,21,0,122,235,125,0,0,3,9,60,0,65,229,28,93,254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,34,30,126,0,0,3,9,14,0,198,44,28,14] +94ms
zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 5 - 2 - 5 - 196 - [68,68,1,198,44] - 111 +2ms
zigbee-herdsman:adapter:zStack:znp:AREQ <-- ZDO - srcRtgInd - {"dstaddr":17476,"relaycount":1,"relaylist":[11462]} +96ms
zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,51,48,124,0,0,3,9,12,0,198,44,28,49,254,23,68,129,0,0,32,0,249,239,1,1,0,91,0,128,62,124,0,0,3,9,98,0,108,155,28,254,254,5,69,196,161,26,1,62,232,232,254,23,68,129,0,0,32,0,161,26,1,1,0,54,0,30,128,124,0,0,3,9,78,0,62,232,28,19,254,23,68,129,0,0,32,0,178,110,1,1,0,51,0,137,163,124,0,0,3,9,45,0,178,110,29,173,254,5,69,196,37,6,1,108,155,81,254,5,69,196,241,41,1,65,229,249,254,23,68,129,0,0,32,0,37,6,1,1,0,87,0,100,243,124,0,0,3,9,89,0,108,155,28,213,254,23,68,129,0,0,32,0,241,41,1,1,0,21,0,178,245,124,0,0,3,9,59,0,65,229,28,141,254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,219,38,125,0,0,3,9,13,0,198,44,28,207,254,23,68,129,0,0,32,0,249,239,1,1,0,91,0,40,53,125,0,0,3,9,99,0,108,155,28,93,254,5,69,196,161,26,1,62,232,232,254,23,68,129,0,0,32,0,161,26,1,1,0,54,0,117,118,125,0,0,3,9,79,0,62,232,28,142,254,23,68,129,0,0,32,0,178,110,1,1,0,54,0,32,153,125,0,0,3,9,46,0,178,110,29,57,254,5,69,196,37,6,1,108,155,81,254,5,69,196,241,41,1,65,229,249,254,23,68,129,0,0,32,0,37,6,1,1,0,87,0,180,233,125,0,0,3,9,90,0,108,155,28,29,254,23,68,129,0,0,32,0,241,41,1,1,0,21,0,122,235,125,0,0,3,9,60,0,65,229,28,93,254,5,69,196,68,68,1,198,44,111,254,23,68,129,0,0,32,0,68,68,1,1,0,76,0,34,30,126,0,0,3,9,14,0,198,44,28,14] +5ms
zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 23 - 2 - 4 - 129 - [0,0,32,0,68,68,1,1,0,76,0,51,48,124,0,0,3,9,12,0,198,44,28] - 49 +2ms
zigbee-herdsman:adapter:zStack:znp:AREQ <-- AF - incomingMsg - {"groupid":0,"clusterid":32,"srcaddr":17476,"srcendpoint":1,"dstendpoint":1,"wasbroadcast":0,"linkquality":76,"securityuse":0,"timestamp":8138803,"transseqnumber":0,"len":3,"data":{"type":"Buffer","data":[9,12,0]}} +7ms
<--- Last few GCs --->
[1278:0x4583a58] 2596805 ms: Mark-sweep 120.0 (129.3) -> 119.5 (129.0) MB, 8820.0 / 0.0 ms (average mu = 0.092, current mu = 0.023) allocation failure; scavenge might not succeed
[1278:0x4583a58] 2597020 ms: Scavenge 120.2 (129.0) -> 119.7 (129.0) MB, 13.1 / 0.0 ms (average mu = 0.092, current mu = 0.023) allocation failure;
[1278:0x4583a58] 2597095 ms: Scavenge 120.2 (129.0) -> 119.7 (129.3) MB, 4.5 / 0.0 ms (average mu = 0.092, current mu = 0.023) allocation failure;
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Aborted
zigbee2mqtt@rfpi:~$
@Koenkk any advice?
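A note on reading the parser lines in the log above: each "parseNext" line appears to show the bytes still waiting in the serial buffer, and each "parsed" line shows one frame taken off the front. The sketch below is an illustration in Python, not zigbee-herdsman's actual code. The frame layout (start byte 254, length, two command bytes, payload, XOR checksum) is an assumption inferred from the log itself, and split_frames is a hypothetical name.

```python
SOF = 0xFE  # 254, the byte that starts every frame in the log above

def split_frames(buf):
    """Split a list of byte values into (length, type, subsystem, cmd_id, data, fcs) tuples.

    Stops at the first incomplete or malformed frame.
    """
    frames = []
    i = 0
    while i + 5 <= len(buf) and buf[i] == SOF:
        length = buf[i + 1]
        end = i + 5 + length  # SOF + LEN + CMD0 + CMD1 + data + FCS
        if end > len(buf):
            break  # incomplete frame, more bytes are needed
        cmd0, cmd1 = buf[i + 2], buf[i + 3]
        data = buf[i + 4:i + 4 + length]
        fcs = buf[end - 1]
        # Checksum: XOR over LEN, CMD0, CMD1 and the data bytes.
        check = 0
        for b in buf[i + 1:end - 1]:
            check ^= b
        if check != fcs:
            break
        # CMD0 packs the message type (top 3 bits) and the subsystem (low 5 bits).
        frames.append((length, cmd0 >> 5, cmd0 & 0x1F, cmd1, data, fcs))
        i = end
    return frames

# The first two frames of the first parseNext buffer in the log above.
buf = [254, 23, 68, 129, 0, 0, 32, 0, 241, 41, 1, 1, 0, 21, 0, 170, 0, 124,
       0, 0, 3, 9, 58, 0, 65, 229, 28, 97,
       254, 5, 69, 196, 68, 68, 1, 198, 44, 111]
for frame in split_frames(buf):
    print(frame)
```

The two tuples printed correspond to the "parsed 23 - 2 - 4 - 129 - [...] - 97" and "parsed 5 - 2 - 5 - 196 - [68,68,1,198,44] - 111" lines in the log. The length of each parseNext buffer gives a rough idea of how many frames are queued behind the one being handled.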
zigbee2mqtt@rfpi:~$ node --version
v18.15.0
I updated to the latest version (via git pull and npm ci) but the issue still occurs...
Can you provide the herdsman debug logging from starting z2m until this and provide me the time cpu jumps to 100%?
See https://www.zigbee2mqtt.io/guide/usage/debug.html on how to enable the herdsman debug logging. Note that this is only logged to STDOUT and not to log files.
Thanks @Koenkk. One issue: CPU is at 100% (or sometimes a bit less) from the very start. z2m is extremely laggy and after ~30 min it dies with a message like the one above.
Note it has been working flawlessly for weeks and the issue randomly showed up 2 days ago. Apart from upgrading mosquitto, I also upgraded the firmware of some devices (mainly Ikea blinds and plugs). Is there a way to isolate the issue without having to rebuild my entire Zigbee network (>>50 devices)?
I have created a full log (a few MB; z2m ran for ~55 min at 100% CPU before dying with the message above). What is the best way to provide you with the log file (in case it's even useful)? Since it contains lots of private info, I'd prefer not to upload it anywhere publicly accessible, if at all possible.
You can send it on telegram (@koenkk)
@Koenkk Thanks! I tried installing Telegram but I get an error message when registering with my phone number (I'm not using Telegram yet). Is it (hopefully) also ok to just use this link?
https://www.dropbox.com/s/xdsbv9mdn1pvzc9/debug.log.gz?dl=0
I'll remove the file once you have downloaded it.
Thank you very much!!
This is the normal z2m debug logging, I need the herdsman debug logging.
See https://www.zigbee2mqtt.io/guide/usage/debug.html on how to enable the herdsman debug logging. Note that this is only logged to STDOUT and not to log files.
Oh, indeed. It was on the screen but did not make it into the file. I guess you meant to say “Note that this is only logged to STDERR and not to log files.”
I added “2>&1|tee…” and it goes into the file now. I’m repeating…
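To illustrate why the redirection matters: a pipe on its own only carries a program's standard output, so anything written to standard error stays on the screen unless the two streams are merged first. The sketch below shows the same idea in Python, using a stand-in child process rather than Zigbee2MQTT. The helper name run_and_tee and the capture.log file name are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

def run_and_tee(cmd, log_path):
    """Run cmd, echo its stdout and stderr to the terminal, and write both to log_path."""
    with open(log_path, "w", encoding="utf-8") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, like the shell's 2>&1
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # still visible on screen, like tee
            log.write(line)
        return proc.wait()

# Stand-in child process that writes one line to each stream.
child = [sys.executable, "-c",
         "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"]
log_path = os.path.join(tempfile.gettempdir(), "capture.log")
exit_code = run_and_tee(child, log_path)
```

Without the stderr=subprocess.STDOUT argument, only the "to stdout" line would reach the file, which matches the symptom described above.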
Ok, now the full log is here: https://www.dropbox.com/s/qymi5hlssci74qg/debug.log.gz?dl=0
Can you try https://github.com/Koenkk/zigbee2mqtt/issues/17923 ? I see the IKEA blinds are checking-in a lot.
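For anyone wanting to check their own capture for the same pattern, a minimal Python sketch along these lines can count check-in lines per device. It assumes the "check-in from 0x..." line shape seen in the herdsman debug output earlier in this thread; count_check_ins is a hypothetical helper, not part of Zigbee2MQTT.

```python
import re
from collections import Counter

# Line shape taken from the debug output earlier in this thread:
#   zigbee-herdsman:controller:device:log check-in from 0x8cf681fffe2a1662: declining fast-poll +135ms
CHECK_IN = re.compile(r"check-in from (0x[0-9a-fA-F]{16})")

def count_check_ins(lines):
    """Count check-in log lines per device IEEE address."""
    counts = Counter()
    for line in lines:
        m = CHECK_IN.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Two lines copied from the log earlier in this thread; only the first is a check-in.
sample = [
    "zigbee-herdsman:controller:device:log check-in from 0x8cf681fffe2a1662: declining fast-poll +135ms",
    'zigbee-herdsman:adapter:zStack:znp:AREQ <-- ZDO - srcRtgInd - {"dstaddr":17476,"relaycount":1,"relaylist":[11462]} +96ms',
]
for address, n in count_check_ins(sample).most_common():
    print(address, n)
```

On a real capture, passing an open file object instead of the sample list works the same way, and the addresses at the top of the output are the devices worth looking at first.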
Thanks, this nearly sounds like it! (Issue exactly started to happen after I did these upgrades).
I have done
git fetch origin dev
git checkout latest-dev
git pull
npm ci
which I believe should get me the patch (version shown as "1.31.0-dev commit: fd1622b") but the CPU is still at 100%...
Can you re-configure your blinds and make sure that succeeds? (yellow refresh button in the z2m frontend -> device page)
Thanks for the suggestion.
Ok I actually removed the batteries from all the IKEA blinds. That should make them silent, right?
But yeah, sadly it’s still at 100%…
Turns out the startup took 10min. CPU is down to normal now (3-5% for node). Yaaaaaay!!!
Regarding the dev branch, I’m actually not sure I really got it. I think the documentation is outdated. I posted a question here: https://github.com/Koenkk/zigbee2mqtt/discussions/17938
But it might just be worth waiting for the fix to reach the master branch. How long do you think this will take roughly? Days, weeks, months?
I will create a hotfix release today.
Hi @Koenkk,
Looks like the same issue reproduces for Perenio PEHPL0X plugs. As soon as I add them to the network, they report a lot of information, and eventually the MQTT connection is lost and Z2M crashes with an out-of-memory error.
What happened?
After the last update I noticed Z2M stopping with a fatal error, out of the blue.
I'm using HASS.IO, fully updated, on an i5 machine with 8 GB of memory and a 128 GB SSD.
What did you expect to happen?
No response
How to reproduce it (minimal and precise)
Just leave it running. After one or two days it stops working (service drops).
Zigbee2MQTT version
1.28.2 commit: unknown
Adapter firmware version
20220219
Adapter
SONOFF USB Dongle
Debug log
I did not have the debug log active when this happened. Posting normal log.
...
Zigbee2MQTT:info 2022-11-07 11:45:15: MQTT publish: topic 'zigbee2mqtt/Smart Plug 15', payload '{"child_lock":"UNLOCK","current":0.04,"energy":6.45,"indicator_mode":"off/on","last_seen":"2022-11-07T11:45:13-03:00","linkquality":102,"power":0,"power_outage_memory":"restore","state":"ON","update":{"state":"idle"},"update_available":false}'
<--- Last few GCs --->
[7:0x7fa21c7993c0] 61013533 ms: Mark-sweep 2044.2 (2085.3) -> 2042.2 (2085.3) MB, 2087.8 / 0.0 ms (average mu = 0.133, current mu = 0.010) allocation failure scavenge might not succeed
[7:0x7fa21c7993c0] 61015639 ms: Mark-sweep 2044.3 (2085.3) -> 2042.2 (2085.3) MB, 2082.8 / 0.0 ms (average mu = 0.074, current mu = 0.011) allocation failure scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory