Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔹
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Devices going offline, extreme slowness and various errors #23329

Open santanar00 opened 1 month ago

santanar00 commented 1 month ago

What happened?

I have a zigbee mesh with around 80 devices, half of which are Routers. My network has always worked well, however, since the last updates to the HA Core and Z2M my network has become completely unstable.

More than half of my devices go offline out of nowhere; routers, for example, simply drop their connection. I also have several timeout errors in Z2M.

Devices that are plugged into Tuya or Sonoff work fine. I am using the latest driver version (

I've already changed my PC (I currently use a dedicated Beelink Mini S with a Sonoff Plus P coordinator). I don't even know where to start analyzing further, as I've already done everything I could in terms of analysis and repairs. I followed different websites and topics on the internet, but nothing helped me.

Has anyone here ever been through something like this and can help with something?

What did you expect to happen?

No response

How to reproduce it (minimal and precise)

No response

Zigbee2MQTT version

1.39.0-1

Adapter firmware version

20230507

Adapter

Sonoff Dongle Plus-P

Setup

Add-on on Home Assistant OS on Beelink Intel NUC

Debug log

No response

Great-Chart commented 1 month ago

I've noted similar traits (on 1.39.0 with same adapter and firmware version on x86 mini PC also); invariably a restart of Z2M (via Settings / Add-ons) will recover all unavailable devices but some stubborn ones need unplugging / replugging etc. I can usually find unavailable devices accessible via Exposes tab where (for mains units at least) they can be cycled and will often restore. But seem to get periods of sluggishness and/or random operations mixed in also.

As I've been away I've not had chance to investigate and likely like you not wholly sure where I'd start but thought I'd chip in and indicate it's not just you! I recall similar sorts of things previously and they get fixed quickly so likely need to ride this one out for now.

klio-klio commented 1 month ago

I have the same problem. When I go to version 1.39.0, the number of participants in the Mosquitto broker drops from 114 to 66 and everything works very sluggishly. I have a Sonoff Dongle Plus-P and a dongle with a 2538 chip.

santanar00 commented 1 month ago

Strange errors keep occurring. Unexpectedly, the Z2M system crashes. Sometimes I can't even start Z2M, generating the error below.

[2024-07-12 02:24:55] error:    zh:controller: Failed to keep permit join alive: Error: SRSP - ZDO - mgmtPermitJoinReq after 6000ms

[2024-07-12 02:57:52] error:    zh:controller: Failed to disable join on stop: Error: SREQ '--> AF - dataRequestExt - {"dstaddrmode":2,"dstaddr":"0x000000000000fffd","destendpoint":242,"dstpanid":0,"srcendpoint":242,"clusterid":33,"transid":233,"options":0,"radius":30,"len":6,"data":{"type":"Buffer","data":[25,72,2,10,0,0]}}' failed with status '(0x02: INVALID_PARAM)' (expected '(0x00: SUCCESS)')

[14:19:34] INFO: Preparing to start...
[14:19:34] INFO: Socat not enabled
[14:19:34] INFO: Starting Zigbee2MQTT...
Starting Zigbee2MQTT without watchdog.
[2024-07-12 14:20:12] error:    z2m: Error while starting zigbee-herdsman
[2024-07-12 14:20:12] error:    z2m: Failed to start zigbee
[2024-07-12 14:20:12] error:    z2m: Check https://www.zigbee2mqtt.io/guide/installation/20_zigbee2mqtt-fails-to-start.html for possible solutions
[2024-07-12 14:20:12] error:    z2m: Exiting...
[2024-07-12 14:20:12] error:    z2m: Error: SRSP - ZDO - simpleDescReq after 6000ms
    at Object.start (/app/node_modules/zigbee-herdsman/src/utils/waitress.ts:63:23)
    at /app/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:305:45
    at Queue.execute (/app/node_modules/zigbee-herdsman/src/utils/queue.ts:35:26)
    at Znp.request (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:293:27)
    at /app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/zStackAdapter.ts:173:32
    at Queue.execute (/app/node_modules/zigbee-herdsman/src/utils/queue.ts:35:20)
    at Controller.start (/app/node_modules/zigbee-herdsman/src/controller/controller.ts:180:29)
    at Zigbee.start (/app/lib/zigbee.ts:63:27)
    at Controller.start (/app/lib/controller.ts:139:27)
    at start (/app/index.js:154:5)

I feel less tense knowing that there are more people with this problem, however, I don't see many complaints about the topic, leaving the doubt as to whether there is actually someone from Z2M working on this. I really don't know what to do anymore.

Does anyone know if it is possible to rollback just Z2M to version 1.38.0, without using a backup? Because it's been several days since I updated, I no longer have backups from that time.

Thank you for the community support.

rosicenko commented 1 month ago

Same here. Zigbee2MQTT is totally broken and useless since last update with ember. Devices disconnecting, errors here and there, map is totally broken. I can only operate from time to time less than half of all the devices. "Error ZCL" is the most common one.

Skeletorjus commented 1 month ago

I get lots of "Failed to ping" errors on pretty much all my devices when this happens.

[2024-07-13 19:02:27] warning: z2m: Failed to ping 'Bod Taklampe' (attempt 1/1, ZCL command 0xc4988600000ce8bf/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))

I have rolled back to 1.38.0, but the problem persists, and I haven't found out what triggers it. It can take days or hours, but all I can do to fix it is restart Zigbee2MQTT.

Sonoff Plus P (20230507), running in Docker.

wizardofozzie commented 1 month ago

I am in the same or a similar situation. It's driving me nuts. I may actually go back to ZHA if I can't fix this.

I have had profound issues with my devices, both routers and otherwise. The best solution I have is winding back to 1.35.1 (I have no older backups). When I attempt to re-pair unresponsive devices by unpairing and resetting them, the device is never detected (no window pops up) despite device detection being on.

Device = Sonoff USB 3.0 Dongle-P on Raspberry Pi 4

As it stands, half my devices are offline but the battery shows its correct level, so I wonder how the device can report battery levels while offline. Similarly, some of my Tuya smart plugs (routers) show up as offline (how??)

I have had this issue for >4 months.

wizardofozzie commented 1 month ago

> I get lots of "Failed to ping"-errors on pretty much all my devices when this happens. [2024-07-13 19:02:27] warning: z2m: Failed to ping 'Bod Taklampe' (attempt 1/1, ZCL command 0xc4988600000ce8bf/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)) Have rolled back to 1.38.0, but the problem persists, and I haven't found out what triggers it. It could be days or hours, but all I can do to fix this is to restart Zigbee2MQTT.
>
> Sonoff Plus P (20230507), running in Docker.

Where do you get this log? From Z2M or HA? It seems we have the same problem and Sonoff Dongle P device.

You mention restarting Z2M works, which it doesn't for me.

@Skeletorjus does your system detect devices which you force remove then reset? I cannot get re-adding devices to work ever. Even brand new devices wouldn't add recently.

Skeletorjus commented 1 month ago

@wizardofozzie, the messages are from Z2M. I haven't had any issues pairing devices, but I sometimes have to force them to pair via a certain device to make it work - they won't always find the path themselves.

NRGizzer commented 1 month ago

Same problems here. I migrated from a ConBee II setup to a Sonoff Dongle E setup, using the newest 7.4.3.0 Ember firmware and Zigbee2MQTT version 1.39.0. I deleted the old database and re-paired every device from scratch, and my network is really unstable. Most of the time these errors occur:

[2024-07-15 08:26:35] error: z2m: Failed to read state of 'Schreibtischlampe Jannik' after reconnect (ZCL command 0xa4c138e29f8a3a2d/1 genOnOff.read(["onOff"], {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed ({"target":45406,"apsFrame":{"profileId":260,"clusterId":6,"sourceEndpoint":1,"destinationEndpoint":1,"options":4416,"groupId":0,"sequence":119},"zclSequence":139,"commandIdentifier":1} timed out after 10000ms))

This results in either slow response times, for example when using switches, or a completely unresponsive state, so I have to restart the whole system. I really need a solution here, because important things are controlled via Home Assistant. I have now ordered the Dongle P; maybe zstack works better.

santanar00 commented 1 month ago

Just to test, I completely removed Z2M and the sensors it created, installed ZHA, and re-paired all my 90 devices. The network is working perfectly, without delays or crashes.

This leads me to believe that, in fact, it is the Z2M version that has a problem.

Unfortunately, my knowledge is too limited to deepen my analysis, but I would be grateful if any member with deeper knowledge of Z2M could support the topic.

I am willing to do tests and send logs or any activity that is necessary.

kafisc1 commented 1 month ago

Same error here. Did any of you try the dev/edge branch?

escobarin3 commented 1 month ago

I've been struggling for two weeks, trying to figure out what's been going wrong with my HA installation. I've been using VirtualBox on Windows without issues for over 2 years, and everything has always worked fine. But for the past few weeks, everything's been failing—super slow system, taking ages to restart, and no clear way to identify the problem. After a lot of trial and error, I realized the issue lies with Z2MQTT. I didn't think it could be that because I only have 10 devices through Z2M... all the others are on ZHA. But after several attempts, I found that stopping the Z2MQTT add-on entirely reduces the CPU usage to around 8%. When it's running, CPU usage shoots up to 80%. The virtual machine has 4GB of RAM, 4 CPUs, and 100GB of storage, so it's definitely not the machine. The curious thing is that even restoring a backup to the add-on version 1.36.1-1 and core 2024.6.4, the problem persists... I don't remember it being like this before. The issue is that I don't have any more backups to go back further. Any ideas?

escobarin3 commented 1 month ago

> Same error here. Did any of you try the dev/edge branch?

Yes.. no luck. Issue persists.

wizardofozzie commented 1 month ago

> @wizardofozzie, the messages are from Z2M. I haven't had any issues pairing devices, but I sometimes have to force them to pair via a certain device to make it work - they won't always find the path themselves.

Can you elaborate on this?

wizardofozzie commented 1 month ago

@Koenkk are these issues best kept here or are these new and separate issues we should lodge?

thecobra666 commented 1 month ago

Since updating yesterday my lights sometimes don't turn off when sending the (group) commands.

Docker on pi 4, with conbee 2 stick.

klio-klio commented 1 month ago

I've managed to roll back to Core version core-2024.5.5 and Zigbee2MQTT 1.38.0-1. The problem was already there right after restoring the backup: half of the Zigbee2MQTT devices were offline. I could quickly bring the switches back online by pressing on/off in Zigbee2MQTT. For 11 battery-operated sensors, taking the battery out and putting it back in was usually enough to revive them. I had to re-pair 3.

Koenkk commented 1 month ago

'failed (SRSP - AF - dataRequest after 6000ms)' means the coordinator is unstable. I suggest trying the firmware from https://github.com/Koenkk/Z-Stack-firmware/discussions/505 since 20230507 might not be stable.
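To see how often the coordinator is timing out, and on which requests, it can help to tally these lines from a saved Z2M log. Below is a minimal sketch, not part of Zigbee2MQTT: the function name `summarize_timeouts` and both regular expressions are assumptions based only on the log lines quoted in this thread, so other log formats may need different patterns.

```python
import re
from collections import Counter

# Timeout suffix as it appears in the logs in this thread, e.g.
# "(SRSP - AF - dataRequest after 6000ms)". Pattern is an assumption.
SRSP_TIMEOUT = re.compile(r"SRSP - (\w+) - (\w+) after (\d+)ms")

# Device name in "Failed to ping '<name>'" or "... to '<name>' failed" lines.
DEVICE_NAME = re.compile(r"(?:Failed to ping|to) '([^']+)'")


def summarize_timeouts(lines):
    """Count SRSP timeouts per request type and per device name."""
    by_command = Counter()
    by_device = Counter()
    for line in lines:
        timeout = SRSP_TIMEOUT.search(line)
        if timeout is None:
            continue  # not a coordinator timeout line
        by_command[f"{timeout.group(1)} - {timeout.group(2)}"] += 1
        device = DEVICE_NAME.search(line)
        if device is not None:
            by_device[device.group(1)] += 1
    return by_command, by_device


# Shortened sample lines in the same shape as the logs posted here.
sample = [
    "[2024-08-01 23:39:53] error:    z2m: Publish 'set' 'brightness' to "
    "'TVStue Taklampe' failed: 'Error: ... failed (SRSP - AF - dataRequest after 6000ms)'",
    "[2024-08-01 23:40:29] warning:  z2m: Failed to ping 'Elvira Lysbryter' "
    "(attempt 1/1, ... failed (SRSP - AF - dataRequest after 6000ms))",
    "[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/Rom TVStue'",
]
commands, devices = summarize_timeouts(sample)
print(commands.most_common())
print(devices.most_common())
```

For a real log, an open file handle can be passed in place of `sample`, since the function only iterates over lines. A tally dominated by one device hints at a local problem, while timeouts spread across all devices are more consistent with the coordinator itself struggling.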

santanar00 commented 1 month ago

> 'failed (SRSP - AF - dataRequest after 6000ms)' means the coordinator is unstable. I suggest trying the firmware from https://github.com/Koenkk/Z-Stack-firmware/discussions/505 since 20230507 might not be stable.

Hi @Koenkk how are you?

Thank you for looking into this topic! I'm already using the indicated firmware. I'm using all the recommended and most current versions of Home Assistant, Z2M and the coordinator firmware, and even so the problems continue.

To test, I uninstalled Z2M and installed ZHA using the same coordinator and the same Home Assistant, and I have no problems. For this reason, I believe that the problem is in fact in Z2M.

If I can do any test or provide a more detailed log to support the analysis of the problem, tell me what to do.

Thank you

Harry-1976 commented 1 month ago

Same issue, same errors for me.

Z2M addon even crashing from time to time.

Koenkk commented 1 month ago

@santanar00 could you provide the debug logging from starting z2m until it crashes with the 20240710 firmware?

See this on how to enable debug logging.

Skeletorjus commented 1 month ago

> > @wizardofozzie, the messages are from Z2M. I haven't had any issues pairing devices, but I sometimes have to force them to pair via a certain device to make it work - they won't always find the path themselves.
>
> Can you elaborate on this?

The "Permit join" button has an arrow to the right where you can select a device to pair through. This way you can force a route, which sometimes helps in the pairing process.

I still have the crashes, and I think it could be related to sending commands to multiple devices at once - like when you press a button to turn off several lights. I'm not at all sure that this is what triggers the errors, but it seems to happen before going to bed or when I arrive home and many lights have to turn on or off.

Haven't updated from 20230507 yet. Will do soon, but it has been running solid for me for about a year until a couple of weeks ago, so not sure that it really is the culprit.

Great-Chart commented 1 month ago

I commented earlier to advise I was suffering somewhat random devices dropping offline / "crashes" etc; and also noted the response to revise coordinator firmware due to 20230507 being advised as unstable; so have recently upgraded firmware. https://github.com/Koenkk/Z-Stack-firmware/discussions/505

I've had days when the network is rock solid and stays on-line for days and then periods where things go offline very quickly (hours) and as I'm drafting this I'm watching device after device drop-off over a circa 30 minute period.

I'd earlier fathomed out how to pull off the debug logs and also noted, from the GitHub discussion and the Z2M map, that the new firmware 20240710 has reduced capacity for devices that connect directly to the coordinator. There seemed to be no end devices connected to my coordinator either. As I restarted Z2M a few times, I'm sure I observed a differing "set" of router devices connected to the coordinator (in the order of 30).

My coordinator is a Sonoff Dongle Plus P. It was on 20230507, which was stable until a few weeks ago (as in this thread), and I'd gone through an assortment of odd experiences that were not consistent enough to get a handle on.

Currently have 119 devices; 61 router (lots of Ikea plugs etc) / 57 end devices (temp sensors and buttons) and 1 unknown (spotted as the system had gone offline)

Logs are attached as follows:

240725_16-07 RESTART Z2M log.log

The above log was taken after a restart earlier today and after all offline router devices had become available (online).

Log2 - begins circa 16.25 hrs log2.log

Log 1 begins circa 16:45 hrs log1.log

Log begins circa 17:14 hrs and probably with all the mains devices offline and only end devices not yet updated to reflect being offline. log.log

I'll post logs into https://github.com/Koenkk/Z-Stack-firmware/discussions/505 also as contingency for this being firmware related and not just interference of other issues. Like many here I've largely had a pretty reliable network and great experience with Z2M and there would seem to be something recent upsetting things but expect different parties have differing issues.

If the logs info isn't correct and/or I can do anything else to monitor and troubleshoot please advise - accepting that I can't make sense of the logs and thus would appreciate any recommendations or advice on corrective changes or things to try.

Edit - Z2M map AFTER restart following the above with 32 router devices shown with DIRECT connection to the coordinator and the others somewhat more hidden that do NOT have direct connections

240725_18-23 Direct connections to Coordinator

Skeletorjus commented 1 month ago

I provoked the error and crashed my network by accident today, and I'm pretty sure that in my case it boils down to excessive Zigbee traffic. I have been using a custom card called Bubble Card. Its slider behaves in an ... interesting way. Instead of sending a single command when the slider is released, it sends a command on every slight movement. In addition, I had an automation that synced states between some of the lights. Needless to say, this generated the perfect storm of messages, which in turn took down the whole network and made other bulbs throughout the house start flashing because they became disconnected.

I have disabled the automation and use different cards to control the lights now. It will be interesting to see if this keeps the network up and all devices paired and online.
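For anyone who wants to keep a chatty slider or automation, one common mitigation is to throttle commands before they reach the network. The following is a rough sketch of the general idea only; it is not how Bubble Card or Zigbee2MQTT work. The `Throttle` class, its `send` callback and the 0.5 second interval are hypothetical; `send` stands in for whatever actually publishes the command (for example to a device's `zigbee2mqtt/<friendly_name>/set` topic).

```python
import time


class Throttle:
    """Forward at most one value per `min_interval` seconds.

    Values arriving too early are not sent; only the most recent one is
    kept, so a final flush() delivers the slider's end position.
    """

    def __init__(self, send, min_interval=0.5, clock=time.monotonic):
        self._send = send                  # hypothetical publish callback
        self._min_interval = min_interval  # assumed value, tune per network
        self._clock = clock
        self._last_sent = None
        self._pending = None

    def submit(self, value):
        now = self._clock()
        if self._last_sent is None or now - self._last_sent >= self._min_interval:
            self._send(value)
            self._last_sent = now
            self._pending = None
        else:
            # Too soon: remember only the latest value, drop older ones.
            self._pending = value

    def flush(self):
        """Send the last skipped value, e.g. when the slider is released."""
        if self._pending is not None:
            self._send(self._pending)
            self._last_sent = self._clock()
            self._pending = None


# Simulated slider drag: five brightness updates 0.1 s apart.
sent = []
fake_time = [0.0]
throttle = Throttle(sent.append, min_interval=0.5, clock=lambda: fake_time[0])
for step, level in enumerate([10, 20, 30, 40, 50]):
    fake_time[0] = step * 0.1
    throttle.submit(level)
throttle.flush()
print(sent)
```

In this simulated drag only the first and the final brightness values are forwarded, instead of one Zigbee command per movement of the slider.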

[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/Rom TVStue', payload '{"brightness":236,"color":{"x":0.4599,"y":0.4106},"color_mode":"xy","color_temp":371,"state":"ON"}'
[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/Kontor Taklampe', payload '{"brightness":219,"last_seen":1722548387918,"linkquality":47,"power_on_behavior":null,"state":"ON"}'
[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/Rom TVStue', payload '{"brightness":254,"color":{"x":0.4599,"y":0.4106},"color_mode":"xy","color_temp":371,"state":"ON"}'
[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/TVStue StÄlampe Topp', payload '{"brightness":254,"color":{"x":0.4599,"y":0.4106},"color_mode":"xy","color_options":null,"color_temp":371,"last_seen":1722548387921,"level_config":{"on_level":"previous"},"linkquality":105,"power_on_behavior":"on","state":"ON","update":{"installed_version":587806257,"latest_version":587806257,"state":"idle"},"update_available":false}'
[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/TVStue StÄlampe', payload '{"brightness":254,"color":{"x":0.4599,"y":0.4106},"color_mode":"xy","color_temp":371,"state":"ON"}'
[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/Kontor Taklampe', payload '{"brightness":254,"last_seen":1722548387918,"linkquality":47,"power_on_behavior":null,"state":"ON"}'
[2024-08-01 23:39:47] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/TVStue Taklampe', payload '{"brightness":233,"last_seen":1722548387939,"linkquality":76,"power_on_behavior":null,"state":"ON"}'
[2024-08-01 23:39:53] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":153,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:39:59] error:    z2m: Publish 'set' 'brightness' to 'TVStue StÄlampe' failed: 'Error: Command 12 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":0}) failed (SRSP - AF - dataRequestExt after 6000ms)'
[2024-08-01 23:40:05] error:    z2m: Publish 'set' 'brightness' to 'Kontor Taklampe' failed: 'Error: ZCL command 0x6c5cb1fffed6a6b6/1 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:40:11] error:    z2m: Publish 'set' 'brightness' to 'TVStue StÄlampe' failed: 'Error: Command 12 genLevelCtrl.moveToLevelWithOnOff({"level":233,"transtime":0}) failed (SRSP - AF - dataRequestExt after 6000ms)'
[2024-08-01 23:40:17] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":251,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:40:23] error:    z2m: Publish 'set' 'brightness' to 'Oppbevaringsrom Taklampe' failed: 'Error: ZCL command 0x943469fffeee2caa/1 genLevelCtrl.moveToLevelWithOnOff({"level":219,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - UTIL - assocGetWithAddress after 6000ms)'
[2024-08-01 23:40:29] warning:  z2m: Failed to ping 'Elvira Lysbryter' (attempt 1/1, ZCL command 0x54ef441000666260/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
[2024-08-01 23:40:41] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":153,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:40:47] error:    z2m: Publish 'set' 'brightness' to 'Oppbevaringsrom Taklampe' failed: 'Error: ZCL command 0x943469fffeee2caa/1 genLevelCtrl.moveToLevelWithOnOff({"level":136,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:40:53] warning:  z2m: Failed to ping 'Ute Veranda 2etg' (attempt 1/2, ZCL command 0x000b57fffec50971/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
[2024-08-01 23:40:59] error:    z2m: Publish 'set' 'brightness' to 'Kontor Taklampe' failed: 'Error: ZCL command 0x6c5cb1fffed6a6b6/1 genLevelCtrl.moveToLevelWithOnOff({"level":136,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:41:05] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:41:11] error:    z2m: Publish 'set' 'brightness' to 'Oppbevaringsrom Taklampe' failed: 'Error: ZCL command 0x943469fffeee2caa/1 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:41:17] warning:  z2m: Failed to ping 'Ute Veranda 2etg' (attempt 2/2, ZCL command 0x000b57fffec50971/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
[2024-08-01 23:41:20] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/Ute Veranda 2etg/availability', payload '{"state":"offline"}'
[2024-08-01 23:41:23] error:    z2m: Publish 'set' 'brightness' to 'Kontor Taklampe' failed: 'Error: ZCL command 0x6c5cb1fffed6a6b6/1 genLevelCtrl.moveToLevelWithOnOff({"level":219,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:41:29] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":251,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:41:35] warning:  z2m: Failed to ping 'Ute Inngangsparti' (attempt 1/2, ZCL command 0x000b57fffe9f5c70/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
[2024-08-01 23:41:41] error:    z2m: Publish 'set' 'brightness' to 'Kontor Taklampe' failed: 'Error: ZCL command 0x6c5cb1fffed6a6b6/1 genLevelCtrl.moveToLevelWithOnOff({"level":136,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:41:48] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:41:54] warning:  z2m: Failed to ping 'Ute Inngangsparti' (attempt 2/2, ZCL command 0x000b57fffe9f5c70/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
[2024-08-01 23:41:57] info:     z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/Ute Inngangsparti/availability', payload '{"state":"offline"}'
[2024-08-01 23:42:00] error:    z2m: Publish 'set' 'brightness' to 'Kontor Taklampe' failed: 'Error: ZCL command 0x6c5cb1fffed6a6b6/1 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:42:06] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":224,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:42:12] warning:  z2m: Failed to ping 'Ute Vegg' (attempt 1/2, ZCL command 0x040d84fffed90348/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms))
[2024-08-01 23:42:18] error:    z2m: Publish 'set' 'brightness' to 'TVStue Taklampe' failed: 'Error: ZCL command 0x187a3efffe99d5ec/1 genLevelCtrl.moveToLevelWithOnOff({"level":254,"transtime":0}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)'
[2024-08-01 23:42:24] warning:  z2m: Failed to ping 'Ute Vegg' (attempt 2/2, ZCL command 0x040d84fffed90348/1 genBasic.read(["zclVersion"], {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SRSP - AF - dataRequest after 6000ms)

escobarin3 commented 1 month ago

> I've been struggling for two weeks, trying to figure out what's been going wrong with my HA installation. I've been using VirtualBox on Windows without issues for over 2 years, and everything has always worked fine. But for the past few weeks, everything's been failing: super slow system, taking ages to restart, and no clear way to identify the problem. After a lot of trial and error, I realized the issue lies with Z2MQTT. I didn't think it could be that because I only have 10 devices through Z2M... all the others are on ZHA. But after several attempts, I found that stopping the Z2MQTT add-on entirely reduces the CPU usage to around 8%. When it's running, CPU usage shoots up to 80%. The virtual machine has 4GB of RAM, 4 CPUs, and 100GB of storage, so it's definitely not the machine. The curious thing is that even restoring a backup to the add-on version 1.36.1-1 and core 2024.6.4, the problem persists... I don't remember it being like this before. The issue is that I don't have any more backups to go back further. Any ideas?

After a lot of trial and error... Uninstalling the Passive BLE monitor integration (HACS) solved the issue... Don't ask me how or why... it just did. I noticed that BT devices kept "going offline" also... so this just solved it.