dresden-elektronik / deconz-rest-plugin

deCONZ REST-API plugin to control ZigBee devices
BSD 3-Clause "New" or "Revised" License
1.89k stars 496 forks source link

slow response of bulbs and lost commands #3735

Closed capstan1 closed 3 years ago

capstan1 commented 3 years ago

Describe the question or issue you are having

I have some problems with my Deconz setup with lights reacting very slowly to turning on/off commands. This happens when using battery operated switches and apps (Home Assistant/Phoscon/Hue Essentials) Problems:

Screenshots

My deconz view: 1 Squares to show the rooms: 2 Device types written next to the entities: 3

Environment

deCONZ Logs

Additional context

Smanar commented 3 years ago

Hello, using your logs, can you check, if the delay

BTW you can too add light in a group (without remote) and toogle the group using automation (If you don't already do that ?)

It happen suddently ? I have already see this kind of issue caused by 1 fautly device.

capstan1 commented 3 years ago

Thanks for your answer! I'm not exactly sure where I could check all this together. (Logs from Deconz, HA and Node Red in one place) Do you have any idea? But overall I think it's a combination of both, because: a) some switches are not configured in node red, but using hue essentials b) there is a delay between using the remote and the event getting shown in home assistant (HA receiving remote command) AND there is a delay when switching lights using home assistant (bulb receiving command)

I already add various lights in light groups and control them together. I just wanted to say that I have controller groups, where you add lights to a controller directly (those controls even work when deconz is turned off). Those groups tend to work better (possibly because there is a direct connection).

Fruchuxs commented 3 years ago

I have a similar problem with my enocean friends of hue switch - but the switch is configured directly in deconz. With the hue bridge, the switch always reacts quickly.

In general some lights reacts slowly or stucks on / off or in a very low brightness (hue bulbs stucks on low light, deconz lightstrip controllers doesn't reacts). Will open a new ticket in the afternoon.

capstan1 commented 3 years ago

I can try configuring one of the aqara switches in deconz directly if that helps. Just haven't done that because deconz doesn't support setting "toggle" (at least with the aqara switches"

Mimiix commented 3 years ago

What SD Card are you using? I often see this behavior with a slow/old SD card. Might be worth to check that .

capstan1 commented 3 years ago

Im using a 32GB Samsung Evo Class 10, which is around 6 months old (I guess). Tried it with an sandisk Extreme USB stick yesterday, which made the issue worse. I ordered a second sd card which is unused, gonna try this one out later! (Just assumed that the sd card won't affect deconz, but I guess, the rules are written to the container, right?)

Mimiix commented 3 years ago

It shouldn't.

The thing is mostly that the read/write is not fast enough to keep up. It might be worth to check the current load on HA with the Glances addon. Can you share a screen with that addon open?

capstan1 commented 3 years ago

Here you go image

Fruchuxs commented 3 years ago

What SD Card are you using? I often see this behavior with a slow/old SD card. Might be worth to check that .

SanDisk Extreme microSDHC 32GB It is possible to optimize the performance via tempfs?

SwoopX commented 3 years ago

You wrote

only 2 lights are controlled by a remote group in deconz (hue, tradfri remote), these usually respond quickly to presses

as the Aqara switches don't support remote groups they are controlled through node red or via rules (button events) written to deconz using Hue Essentials toggling the lights on button presses

Has the reaction been differently in the past regarding the button presses? To me, it sound like expected behaviour when not using zigbee groups for switches to switch lights. Had cases here where it took up to 10 secs to put on all lights.

Mimiix commented 3 years ago

Your ram is almost completely in use. That might cause delays. I also side with Swoop on this one.

capstan1 commented 3 years ago

To make it a bit clearer:

Both these options (1 and 2) are controlling groups (if applicable) and there's no difference between the options. Option A) tends to be fast but sometimes still get cought by the error. Also there are times where controlling lights directly from Home Assistant is very slow.

Smanar commented 3 years ago

I have ever see this kind of issue for badly device, one time for a philips bulb and an osram plug, after removing it the network was fine. Or another exemple with ikea repeater, but without solution (still a mystery) https://github.com/dresden-elektronik/deconz-rest-plugin/issues/2967

capstan1 commented 3 years ago

Is there any way to debug this? Don't know which device to remove first. Ordered some tradfri bulbs to exchange others in my system, but trial and error will need a lot of time.

Smanar commented 3 years ago

Depend of the frequency you have the bug. If you have it all the time, just power off 50% of your bulb, wait for the zigbee network stabilise, and make tries. If you don't have issues, you have found a wy to explore, else, the problem is from the 50% resting or it s another problem.

Fruchuxs commented 3 years ago

I have a similar problem with my enocean friends of hue switch - but the switch is configured directly in deconz. With the hue bridge, the switch always reacts quickly.

In general some lights reacts slowly or stucks on / off or in a very low brightness (hue bulbs stucks on low light, deconz lightstrip controllers doesn't reacts). Will open a new ticket in the afternoon.

Maybe my problem is another one. I had a toggle on_off rule on alle Buttons configured in deconz. To switch the lights on, it always works, but since today, the toggle off is a problem. The Logs say:

08:28:00:335 GW firmware version: 0x26660700
08:28:00:808 0x001788010434AEAA error APSDE-DATA.confirm: 0xE1 on task
08:28:06:229 skip binding for attribute reporting of ep: 0x00 cluster 0x0001 (end-device might sleep)
08:28:07:389 GP srcId: 0x01715B1D cmd: 0x12 frameCounter: 6892
08:28:07:390 [INFO] - Button 4000 (null)
08:28:07:393 rule event /sensors/33/state/lastupdated: 0 -> 0
08:28:07:625 GP srcId: 0x01715B1D cmd: 0x16 frameCounter: 6893
08:28:07:625 [INFO] - Button 4002 (null)
08:28:07:628 rule event /sensors/33/state/lastupdated: 0 -> 0
08:28:07:629 trigger rule 7 - Rule TOOGLE_ON
08:28:08:075 0x00212EFFFF038983 error APSDE-DATA.confirm: 0xE1 on task
08:28:08:458 GP srcId: 0x01715B1D cmd: 0x12 frameCounter: 6894
08:28:08:459 [INFO] - Button 4000 (null)
08:28:08:462 rule event /sensors/33/state/lastupdated: 0 -> 0
08:28:08:670 GP srcId: 0x01715B1D cmd: 0x16 frameCounter: 6895
08:28:08:670 [INFO] - Button 4002 (null)
08:28:08:674 rule event /sensors/33/state/lastupdated: 0 -> 0
08:28:08:674 trigger rule 8 - Rule TOOGLE_OFF
08:28:08:836 0x0000000000000000 error APSDE-DATA.confirm: 0xE1 on task
08:28:09:528 Set sensor check interval to 1000 milliseconds
08:28:10:278 dev /dev/ttyAMA0
08:28:10:278 GW firmware version: 0x26660700
08:28:10:472 GP srcId: 0x01715B1D cmd: 0x13 frameCounter: 6896
08:28:10:473 [INFO] - Button 3000 (null)

But I have no idea which device is my switch. Maybe it's a good idea to show up the addresses in the deconz web gui.

Smanar commented 3 years ago

https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Zigbee-Error-Codes-in-the-Log

From the wiki

Indicates interference issues when displayed very often. It is issued when a message couldn't be send due the medium (air) is too busy, the radio uses a listen-before-talk mechanism before sending out any message.

capstan1 commented 3 years ago

Thanks! The only thing that's possible is interference from my wifi. The emptiest spot is the far right of the spectrum (25 for zigbee). However, I'm not able to change the channel. It always stops.

22:20:21:105 Websocket disconnected 127.0.0.1:35708, state: 0, close-code: 1000, reason: 22:20:23:453 channel is 15 but should be 25, start channel change 22:20:23:454 Skip automatic channel change, TODO warn user 22:20:33:453 channel is 15 but should be 25, start channel change 22:20:33:454 Skip automatic channel change, TODO warn user

Any idea?

Smanar commented 3 years ago

Lol, I ask tips for this bug every 15 days ^^, if you have the GUI is easy to solve it, just set correctly the channel and using join / quit. But using phoscon I don't know the magic trick, I m interested too , if someone have the perfect method ?

capstan1 commented 3 years ago

So, apparently changing the channel did something. All my lights are disconnected 😅 Weirdly I can see their state in deconz (wethere they are on or off) but they won't connect, even after restarts. image

So, gonna use the weekend to repair. As the battery operated devices are still connected, reconnecting only lights should be okay. Or do someone think I should just reset the Gateway and start from scratch? (Would add a lot more hassle, but if it helps 🤷‍♂️ )

Smanar commented 3 years ago

Haaoww, you have lost all routers but no end devices ? Strange, it s not the reverse that happen usually ? You have correctly following the actions in the good order to change the channel ? https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Network-lost-and-configuration-restore-does-not-help

Fruchuxs commented 3 years ago

https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Zigbee-Error-Codes-in-the-Log

From the wiki

Indicates interference issues when displayed very often. It is issued when a message couldn't be send due the medium (air) is too busy, the radio uses a listen-before-talk mechanism before sending out any message.

My Pi lies between some USB (Mouse, Keyboard) and Display Port Cables. Seems to interference zigbee. Moved my pi with the raspbee to another location. Now I just have "delay" entries.

capstan1 commented 3 years ago

Haaoww, you have lost all routers but no end devices ? Strange, it s not the reverse that happen usually ? You have correctly following the actions in the good order to change the channel ? https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Network-lost-and-configuration-restore-does-not-help

So this didn't help, so I tried to reset the raspbee (hit reset gateway, deinstalled deconz addon from HA). Weirdly sometimes it saw the old entities again, so I tried various times. Yesterday I joined most lights, the delay was still there and huge. Worse: during the afternoon all connection were lost again. What should I do? Gonna try to start over today agin with just some lights, but is there any other idea?

Btw: changed the sd-card to a nearly new one, which didn't help. Also memory and swap stays near 90-100%. Ordered a SSD which will arrive next week.

12:08:18:016 discovery server date: Sun, 29 Nov 2020 11:08:17 GMT 12:08:18:017 local time seems to be ok 12:08:18:019 discovery found version 2.04.35 for update channel stable 12:08:18:502 COM timout query bootloader, assume application 12:08:18:578 Device firmware version 0x26390500 RaspBee 12:08:18:648 unlocked max nodes: 200 12:08:18:849 Device protocol version: 0x010C 12:08:18:897 new node - ext: 0x00212effff024710, nwk: 0x0000 12:08:19:269 SensorNode 4 set node 0x0017880102d51c79 12:08:19:306 SensorNode 5 set node 0x0017880103c89f02 12:08:19:338 LightNode 7: Lightstrip added 12:08:19:353 LightNode 3: Esstisch links added 12:08:19:364 LightNode 4: Esstisch rechts added 12:08:19:374 LightNode 5: Sofa added 12:08:19:384 LightNode 6: Weltkarte added 12:08:19:392 LightNode 2: Deckenlampe Küche added 12:08:19:397 SensorNode 2 set node 0x00158d0002fd0084 12:08:19:408 SensorNode 3 set node 0x00158d0002fcfdfa 12:08:19:420 LightNode 8: Schlafzimmerlampe added 12:08:19:428 LightNode 9: Bad added 12:08:19:437 LightNode 10: Strip added 12:08:19:447 LightNode 11: Handyladekabel added 12:08:19:454 LightNode 12: Nachttisch added 12:08:19:460 SensorNode 6 set node 0x00158d0002b89af7 12:08:19:469 SensorNode 7 set node 0x00158d0002afc194 12:08:19:482 SensorNode 8 set node 0x000d6ffffe1c3b9e 12:08:19:585 CTRL [00:15:8d:00:02:6b:a5:36] not viable in zll.db, skip load cache (101) 12:08:19:813 invalid current channel 0 (TODO) 12:08:19:933 Current channel 11 12:08:19:978 CTRL ANT_CTRL 0x03 12:08:20:010 Power Configuration client cluster not present, append cluster 12:08:20:012 Poll Control client cluster not present, append cluster 12:08:20:079 Device protocol version: 0x010C 12:08:20:222 Current channel 11 12:08:20:266 CTRL ANT_CTRL 0x03 12:08:20:298 Power Configuration client cluster not present, append cluster 12:08:20:299 Poll Control client cluster not present, append cluster 12:08:23:011 LightNode 1: Configuration tool 1 added 12:08:23:015 DB UPDATE device_descriptors SET data = x'0104010500010300000a00190003010020000005', timestamp = 1606648103 WHERE device_id = (SELECT id FROM devices WHERE mac = '00:21:2e:ff:ff:02:47:10') AND endpoint = 1 AND type = 4 12:08:23:292 ZDP discovery done in 371 ms12:08:27:067 dev /dev/ttyAMA0 12:08:27:075 GW update firmware found: /usr/share/deCONZ/firmware/deCONZ_Rpi_0x26390500.bin.GCF 12:08:27:077 GW firmware version: 0x26390500 12:08:27:078 GW firmware version is up to date: 0x26390500 12:08:33:083 saved node state in 4 ms 12:08:33:516 sync() in 432 ms 12:08:56:031 Current channel 11 12:08:56:053 Device TTL 1809 s flags: 0x7

Smanar commented 3 years ago

Also memory and swap stays near 90-100%

Not normal, swoop is using deconz on a raspberry 0 (but he swap to headless when he don't need the GUI) When you take a look in log you haven't a line that appear more than 1 time by second ? (kind of spam)

Have you tried to power off (using circuit breaker, faster) some device ? to check if you still have the issue ?

capstan1 commented 3 years ago

So, an update now: 1) I repaired all devices to the raspbee 2) moved Home Assistant to a SSD

Sadly, it didn't really help.

Here are some logs:

06:10:26:569 Current channel 11 06:10:26:591 Device TTL 1141 s flags: 0x7 PROTO: CRC error 06:10:28:023 binding for cluster 0x0006 of 0x00158D000327DF24 exists (verified by reporting) 06:10:28:026 skip configure report for cluster: 0x0006 attr: 0x0000 of node 0x00158D000327DF24 (seems to be active) 06:10:28:029 binding for cluster 0x0008 of 0x00158D000327DF24 exists (verified by reporting) 06:10:28:031 skip configure report for cluster: 0x0008 attr: 0x0000 of node 0x00158D000327DF24 (seems to be active) 06:10:28:034 binding for cluster 0x0300 of 0x00158D000327DF24 exists (verified by reporting) 06:10:28:035 skip configure report for cluster: 0x0300 attr: 0x0007 of node 0x00158D000327DF24 (seems to be active) 06:10:28:036 skip configure report for cluster: 0x0300 attr: 0x0003 of node 0x00158D000327DF24 (wait reading or unsupported) 06:10:28:037 skip configure report for cluster: 0x0300 attr: 0x0004 of node 0x00158D000327DF24 (wait reading or unsupported) 06:10:28:412 ZCL configure reporting rsp seq: 78 0x00158D000327DF24 for ep: 0x01 cluster: 0x0300 attr: 0x0008 status: 0xC1 06:10:28:465 verified group capacity: 6 and group count: 2 of LightNode 0x00158d000327df24

PROTO: CRC error 10:02:09:846 device state timeout ignored in state 3 10:02:11:147 0x00178801081C4ACD error APSDE-DATA.confirm: 0xA7 on task 10:02:25:893 Current channel 11 10:02:25:919 Device TTL 2400 s flags: 0x7 PROTO: CRC error 10:02:39:847 device state timeout ignored in state 3 10:02:44:347 binding for cluster 0x0000 of 0x000B57FFFE250728 exists (verified by reporting) 10:02:44:349 binding for cluster 0x0006 of 0x000B57FFFE250728 exists (verified by reporting) 10:02:44:352 skip configure report for cluster: 0x0006 attr: 0x0000 of node 0x000B57FFFE250728 (seems to be active) 10:02:44:355 binding for cluster 0x0008 of 0x000B57FFFE250728 exists (verified by reporting) 10:02:44:358 skip configure report for cluster: 0x0008 attr: 0x0000 of node 0x000B57FFFE250728 (seems to be active) PROTO: CRC error 10:02:44:856 verified group capacity: 255 and group count: 2 of LightNode 0x000b57fffe250728

08:02:50:961 Device TTL 931 s flags: 0x7 08:03:12:967 dev /dev/ttyAMA0 08:03:12:970 GW firmware version: 0x26390500 08:03:12:971 GW firmware version is up to date: 0x26390500 08:03:15:370 ZCL attribute report 0x086BD7FFFE0008EC for cluster: 0x0006, ep: 0x01, frame control: 0x08, mfcode: 0x0000 08:03:15:556 0x00178801030C3CD6 error APSDE-DATA.confirm: 0xE9 on task 08:03:16:348 0x7CB03EAA00ACDDC4 error APSDE-DATA.confirm: 0xE9 on task 08:03:16:819 ZCL attribute report 0x00158D000327DF24 for cluster: 0x0008, ep: 0x01, frame control: 0x08, mfcode: 0x0000 08:03:50:932 Current channel 11 08:03:50:954 Device TTL 871 s flags: 0x7 08:03:54:647 0x00178801030C3CD6 error APSDE-DATA.confirm: 0xE9 on task 08:04:50:932 Current channel 11 08:04:50:954 Device TTL 811 s flags: 0x7 08:04:57:014 [INFO] - No button handler for: lumi.remote.b186acn01 endpoint: 0x01 cluster: BASIC (0x0000) command: 0x0A payload[0]: 001 08:04:57:015 ZCL attribute report 0x00158D0002B8113D for cluster: 0x0000, ep: 0x01, frame control: 0x1C, mfcode: 0x115F

So this PROTO: CRC error is new. Didn't find much, googling it. Anyways: do you think I should exchange the ikea bulbs fro tradfri ones, as those are the ones disconnecting. It's working faster right now (with both disconnected) but it's a period too short to be sure.

Mimiix commented 3 years ago

08:03:54:647 0x00178801030C3CD6 error APSDE-DATA.confirm: 0xE9 on task 08:03:15:556 0x00178801030C3CD6 error APSDE-DATA.confirm: 0xE9 on task 08:03:16:348 0x7CB03EAA00ACDDC4 error APSDE-DATA.confirm: 0xE9 on task

These are the ones i'm interested in.

https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Zigbee-Error-Codes-in-the-Log

capstan1 commented 3 years ago

one of those is the bulb that disconnected various times (Hue), the other I can't identify (maybe its one of the Osram sockets).

I tried to listen to deconz_events in HA when pushing a button and it's mixed. From 10 pushes, 1-2 get delayed. Sometimes the event arrives late in HA, sometimes it arrives fast, but the lights needs time to react.

stale[bot] commented 3 years ago

As there hasn't been any response in 21 days, this issue has been automatically marked as stale. At OP: Please either close this issue or keep it active It will be closed in 7 days if no further activity occurs.

capstan1 commented 3 years ago

As an addition to this thread: what helped me at the end was moving the Pi as far as possible away from my router. Now, everything works like a charm! I just don't get, why this didn't affect me from the beginning. But hey, it works now!