dresden-elektronik / deconz-rest-plugin

deCONZ REST-API plugin to control ZigBee devices
BSD 3-Clause "New" or "Revised" License

Device turn on alone #5626

Closed LeoeLeoeL closed 2 years ago

LeoeLeoeL commented 2 years ago

Describe the bug

There are some devices (bulbs and plugs) that perform a power reset and then turn on or off according to their Power on/off configuration.

Steps to reproduce the behavior

Expected behavior

Normal behaviour of devices, without "uncontrolled" turning on/off.

Screenshots

Environment

deCONZ Logs

[18.55.12.txt](https://github.com/dresden-elektronik/deconz-rest-plugin/files/7796615/18.55.12.txt) [13.28.txt](https://github.com/dresden-elektronik/deconz-rest-plugin/files/7796616/13.28.txt)

Additional context

The time in the title of each log is when the issue appeared.

Devices involved are:

- Osram Undercabinet TW Z3
- iCasa ICZB-IW11D
- Lidl smartplugs
- Lidl bulbs

I don't know if something triggers it, but all the devices involved reset almost simultaneously. It's not the 230V line in my house, because Lidl bulbs linked to the HUE bridge don't suffer from this issue.

LeoeLeoeL commented 2 years ago

One more thing: when bulbs turn on after the reset, Deconz/Phoscon still reports them as off.

Mimiix commented 2 years ago

Hi,

First off: the best wishes to you for this year :)

I've edited the title as it is a bit subjective. Just like to keep it on point.

What log levels are involved here?

Kind regards,

Mimiix commented 2 years ago

Also, can you please add: A list of the nodes + their Mac ID and provide me a screenshot of your deCONZ gui.

First thing i notice is that the vimar seems to keep rebooting:

13:24:51:300 DeviceAnnce of LightNode: 0xf4ce366bab1b414f Permit Join: 0
13:24:51:313 Websocket 192.168.1.111:35508 send message: {"attr":{"id":"44","lastannounced":"2021-12-26T12:24:51Z","lastseen":"2021-12-26T12:24Z","manufacturername":"Vimar","modelid":"Window_Cov_v1.0","name":"Finestra Camino","swversion":null,"type":"Window covering device","uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"},"e":"changed","id":"44","r":"lights","t":"event","uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"} (ret = 348)
13:24:51:315 Websocket 192.168.1.21:57362 send message: {"attr":{"id":"44","lastannounced":"2021-12-26T12:24:51Z","lastseen":"2021-12-26T12:24Z","manufacturername":"Vimar","modelid":"Window_Cov_v1.0","name":"Finestra Camino","swversion":null,"type":"Window covering device","uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"},"e":"changed","id":"44","r":"lights","t":"event","uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"} (ret = 348)
13:24:51:318 Websocket 192.168.1.111:35508 send message: {"e":"changed","id":"44","r":"lights","state":{"bri":254,"lift":100,"on":true,"open":false,"reachable":true},"t":"event","uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"} (ret = 161)
13:24:51:320 Websocket 192.168.1.21:57362 send message: {"e":"changed","id":"44","r":"lights","state":{"bri":254,"lift":100,"on":true,"open":false,"reachable":true},"t":"event","uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"} (ret = 161)
13:24:51:406 0xF4CE366BAB1B414F error APSDE-DATA.confirm: 0xD0 on task
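
The websocket messages in this excerpt carry a `lastannounced` attribute that changes whenever a device re-announces itself, so a client could watch for repeated reboots without reading the debug log. A minimal sketch, assuming each event is available as a JSON string; the function name `announce_changes` is made up for illustration, and the sample payloads are trimmed copies of the messages quoted above:

```python
import json


def announce_changes(messages):
    """Return (uniqueid, lastannounced) pairs, one per new announce timestamp."""
    last_announced = {}
    changes = []
    for raw in messages:
        event = json.loads(raw)
        uniqueid = event.get("uniqueid")
        # "attr" is absent in state-only events, so fall back to an empty dict.
        announced = (event.get("attr") or {}).get("lastannounced")
        if uniqueid is None or announced is None:
            continue
        if last_announced.get(uniqueid) != announced:
            last_announced[uniqueid] = announced
            changes.append((uniqueid, announced))
    return changes


# Trimmed copies of the payloads in the log excerpt (some attr fields omitted).
ATTR_EVENT = (
    '{"attr":{"id":"44","lastannounced":"2021-12-26T12:24:51Z",'
    '"name":"Finestra Camino"},"e":"changed","id":"44","r":"lights",'
    '"t":"event","uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"}'
)
STATE_EVENT = (
    '{"e":"changed","id":"44","r":"lights",'
    '"state":{"on":true,"reachable":true},"t":"event",'
    '"uniqueid":"f4:ce:36:6b:ab:1b:41:4f-0a"}'
)

if __name__ == "__main__":
    # The same attr event is delivered to two websocket clients in the log,
    # so feeding it twice should still yield a single change.
    print(announce_changes([ATTR_EVENT, ATTR_EVENT, STATE_EVENT]))
```

A device that shows up in this list several times within a short period would be a candidate for the "keeps rebooting" pattern.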

LeoeLeoeL commented 2 years ago

Not so subjective. Devices that activate on their own; think if somebody has a device to close/open the door to enter at home. Logs involved are INFO and INFO 2. The issue seems to increase slowly. It began with 2 devices (EKET BARS and SALA CINEMA), then the 3 DISIMPEGNO-1, then the plugs (AMPLI, SUBWOOFER and MOGE CAM) and now (after my first post) the Sonoff ZBMini called VIRTUAL SCALA SOPRA. I have now moved temporarily to Windows. I imported the backup and I'm monitoring whether the issue still happens. But for that I had to move the ConBee II to another location in the house (1 floor below), and in that position the coordinator lost 6 devices, so I don't know if this test will be reliable.

Mimiix commented 2 years ago

Not so subjective. Devices that activate on their own;

I have yet to see a bug like this. Nevertheless, I'm happy to explore it as a bug. But once we can determine it's setup related, I have to close it as it doesn't comply with #5113.

Can you please provide me logs with: INFO, INFO L2, ERROR , ERROR L2, APS, APS L2 and ZCL?

In addition, can you provide me the things requested earlier: a list of all nodes + MACs, and a screenshot of the deCONZ GUI.

manup commented 2 years ago

I'm not sure for Lidl devices, but with OSRAM/Ledvance we've already seen that at times they reboot on their own for unknown reasons. In that case you would see "Device Announce" messages in the logs.
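
To check a saved log for such reboots, the announce lines can be counted per device. A rough sketch, assuming the log has one entry per line in the `DeviceAnnce of LightNode: 0x...` format quoted earlier in this thread; the function name and the third sample line are made up for illustration:

```python
import re
from collections import Counter

# Pattern based on the "DeviceAnnce of LightNode" line quoted in this thread.
ANNOUNCE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}:\d{3}) DeviceAnnce of LightNode: "
    r"(?P<mac>0x[0-9a-fA-F]{16})"
)


def count_device_announces(lines):
    """Return a Counter mapping MAC address (lowercase) to announce count."""
    counts = Counter()
    for line in lines:
        match = ANNOUNCE_RE.match(line.strip())
        if match:
            counts[match.group("mac").lower()] += 1
    return counts


sample = [
    # First two lines are taken from the log excerpt in this thread.
    "13:24:51:300 DeviceAnnce of LightNode: 0xf4ce366bab1b414f Permit Join: 0",
    "13:24:51:406 0xF4CE366BAB1B414F error APSDE-DATA.confirm: 0xD0 on task",
    # Hypothetical later announce, added only to show the counting.
    "13:29:02:118 DeviceAnnce of LightNode: 0xf4ce366bab1b414f Permit Join: 0",
]

if __name__ == "__main__":
    for mac, count in count_device_announces(sample).most_common():
        print(mac, count)
```

For a real log file, the lines of the file can be passed in place of `sample`; devices with many announces in a short log are the ones worth a closer look.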

Mimiix commented 2 years ago

I'm not sure for Lidl devices, but with OSRAM/Ledvance we've already seen that at times they reboot on their own for unknown reasons. In that case you would see "Device Announce" messages in the logs.

I did see some of these. But I want to be sure. If that's the case, and the other "affected" devices are routed through these, I am closing the issue and forwarding the user to the forums.

LeoeLeoeL commented 2 years ago

This is the screen: image

Catching the log is not so easy because the "debug view" covers only 7 minutes. I have to be "so lucky" to see the reset while it is happening. I hope to catch it asap.

Mimiix commented 2 years ago

This is the screen: image

Catching the log is not so easy because the "debug view" covers 7 minutes only. I have to be "so lucky" to see the reset while is performing. I hope asap.

We still need the list of nodes and the ones that you say are affected.

LeoeLeoeL commented 2 years ago

I added again the Osram lamp (Eket Bars) image

image image

Nodes involved are:

- Eket Bars
- Luce Sala Cinema
- Luce Disimpegno C2 and C3 (C1 is under Hue Bridge now)
- Ampli
- Subwoofer
- Luce Moge C
- Moge Cam

Maybe Luce Cucina 1 and 2 too, but it's difficult to say because they are behind a physical switch: when they are OFF they can't turn on alone, and when they are on they stay on after a reset due to their configuration. But sometimes I see a very, very fast dimming and brightening.

Mimiix commented 2 years ago

It's really hard to see the nodes and how they are connected. Can you please make it easier to see connections between each of them?

LeoeLeoeL commented 2 years ago

I think everything is connected to the coordinator. For instance, Moge Cam and Luce Moge C are located at "floor +1" and do not have any direct connection to Eket Bars, Ampli etc. that are located at "floor -1".

image

LeoeLeoeL commented 2 years ago

19.05.txt

Event happened between 19.04 and 19.05. I think Eket Bars turned on few seconds before Luce Disimpegno-1 C2 and C3. Luce Living C and Luce Moge C turned on too.

LeoeLeoeL commented 2 years ago

Home Assistant reports 19.05.33 for all the devices above. Ampli and Subwoofer too.

manup commented 2 years ago

The only requests towards the "Eket Bars" are neighbor queries; since reporting is intact, the node's attributes aren't queried. The log otherwise has no indication of Device Announce messages.

What is noticeable here is that the overall mesh signal strength seems to be quite weak. It might be worth checking that the Zigbee channel doesn't interfere with WiFi / Bluetooth (streaming) nearby; the WiFi can be checked with a phone WiFi scanner app.

Mimiix commented 2 years ago

@manup the earlier logs did have device announcements

manup commented 2 years ago

Ahh ok, then it seems similar to the issue we saw in the other setup.

LeoeLeoeL commented 2 years ago

Channel is 25, it should be outside wifi. I don't know how it is possible to get a stronger signal. The only green connections are among devices that "touch" each other.

manup commented 2 years ago

To get a better picture I'd suggest setting the display LQI filter higher, e.g. to 100: "Panels > Source Routing > Minimum LQI Display". This will hide all links weaker than 100 LQI.

Channel is 25, It should be outside wifi.

Depends, for example WiFi channel 13 still overlaps. But it's best to look in a WiFi sniffer app on your phone to get a better picture, WiFi might also come from a neighbor :)

image
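
The overlap between Zigbee channel 25 and WiFi channel 13 follows from the channel center frequencies in the 2.4 GHz band. A small sketch of that arithmetic, assuming nominal widths of about 22 MHz for a WiFi channel and about 2 MHz for a Zigbee channel; real signals spill beyond these nominal edges, and 40 MHz WiFi is wider, so treat the result only as a first check:

```python
def zigbee_center_mhz(channel):
    """Center frequency of a 2.4 GHz Zigbee (IEEE 802.15.4) channel, 11..26."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11..26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Center frequency of a 2.4 GHz WiFi channel, 1..13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers WiFi channels 1..13 only")
    return 2412 + 5 * (channel - 1)


def overlaps(zigbee_channel, wifi_channel, wifi_width_mhz=22.0, zigbee_width_mhz=2.0):
    """True if the nominal channel bands overlap (widths are assumptions)."""
    distance = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return distance < (wifi_width_mhz + zigbee_width_mhz) / 2


if __name__ == "__main__":
    for wifi in (1, 6, 11, 13):
        print("Zigbee 25 vs WiFi", wifi, "->", overlaps(25, wifi))
```

With these assumptions, Zigbee 25 (2475 MHz) overlaps WiFi 13 (2472 MHz) but not WiFi 1, 6 or 11. A neighbor's access point on a high channel would therefore still matter.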

LeoeLeoeL commented 2 years ago

I'll try to set channels 1 and 6. But neighbours... My Hue Bridge has channel 20 and works flawlessly.

I have another log: 21.54.txt

This time only 3 devices are involved, because the others were off (with poweron/off OFF) or on (with poweron/off ON). EKET BARS turned on alone, and after 5-10 seconds Luce Disimpegno C2 and C3 followed.

Correction; another lamp turned on. It was Luce Living C

LeoeLeoeL commented 2 years ago

A log with many events, involving all the devices previously mentioned in this thread: Many events.txt

Eket Bars is always the first, but I don't think it's the cause because the issue also happens when I unpower this lamp.

manup commented 2 years ago

From the logs there are several lights restarting, with MAC prefixes belonging to Ledvance/Osram and Nordic (here Vimar). I don't see any hint that this could be caused by the coordinator. I checked the ASDU payloads to verify that the proper ZCL Default Responses are returned for ZCL reports, and this looks fine.

The logs show that after each reboot the devices change their NWK addresses. Perhaps there is a bug in the lights' firmware that makes them think there is an NWK address conflict; there was actually a case of this a few years ago, but that was a different vendor.

Running out of ideas here, I think it's a problem which needs to be fixed in the lights firmware.

LeoeLeoeL commented 2 years ago

Fixes to so many different brands? We are speaking about IKEA, Lidl (Tuya), iCasa and Osram. I can't say about Hue because these lamps are connected to hue bridge.

As soon as I can, when I catch it, I'll send an event triggered by Ikea bulbs when physically switched on. Maybe it could clarify better.

P.S. Which is the Vimar device involved? Finestra Dispensa?

LeoeLeoeL commented 2 years ago

Is it the same? https://forum.phoscon.de/t/lights-turn-on-randomly/1299

mabe1983 commented 2 years ago

I can confirm the same behavior with Ikea GU10 bulbs. I have this issue since I've upgraded Deconz to the latest version. In previous versions I never noticed any issues like this.

WhimsySpoon commented 2 years ago

I had a light group activate at 2am on the latest version and firmware on conbee ii

LeoeLeoeL commented 2 years ago

Sometimes the issue is "triggered" by a physical switch-on of an IKEA lamp (after the switch-on I can see the lamp dim and brighten quickly once, and then other devices start to switch); sometimes, IMHO, Deconz activates the devices. If I restart Deconz the problem goes away for a while.

LeoeLeoeL commented 2 years ago

This bug continues everyday many times per day. Again!.txt Is it so impossible to understand where is the problem? Because Deconz has a problem. Do I have to buy a second Hue bridge to solve the problem myself? I hope no. Thanks in advance.

WhimsySpoon commented 2 years ago

This is continuing to happen for me too; usually overnight, although I had one earlier this morning. Unfortunately I wasn't able to grab the logs in time. So far it's happened to a light group, plus an individual light.

The light group contains a mix of Ikea and Innr lights. The individual light was an Innr flexlight.

GitHAMG commented 2 years ago

Hello, I have the same issue. I am running deconz inside Home Assistant, don't know if that is important or has something to do with it. My impression is that the issues started with the December? update of the deconz plugin (update of deconz to 2.13.04 and firmware update of ConBeeII). Since then I have the same issue that lights suddenly turn on without any user activity (often during the night). We also have zigbee remote switches which worked flawlessly for months, and now since the update the connected lights switch on/off only after some random delay or not at all. I also have the impression that the connection between controller and devices is worse now (at least the display in deconz is far worse, far fewer "green" connections, which seems to be the same observation as in the other comment above). My wife gets really unhappy that the switches no longer work reliably and that some lights switch on without interaction; it is bad if there is light during evening/night in a room where the blinds are still open. If there is no fix, I fear I have to switch to another gateway to not destroy WAF fully. Thanks a lot to the whole team and thanks in advance for looking at the issue!

WhimsySpoon commented 2 years ago

I too am running deconz as an HA addon.

WhimsySpoon commented 2 years ago

Just happened again. Different light this time, again an Innr one. Deconz wasn't initially showing the status change to on.

manup commented 2 years ago

It's really hard to tell what exactly is going on, but if the lights reboot on their own this is a firmware issue of the lights. In contrast, to my knowledge Hue lights don't have this problem at all. The Ikea lights used to have a bug where they completely stopped responding after receiving a certain number of Parent Announce messages, which can be sent by any router in the mesh. The error happens earlier the more end-devices there are in the network, since end-devices may switch between routers and thereby cause these messages.

I'm not sure if this is still the case with recent Ikea firmware, or if they have added a watchdog to detect this and reboot the light, which isn't too bad as long as the previous state is preserved. In that regard it might be worth to try this mode in the On/Off Cluster:

image

(double click on the attribute to change the configuration)

The commands we're sending to the lights haven't changed in a long time, it's basically querying the descriptors, neighbor tables, control and attribute query commands, nothing which should bring a light down.

As shown in the above screenshot https://github.com/dresden-elektronik/deconz-rest-plugin/issues/5626#issuecomment-1004215670, the mesh has an overall weak signal range (more red colors). This can have various reasons, such as WiFi interference, streaming audio/video, or simply the lights being too far apart.

Note that in recent deCONZ versions the coloring of the node links has changed to better show weak spots, this is only a visual change.

Here is how the network should look normally under healthy RF conditions (more green):

image

The coordinator or deCONZ can't influence this as this purely depends on the RF interference and distance between nodes.

You may also try to enable source routing under "Panels > Source Routing" but I'm afraid the RF problems should be addressed first to let it work properly, otherwise the routes would need to use quite low minimum LQI values.

LeoeLeoeL commented 2 years ago

My Deconz has suffered from this bug for a long time. Ikea bulbs with new firmware continue to reset. Osram smartplugs (ZLL) and bulbs (HA) reset too. Lidl bulbs and smartplugs reset. I can agree with you about firmware problems, but I have 8 (eight) Lidl bulbs connected to a HUE bridge and the problem never happened to them. So, if devices connected to ConbeeII reset while devices connected to HUE don't, there is a bug in Deconz. Yesterday evening resets happened many times within a few seconds. So, I installed the latest Deconz available on phoscon.de on a new SD card and restored a backup. I didn't have any reset until today at 9 o'clock AM.

mabe1983 commented 2 years ago

Update: Three days ago I updated the firmware of all Ikea lamps to the latest version and switched on the source routing mode - unfortunately the problem still exists. The problem now occurs with almost all Ikea lights ...and it's only happening since I've upgraded Deconz to version 2.13.04. Before this upgrade everything worked well for more than one year.

WhimsySpoon commented 2 years ago

I only have one Ikea light in my setup and it's unaffected. The lights that keep turning on are Innr GU10s and Innr Flex Lights. I tried changing the attribute as @manup suggests, but it's readonly on these devices.

I've owned the affected Innr lights for +3 years and this has only started happening recently.

Some things to note:

LeoeLeoeL commented 2 years ago

"PREVIOUS" could be a mitigation for lamps if there is only a reset but if you have many resets in seconds the lamps "flash". "PREVIOUS" could be an useless mitigation for smartplugs; For instance, I have a videocam connected and, when the smartplug resets, it reboots,

Regarding interactions between Wifi and Zigbee: Conbee: CH 25, HUE: CH 20, Wifi: CH 1. I don't think there are overlaps.

manup commented 2 years ago

So, if devices connected to ConbeeII reset while devices connected to HUE don't, there is a bug in Deconz.

Unfortunately it's not that simple. As mentioned above deCONZ doesn't send any commands which could reset/reboot a light. It's all just standard ZCL commands. If the light firmware is robust it must work regardless at what commands are thrown at it, even unsupported or bogus commands – we are only sending standard commands...

If the light decides to reboot it could be an internal watchdog which catches a bug like hanging firmware, or perhaps this is done since the light thinks network doesn't work properly.

For a realistic comparison with the Hue bridge you'd need to connect all your lights and sensors of mixed brands, which are currently connected to deCONZ, to the Hue bridge. We have already seen that battery powered end-devices can bring down Ikea lights when they change parents a lot, due to the Parent Announce messages. This can only be fixed by Ikea.

In another issue we have seen in the sniffer that Osram lights rebooted and changed their NWK address, which they shouldn't (this can also be seen in the logs). This caused sensors to reconnect to a different parent, and after enough Parent Announce messages from various lights, the Ikea lights went downhill.

I'm not 100% sure if Ikea, Osram and Innr all use the same MCU vendor (Silabs), but I feel that there are bugs in those Zigbee stacks which aren't addressed yet. Philips also uses Silabs, but they tend to test the hell out of their devices and fix bugs in vendor stacks.

To match the Hue bridge exactly we could:

None of this would be pretty, and I'm afraid it would not fix the problem. Since these reboots only happen in some networks I think the problem is escalated by a mix of certain devices, but that's just a theory.

Here is a similar issue with a different USB dongle on zigbee2mqtt, which overall sends the same commands as deCONZ to the lights: https://github.com/Koenkk/zigbee2mqtt/issues/4878

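To mimic the Hue bridge you could try: (perhaps over night as a test)

- Disable neighbor table queries: click on the "CRE" button and uncheck "Routers and Coordinator"
- Disable all ZCL requests to a light: "Plugins > REST API Plugin > Uncheck Plugin Active"

(note at this point no control commands or queries would be done, only commands received from devices)

If the lights still reboot, it would indicate it's not related to our commands, and vice versa.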

LeoeLeoeL commented 2 years ago

Some time ago I opened another issue when Sonoff ZBMini (1.0?) went into "pairing mode" on their own. It could be the same problem. To use them I had to migrate them to the HUE Bridge. The strange thing is that the ZBMini 2.0 doesn't suffer from that issue. It might be useful to know the differences between the two releases. Unfortunately, it's not possible to update ZBMinis with older firmware; the Sonoff bridge doesn't have that option.

It's impossible to "replicate" all the mesh over HUE because:

1) HUE manages up to 50 devices
2) HUE doesn't manage NON HUE sensors and switches.

You say Deconz doesn't reset our devices but we have evidence of that. I understand the purpose of Deconz is to work with all the zigbee devices available in the world (while HUE, AQARA, SONOFF and TUYA can afford to take care of their own products only), but not at the cost of reliability.

manup commented 2 years ago

It's impossible to "replicate" all the mesh over HUE because:

1) HUE manages up to 50 devices
2) HUE doesn't manage NON HUE sensors and switches.

Indeed, the limit can't be worked around but it is possible to join ZLL/ZB3.0 Zigbee devices to the Hue bridge during light search, sensors and switches won't show up in the app but they are in the network.

You say Deconz doesn't reset our devices but we have evidence of that.

No, what I tried to say is that we only send standard commands, none of this should bring a Zigbee device down, a reboot command doesn't exist.

Oversimplified example, if we send an On command to a light, and it reboots/crashes, one could say it's a bug in deCONZ, I'd argue that the light firmware needs to be fixed.

This may sound silly, but here is an actual Ikea example from the past: when a control command was sent while the light was dimming, the light crashed. For Ikea there are already a few workarounds implemented to mitigate light firmware bugs, but this isn't always possible. For example, recently Ikea broke the groupcast feature in their remotes with a firmware update and there is nothing we can do about it :/

LeoeLeoeL commented 2 years ago

Indeed, the limit can't be worked around but it is possible to join ZLL/ZB3.0 Zigbee devices to the Hue bridge during light search, sensors and switches won't show up in the app but they are in the network.

I don't know if it is really possible. I read about touchlink for that (but works for few devices). Anyway, if not present in the app, It would be not easy to unpair the devices in the Hue bridge later.

To mimic the Hue bridge you could try: (perhaps over night as a test)

- Disable neighbor table queries: click on the "CRE" button and uncheck "Routers and Coordinator"
- Disable all ZCL requests to a light: "Plugins > REST API Plugin > Uncheck Plugin Active"

(note at this point no control commands or queries would be done, only commands received from devices)

Does it mean I would lose the % of a window covering device?

GitHAMG commented 2 years ago

Hello all, I have multiple devices being affected by sudden switch on on their own: Osram Tibea, outdoor lantern, outdoor Flex RGBW LED, E14 bulb, E27 bulb, and also a Paulmann 93999 (Zigbee Controller). They all worked fine - at least felt - until December update of deconz in Home Assistant. I did a test now, I did a screenshot of my zigbee network in deconz inside Home Assistant (see picture 1). And then I used the same ConbeeII device and plugged it into a Windows PC (50 cm next to my Home Assistant PC), imported a backup of my Home Assistant deconz config, and then also did a screenshot (see picture 2). It looks way different I would say. The deconz version of Windows PC is 2.12.3. deconz01 windows02

What is also weird: the device count is different. In HA deconz says 32 nodes, in Windows it says 45 nodes.

manup commented 2 years ago

I don't know if it is really possible. I read about touchlink for that (but works for few devices). Anyway, if not present in the app, It would be not easy to unpair the devices in the Hue bridge later.

No Touchlink needed here, for ZLL/ZB 3.0 devices the normal joining should work, when starting light search in the Hue app.

Does it mean I would lose the % of a window covering device?

Devices won't get lost they are just not controllable during the test (same as when deCONZ is closed).

LeoeLeoeL commented 2 years ago

@GitHAMG: From 2.13 the visualization of nodes changed. We lost many green lines. :-( I asked to have the possibility to choose between both visualizations but...

Devices won't get lost they are just not controllable during the test (same as when deCONZ is closed).

I'm speaking about the possibility to read the % open/close in the cluster.

manup commented 2 years ago

Hello all, I have multiple devices being affected by sudden switch on on their own: Osram Tibea, outdoor lantern, outdoor Flex RGBW LED, E14 bulb, E27 bulb, and also a Paulmann 93999 (Zigbee Controller). They all worked fine - at least felt - until December update of deconz in Home Assistant. I did a test now, I did a screenshot of my zigbee network in deconz inside Home Assistant (see picture 1). And then I used the same ConbeeII device and plugged it into a Windows PC (50 cm next to my Home Assistant PC), imported a backup of my Home Assistant deconz config, and then also did a screenshot (see picture 2). It looks way different I would say. The deconz version of Windows PC is 2.12.3. deconz01 windows02

That's the visual change mentioned above: in recent versions deCONZ shows more realistic colors for links, based on the minimum LQI value. A link always has two LQI values: 1) how A sees B, 2) how B sees A. You can click on the LQI button to inspect the numeric values. Since these links were more green in earlier versions, I'd suspect there is one higher value and one very weak LQI value, like 60/200.

Important: this is a pure visual change to better show weak spots in the mesh, functionality isn't affected.

Good values are around 200 and above; anything lower than 170 should be considered problematic imho. Reasons for low values are usually: interference such as WiFi/Bluetooth, simply too large a distance between devices, or metal housings of lights... everything that hinders the signal from propagating. The distance issue can be mitigated by putting routers between weak spots.
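
As a rough illustration of that rule of thumb, a link can be rated by the weaker of its two LQI values. This is only a sketch: the labels and the function name are made up here, the thresholds are the 200 and 170 mentioned above, and it is not the actual coloring logic in deCONZ:

```python
def classify_link(lqi_a_sees_b, lqi_b_sees_a, good=200, problematic_below=170):
    """Rate a link by its weaker direction; returns (weakest_lqi, label)."""
    for value in (lqi_a_sees_b, lqi_b_sees_a):
        if not 0 <= value <= 255:
            raise ValueError("LQI is expected in the range 0..255")
    weakest = min(lqi_a_sees_b, lqi_b_sees_a)
    if weakest >= good:
        label = "good"
    elif weakest >= problematic_below:
        label = "borderline"  # hypothetical label for the 170..199 range
    else:
        label = "problematic"
    return weakest, label


if __name__ == "__main__":
    # The 60/200 case: one direction is fine, the other is very weak.
    print(classify_link(60, 200))
    print(classify_link(210, 230))
```

Rating by the minimum explains why a link that looks fine in one direction can still show up as weak.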

Mimiix commented 2 years ago

I asked to have the possibility to choose between both visualization but...

That makes no sense and you're just fooling yourself. Just the threshold is changed. Nothing else.

manup commented 2 years ago

@GitHAMG: From 2.13 the visualization of nodes changed. We lost many green lines. :-( I asked to have the possibility to choose between both visualizations but...

Devices won't get lost they are just not controllable during the test (same as when deCONZ is closed).

I'm speaking about the possibility to read the % open/close in the cluster.

While the plugin is disabled the values won't be updated in the REST-API but incoming attribute reports are shown in deCONZ Cluster Info panel.

LeoeLeoeL commented 2 years ago

I asked to have the possibility to choose between both visualization but...

That makes no sense and your just fooling yourself.. Just the threshold is changed. Nothing else

Sometimes there are nodes without lines. With the "alternate" view I could see if connections exist without entering in Source Routing and set 1

Mimiix commented 2 years ago

@manup just to get this thing sorted as this is getting derailed: is there a bug here?

LeoeLeoeL commented 2 years ago

You are right. Let's refocus again on the real problem.