dresden-elektronik / deconz-rest-plugin

deCONZ REST-API plugin to control ZigBee devices
BSD 3-Clause "New" or "Revised" License
1.88k stars 485 forks source link

deCONZ losing lights? #33

Closed ebaauw closed 6 years ago

ebaauw commented 7 years ago

When I keep deCONZ running overnight, it seems to "lose" lights. These "lost" lights show up red in the deCONZ GUI. Executing a command or reading the attributes from the Cluster Info panel fails. In the REST API, the light shows state.reachable as false. Changing the light state through the REST API returns error 3 (resource not available).

The only remedy to this situation seems to exit and restart deCONZ. When I disconnect deCONZ from the RaspBee, I tries to reconnect but crashes while discovering the lights. I do get core dumps for these crashes (30MB, still 5MB when gzipped).

After restarting deCONZ, the REST API only shows resources for the lights discovered in this session. It does immediately show all the sensor resources for the devices discovered in previous sessions.

I still have the RaspBee configured as router on the Hue bridge network. The lights that deCONZ has "lost", can be accessed from the Hue bridge.
My ZIgBee network contains 58 nodes: the RaspBee, the Hue bridge, 39 lights (mostly Philips Hue, some Osram, some innr, and one Ikea), 10 Hue motion sensors, 6 Hue dimmers, and 1 Ikea remote. Typically, the innr lights are the first to get "lost", but ultimately, this doesn't seem related to the light brand nor type.

As an aside: the Hue bridge regularly shows some random lights with state.reachable as false. I suspect this happens when the bridge doesn't receive an answer when polling the light. Typically the light state can be updated through the Hue bridge anyway, and state.reachable becomes true again in a couple of minutes, when the bridge receives an answer, the next time it polls the light.

ebaauw commented 7 years ago

Yesterday, I did a complete re-install from scratch of my Raspberry Pi, installed deCONZ v2.04_40 (the REST API plugin was already there, I didn't even install the compiled library) and configured the RaspBee as coordinator for a new ZigBee network. I only connected only two lights (three now with the D1): an innr RB162 and the Trådfri 980 lm, two Hue motion sensors, one Hue dimmer switch and the Trådfri remote.

Not sure if it's the smaller network (I've 56 nodes on the Hue bridge network), the RaspBee as coordinator, or interference by the Hue bridge, but I haven't "lost" any lights yet. Also, the sensors and switches seem more responsive when retrieving attributes through the Cluster Info panel in the deCONZ GUI.

I was able to create bindings from the motion sensor On/Off and Level Control clusters to the lights, they even showed as black lines in the deCONZ GUI, but are lost after a restart. Even before the restart, the lights don't turn on when the sensors detects motion, though.

ebaauw commented 7 years ago

I've been running the RaspBee as coordinator for over a month and the problem hasn't appeared.

ebaauw commented 7 years ago

I moved my entire home automation setup from the Hue bridge to deCONZ v2.04.57. In total 61 ZigBee nodes: 39 lights, 12 Hue motion sensors, 7 Hue dimmer switches, 1 Trådfri Remote, 1 Aqara Door sensor, and, of course, the RaspBee (as coordinator). Also two Hue taps, 26 CLIP sensors, 5 schedules, and 143 rules.

Once, again, deCONZ is "losing" my lights. They show red in the deCONZ GUI, and are marked unreachable in the REST API. Typically they do still react to group commands, but they're inaccessible from the API. Also trying to read their attributes from the deCONZ GUI fails. Powering the light off and back on typically brings it back. As before, the non-Philips lights seem to go first. The sensors remain reachable.

When restarting, it takes deCONZ well over 10 minutes to discover all my lights. During that time, the resources for the yet undiscovered lights are missing from the REST API, even though they are persisted in the database.

It's more or less the same setup as I used to have on my 2nd gen Hue bridge (except for the Ikea and ubisys devices and two additional Hue motion sensors). I'm running deCONZ and homebridge on a Raspberry pi 3, which shows below 10% CPU usage. I'm using ZigBee channel 25 (as I used on the Hue bridge, which is now powered off), as that would cause least interference with WiFi.

manup commented 7 years ago

I had a similar problem this week, in my home setup the IKEA next to an OSRAM drops out at times and becomes a zombie (red). The bulbs are still reachable in the network but deCONZ will block outgoing requests from API and cluster info till they become normal again.

In our research we've seen a strange thing, OSRAM and IKEA and some china devices from Climax, Xiaomi and Oujiabao won't build a mesh, there are never green lines between them in deCONZ. But they mesh with Philips and dresden elektronik devices. We are not sure why that is not, takes more time to research on network level.

I don't think it's the main problem but it's part of the problem which confuses deCONZ. Therefore deCONZ must be changed to do some magic for these devices, should be landing in the next releases. I have some ideas but it will take a few clicks to try them out.

Meanwhile instead of turning the lights on/off you can achieve the same by selecting a red zombie node and press 0 (normal zero key, not the one on the num-pad). This will send a NWK address request broadcast through the network and typically the node responds and becomes non zombie in deCONZ again.

markbeee commented 7 years ago

Regarding the mesh network - which imho should be implemented by default because of the ZigBee layer - I recognized the same thing with deCONZ. I also sniffed the network with a XBEE (as coordinator for the ZigBee network instead of RaspBee) and it confirmed the mesh configurations. The IKEA lights do not show up as meshed. Maybe this image helps: https://twitter.com/MarkusUlsass/status/831094339300372480 I'm using the XBEEs and Philips lights as router devices, because IKEA would not be reachable in some parts of my home. I'm still not sure why they didn't implement the mesh functionality with Tradfri.

manup commented 7 years ago

Not sure if they don't have it at all or if it just works differently from Philips and the like. In the sniffer I see that the IKEA bulbs send NWK Route Record commands to the coordinator, even over hops. Maybe they use tree routing here. From the ZLL certification standpoint they should support mesh routing too.

ebaauw commented 7 years ago

I see multiple green lines to/from my Trådfri bulb. It's a colour temperature light (TRADFRI bulb E27 WS�opal 980lm). Also, it hasn't turned zombie on me (knock on wood).
I was able to revive the zombie ORSAM plug (Plug - LIGHTIFY) and E14 dimmable lights (Classic B40 TW - LIGHTIFY) by pressing 0 when the node is selected, but not the innr lights (PL 110, UC 110, and RB 162), nor my Philips Bloom and 1st-gen Lightstrip colour lights (LLC011 and LST001).
I hope I press the right key - my Raspberry is headless, and I use standard macOS screen sharing from my MacBook to connect to it over VNC. Only the selected node wakes - doesn't seem like broadcast to me?

Figured I needed some more coffee. On my way to the kitchen, I double-checked that my automation was still working (it's based on recalling scenes and sending group commands). Coming back, all lights had been revived somehow, but now the ubisys dimmer (D1 (5503)) turned on me, and won't wake on pressing 0. I see multiple green lines to/from the Philips lights, the OSRAM plug and lights, and the innr RB 162s (but not between the OSRAM and innr). The innr UC 110 and PL 110 (who use the same ZigBee module) only show a single line.

From the ZLL certification standpoint they should support mesh routing too.

The ubisys dimmer and Ikea lights aren't ZLL certified (they're ZHA). Neither are the innr lights, although they are ZLL-ish.

manup commented 7 years ago

By looking more into the sniffer logs it becomes more clear, the IKEA and likely the OSRAM and Innr too are using source routing instead just mesh routing, that's ok but deCONZ firmware doesn't handle this properly yet and after a while established mesh routes will timeout and cause the zombie apocalypse.

I'll bake a handler for received route records into the RaspBee/ConBee firmware to keep the established routes fresh.

ebaauw commented 7 years ago

Thanks, Manual.

Wouldn't it also be related to polling the lights? I didn't see this problem when I had only 11 lights connected to deCONZ. I ran deCONZ with four innr, two OSRAM, three Philips, one Ikea, and one ubisys light for weeks and never saw any zombies.

ebaauw commented 7 years ago

I suppose setting up attribute reporting for the OSRAM and ubisys nodes will stop them from turning zombie? What max reporting interval would you recommend? I set it to 5 sec, and now they're blinking blue constantly in the deCONZ GUI.
I double-checked: the innr and Ikea lights don't support attribute reporting.

I do get a websocket notification when a light becomes reachable, but not when it becomes unreachable. Also, I don't see these when the lights is first discovered, after deCONZ is started, but I do see these after a disconnect in the deCONZ GUI.

manup commented 7 years ago

Wouldn't it also be related to polling the lights? I didn't see this problem when I had only 11 lights connected to deCONZ. I ran deCONZ with four innr, two OSRAM, three Philips, one Ikea, and one ubisys light for weeks and never saw any zombies.

Yes polling also is involved here, deCONZ has become more optimized to not poll too often which has now these side effects. I'd like to solve the real issue on the network level before adjusting polling for the devices which don't support reporting.

I suppose setting up attribute reporting for the OSRAM and ubisys nodes will stop them from turning zombie? What max reporting interval would you recommend?

Hopefully it helps, but needs some testing to verify.

The version 2.04.58 will create bindings witch defaults to Min 5 s and Max 180 s. Note Min is the minimum time between two reports and Max is the maximum time after which a report is sent even if no value has changed. So Max can be fairly large and Min should not be too small, especially in larger networks.

https://github.com/dresden-elektronik/deconz-rest-plugin/blob/master/bindings.cpp#L563

I do get a websocket notification when a light becomes reachable, but not when it becomes unreachable. Also, I don't see these when the lights is first discovered, after deCONZ is started, but I do see these after a disconnect in the deCONZ GUI.

Will be checked, it also takes some time until lights become seen as not reachable to not get false positives in large networks.

ebaauw commented 7 years ago

Yesterday, I set minInterval: 1, maxInterval: 300, reportableChange: 1 for the OSRAM plug (cluster 0x0006), the OSRAM lights (0x0006, 0x0008, and 0x0300), and the ubisys (cluster 0x0006 and 0x0008). Haven't seen these turning zombie since.

Could you also setup a binding and attribute reporting for cluster 0x0300, attribute 0x07 (colour temperature) for the OSRAM lights? I'd be happy to create a pull request for this, but I cannot seem to find where you setup the binding.

manup commented 7 years ago

Good news, attribute reporting works for IKEA https://github.com/dresden-elektronik/deconz-rest-plugin/commit/8cfb4e4846a51a43594e9c74bd3c0f7673480fde

Maybe also with the Innr lights, I'll need to check it. Yes color temperature will be included too (IKEA? and OSRAM), but needs some more code and checks, should be landing soon.

ebaauw commented 7 years ago

Good news indeed!

To test the innr and Ikea tuneable light, I should just write the attribute reporting configuration from the deCONZ GUI, ignoring that it cannot be read?

ebaauw commented 7 years ago

For my Trådfri tunable white, in the deCONZ GUI v2.04.59, I wrote the attribute reporting config for 0x0006/0x00 (On/Off - OnOff), 0x0008/0x00 (Level Control - Current Level) and 0x0300/0x07 (Color control / Color Temperature). Of course, deCONZ reported write failed. Then I setup a binding for each of these clusters to the ConBee. As expected, deCONZ reported a timeout.
Nevertheless, attribute reporting seems to work: I now get state.bri notifications when dimming the light with the Trådfri remote. I also get a state.on notification when powering on the light (when it was off before powering off).

I won't be able to test the innr lights until next week - keep you posted.

manup commented 7 years ago

For my Trådfri tunable white, in the deCONZ GUI v2.04.59, I wrote the attribute reporting config for 0x0006/0x00 (On/Off - OnOff)

The reporting for these will be created by the REST plugin automatically after running a while, it will create bindings as well as configure the reporting for the related attributes (note these are two steps).

As expected, deCONZ reported a timeout.

Is this on Raspberry Pi? The timeout problem in binding dropbox was fixed in version 2.04.59 (RPi).

I won't be able to test the innr lights until next week - keep you posted.

I've testet them yesterday, sadly same as Philips they don't support reporting. It's possible to create bindings but configuring the attribute reporting is not supported and no reports are send.

ebaauw commented 7 years ago

Yes, on the Pi. v2.04.59, firmware 0x26110500.

I must have confused the commits, I thought the setup of reporting wasn't yet in a release. But indeed, I can now read back attribute reporting and the MaxInterval has been set to 180 by the REST API (I set them to 300).

ebaauw commented 6 years ago

Not sure if they don't have it at all or if it just works differently from Philips and the like. In the sniffer I see that the IKEA bulbs send NWK Route Record commands to the coordinator, even over hops. Maybe they use tree routing here. From the ZLL certification standpoint they should support mesh routing too.

The Ikea lights and other devices are now listed as certified ZLL devices: http://www.zigbee.org/zigbee-products-2/#zigbeecertifiedproducts/?view_30_search=Ikea&view_30_page=1

donnib commented 6 years ago

ebaauw does that mean they should do meshing in a normal fashion as other ZLL devices ?

I am interested in knowing whether OSRAM and IKEA lights will mesh together if both are ZLL devices ?

ebaauw commented 6 years ago

I haven't see the issue lately. Now on 2.04.67.

I'd say the non-Hue lights mesh perfectly fine. I colour-coded the nodes: Yellow: IKEA, Orange: OSRAM, Blue: innr, Red: ubisys, Purple: Xiaomi. The rest is Philips.

untitled

ebaauw commented 6 years ago

v2.04.70: All the lines from the deCONZ GUI have disappeared:

untitled

The red nodes are all innr lights (and all the innr lights are red). They do come back after power cycling (pressing 0 never works on the innr lights), but the lines remain missing. Apart from the red lights (which show reachable false in the API), other nodes are responsive.

The REST API plugin seems stuck trying to identify lights:


20:14:38:711 failed to add task type: 19, too many tasks
20:14:38:711 read attributes of 0x00178801104F1238 cluster: 0x0006: [ 20:14:38:711 0x0000 20:14:38:711 ]
20:14:38:711 failed to add task type: 19, too many tasks
20:14:38:711 read attributes of 0x00178801104F1238 cluster: 0x0008: [ 20:14:38:711 0x0000 20:14:38:711 ]
20:14:38:711 failed to add task type: 19, too many tasks
20:14:38:711 read attributes of 0x001788011041C0E6 cluster: 0x0006: [ 20:14:38:711 0x0000 20:14:38:711 ]
20:14:38:711 failed to add task type: 19, too many tasks
20:14:38:711 read attributes of 0x001788011041C0E6 cluster: 0x0008: [ 20:14:38:711 0x0000 20:14:38:711 ]
20:14:38:711 failed to add task type: 19, too many tasks
20:14:38:711 read attributes of 0x001788011041C0E6 cluster: 0x0300: [ 20:14:38:711 0x0000 20:14:38:711 0x0001 20:14:38:711 0x0003 20:14:38:711 0x0004 20:14:38:711 0x0007 20:14:38:711 0x0008 20:14:38:711 0x4000 20:14:38:712 0x4002 20:14:38:712 ]
20:14:38:712 failed to add task type: 19, too many tasks
20:14:38:712 read attributes of 0x0017880100F05F1F cluster: 0x0000: [ 20:14:38:712 0x4000 20:14:38:712 ]
20:14:38:712 failed to add task type: 19, too many tasks
20:14:38:712 read attributes of 0x0017880100F05F1F cluster: 0x0006: [ 20:14:38:712 0x0000 20:14:38:712 ]
20:14:38:712 failed to add task type: 19, too many tasks
20:14:38:712 read attributes of 0x0017880100F05F1F cluster: 0x0008: [ 20:14:38:712 0x0000 20:14:38:712 ]
20:14:38:712 failed to add task type: 19, too many tasks
20:14:38:712 read attributes of 0x0017880100F05F1F cluster: 0x0300: [ 20:14:38:712 0x0000 20:14:38:712 0x0001 20:14:38:712 0x0003 20:14:38:712 0x0004 20:14:38:712 0x0007 20:14:38:712 0x0008 20:14:38:712 0x4000 20:14:38:712 0x4002 20:14:38:712 ]
20:14:38:712 failed to add task type: 19, too many tasks
20:14:38:712 read attributes of 0x001788011036CE87 cluster: 0x0000: [ 20:14:38:712 0x4000 20:14:38:712 ]
20:14:38:712 failed to add task type: 19, too many tasks
20:14:38:713 read attributes of 0x001788011036CE87 cluster: 0x0006: [ 20:14:38:713 0x0000 20:14:38:713 ]
20:14:38:713 failed to add task type: 19, too many tasks
20:14:38:713 read attributes of 0x001788011036CE87 cluster: 0x0008: [ 20:14:38:713 0x0000 20:14:38:713 ]
20:14:38:713 failed to add task type: 19, too many tasks
20:14:38:713 failed to add task type: 21, too many tasks
20:14:38:713 failed to add task type: 23, too many tasks
20:14:38:713 read scenes membership for group: 0x00D2 rejected
20:14:38:713 read attributes of 0x001788011027C7E6 cluster: 0x0006: [ 20:14:38:713 0x0000 20:14:38:713 ]
20:14:38:714 failed to add task type: 19, too many tasks
20:14:38:714 read attributes of 0x001788011027C7E6 cluster: 0x0008: [ 20:14:38:714 0x0000 20:14:38:714 ]
20:14:38:714 failed to add task type: 19, too many tasks
20:14:38:714 read attributes of 0x001788011027C7E6 cluster: 0x0300: [ 20:14:38:714 0x0000 20:14:38:714 0x0001 20:14:38:714 0x0003 20:14:38:714 0x0004 20:14:38:714 0x0007 20:14:38:714 0x0008 20:14:38:714 0x4000 20:14:38:714 0x4002 20:14:38:714 ]
20:14:38:714 failed to add task type: 19, too many tasks
20:14:38:714 failed to add task type: 21, too many tasks
20:14:38:714 failed to add task type: 23, too many tasks
20:14:38:714 read scenes membership for group: 0x4C61 rejected
20:14:38:714 failed to add task type: 23, too many tasks
20:14:38:714 read scenes membership for group: 0x00E6 rejected
20:14:38:714 read attributes of 0x00178801104F10E6 cluster: 0x0000: [ 20:14:38:714 0x4000 20:14:38:714 ]
20:14:38:714 failed to add task type: 19, too many tasks
20:14:38:714 read attributes of 0x00178801104F10E6 cluster: 0x0006: [ 20:14:38:714 0x0000 20:14:38:714 ]
20:14:38:714 failed to add task type: 19, too many tasks
20:14:38:714 read attributes of 0x00178801104F10E6 cluster: 0x0008: [ 20:14:38:715 0x0000 20:14:38:715 ]
20:14:38:715 failed to add task type: 19, too many tasks
20:14:38:715 failed to add task type: 21, too many tasks
20:14:38:715 failed to add task type: 23, too many tasks
20:14:38:715 read scenes membership for group: 0x012C rejected
20:14:38:715 failed to add task type: 21, too many tasks
20:14:38:715 failed to add task type: 23, too many tasks
20:14:38:715 read scenes membership for group: 0x00E6 rejected
20:14:38:715 failed to add task type: 23, too many tasks
20:14:38:715 read scenes membership for group: 0x4C61 rejected
20:14:38:715 read attributes of 0x001788011052ADE6 cluster: 0x0000: [ 20:14:38:715 0x4000 20:14:38:715 ]
20:14:38:715 failed to add task type: 19, too many tasks
20:14:38:715 read attributes of 0x001788011052ADE6 cluster: 0x0006: [ 20:14:38:715 0x0000 20:14:38:715 ]
20:14:38:715 failed to add task type: 19, too many tasks
20:14:38:715 read attributes of 0x001788011052ADE6 cluster: 0x0008: [ 20:14:38:715 0x0000 20:14:38:715 ]
20:14:38:715 failed to add task type: 19, too many tasks
20:14:38:715 failed to add task type: 21, too many tasks
20:14:38:715 failed to add task type: 23, too many tasks
20:14:38:715 read scenes membership for group: 0x006E rejected
20:14:38:716 read attributes of 0x0017880102027F90 cluster: 0x0406: [ 20:14:38:716 0x0010 20:14:38:716 ]
20:14:38:716 failed to add task type: 19, too many tasks
20:14:38:717 read attributes of 0x001788010200D05B cluster: 0x0406: [ 20:14:38:717 0x0010 20:14:38:717 ]
20:14:38:717 failed to add task type: 19, too many tasks
20:14:38:717 read attributes of 0x001788010200AC85 cluster: 0x0406: [ 20:14:38:717 0x0010 20:14:38:717 ]
20:14:38:717 failed to add task type: 19, too many tasks
20:14:38:717 read attributes of 0x0017880102032FFD cluster: 0x0406: [ 20:14:38:717 0x0010 20:14:38:717 ]
20:14:38:717 failed to add task type: 19, too many tasks
20:14:38:717 read attributes of 0x0017880102031475 cluster: 0x0406: [ 20:14:38:717 0x0010 20:14:38:717 ]
manup commented 6 years ago

It seems that the deCONZ core aps queues might be full. The core usually sends Mgtm_Lqi_req (neighbor table discovery) to lights periodically, the green lights are established on those and will timeout if not refreshed.

It would be interesting to see what is in the queues, maybe too many requests to sleeping devices.

I'll add more debug info for --dbg-aps=<level> and --dbg-zdp=<level> to see what's going on here.

Meanwhile here is a new version with relaxed discovery handling, but it won't solve the above issue.

http://www.dresden-elektronik.de/rpi/deconz/beta/deconz-2.04.71-qt5.deb

donnib commented 6 years ago

@manup so for me running mainly ikea nodes i still see the problem with lights that Goes red so sometimes my lights work sometimes not

manup commented 6 years ago

@manup so for me running mainly ikea nodes i still see the problem with lights that Goes red so sometimes my lights work sometimes not

Mine work fine, but I just have a small mixed network of 7 lights and 15 sensors/switches, not much meshing. I'll will put some IKEAs and innr bulbs into one of the company test networks which have 85 lights, this should help to debug the case.

ebaauw commented 6 years ago

Meanwhile here is a new version with relaxed discovery handling, but it won't solve the above issue.

Just installed v2.04.71. After starting deCONZ, it now takes "only" 20 minutes instead of a full hour to discover and make available my entire network (70 nodes, 40 of which lights). I am not exaggerating: in v2.04.70, the first light reported reachable true at 20:29:25; the last at 21:29:20. For v2.04.71 it's 22:01:06 and 22:21:02.

ebaauw commented 6 years ago

Tried d63849c559278346df62106d82bead3713416bfa (had to change WARMUP_TIME_S to WARMUP_TIME in de_web_plugin.cpp to get it compiled), but it doesn't seem to make a difference. Lights appear yellow upon discovery, but turn red immediately after. After a some time the light becomes yellow again. Shortly after startup, this takes a long time (minutes). The log shows a filled up ZDP queue:

10:29:01:750 void deCONZ::zmNode::setFetched(deCONZ::RequestId, bool) fetched item: 8, node: 0x4D53
10:29:01:817 Node 0x0017880100CB6334 is known by 1 neighbors, last seen 17 s
10:29:02:058 remove discovery request duplicate
10:29:02:058 remove discovery request duplicate
10:29:02:058 remove discovery request duplicate
10:29:02:058 remove discovery request duplicate
10:29:02:058 device discover rotate, too busy (4 ZDP requests in que)
10:29:02:058     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:058     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:058     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:058     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:298 Node 0x001788010200D05B is known by 0 neighbors, last seen 0 s
10:29:02:521 Erase task req-id: 199, type: 14 zcl seqno: 74 send time 0, profileId: 0x0104, clusterId: 0x0006
10:29:02:538 remove discovery request duplicate
10:29:02:538 device discover rotate, too busy (4 ZDP requests in que)
10:29:02:538     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:538     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:538     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:538     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:02:698 schedule 4 diff 46858, 
10:29:02:778 Node 0x00158D0000CDD7F7 is known by 1 neighbors, last seen 0 s
10:29:03:017 device discover rotate, too busy (4 ZDP requests in que)
10:29:03:017     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:017     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:017     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:017     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:279 Node 0x001788011052ADC9 is known by 1 neighbors, last seen 0 s
10:29:03:497 trigger rule 48 - Living Room off
10:29:03:498 device discover rotate, too busy (4 ZDP requests in que)
10:29:03:498     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:498     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:498     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:498     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:697 schedule 4 diff 46857, 
10:29:03:738 Node 0x0017880102031475 is known by 0 neighbors, last seen 7 s
10:29:03:978 device discover rotate, too busy (4 ZDP requests in que)
10:29:03:978     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:978     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:978     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:03:978     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:04:217 Node 0x84182600000A564D is known by 0 neighbors, last seen 2 s
10:29:04:371 Erase task req-id: 210, type: 14 zcl seqno: 75 send time 1, profileId: 0x0104, clusterId: 0x0006
10:29:04:594 neigbor 0x001788010200ac85 is unknown child
10:29:04:594 neigbor 0x8418260000061ded is unknown child
10:29:04:683 MAC Poll 0x02 0x069A
10:29:04:683 verify 0x0017880102027f90 is child node after 774912 s
10:29:04:698 schedule 4 diff 46856, 
10:29:04:698 Node 0x0017880100CB6334 is known by 1 neighbors, last seen 0 s
10:29:04:706 verify neighbor status: APP_SUCCESS (0x00) 
10:29:05:056 neigbor 0x001788011027c93d is unknown child
10:29:05:056 neigbor 0x0017880102315d84 is unknown child
10:29:05:177 Node 0x001788010200D05B is known by 0 neighbors, last seen 0 s
10:29:05:418 device discover rotate, too busy (5 ZDP requests in que)
10:29:05:418     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:05:418     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:05:418     cl 0x0031 tp 0x84182600000A564D 0x4D53
10:29:05:418     cl 0x0031 tp 0x0017880100CB6334 0x4B76
10:29:05:418     cl 0x0031 tp 0x0017880100CB6334 0x4B76

Somehow the REST API plugin manages to empty the queue after a couple of minutes. At this time, newly discovered lights go yellow -> red -> yellow within a minute or so. Except for the innr lights and the Hue color lights (two Blooms and two gen-1 lightstrips). These remain red for 10s of minutes, before finally turning yellow. These are also the lights that don't react to pressing 0 on the GUI.

The event for state.reachable becoming false is issued now.

Please let me know if I can provide more info.

manup commented 6 years ago

Tried d63849c (had to change WARMUP_TIME_S to WARMUP_TIME in de_web_plugin.cpp to get it compiled), but it doesn't seem to make a difference. Lights appear yellow upon discovery, but turn red immediately after. After a some time the light becomes yellow again. Shortly after startup, this takes a long time (minutes). The log shows a filled up ZDP queue:

I've looked deeper into the APS request, confirm and indication cycle and what causes the hick-ups which lead to above problem. 2.04.73 will arrive quite soon and should be much more solid here.

The zombie-dance should also be fixed and discovery more speedy.

The new version also prevents outgoing NWK address request to innr lights, since these don't respond to them for some reason.

donnib commented 6 years ago

@manup in my case out of 17 GU10 IKEA spots 4 of them went red after few days and at the moment everything is quiet at home since I am on vacation. I don't know if that would make it worse.

manup commented 6 years ago

Here we go, hopefully this fixes a few of the above issues:

http://www.dresden-elektronik.de/rpi/deconz/beta/deconz-2.04.73-qt5.deb

ebaauw commented 6 years ago

Thanks! Much better, at least for during startup. No more red lights during discovery and all lights found within a couple of minutes. The lines are also appearing much faster/earlier.

manup commented 6 years ago

Cool, there's still room for improvement but hopefully it will run stable for now.

donnib commented 6 years ago

@manup sorry but i still get red nodes with latest version. Typically it happens after a day. I tried 0 but no change. Latest being 2.04.75 and firmware 0x26160500

Of course it's an IKEA bulb and it's of type GU10 running latest firmware.

ebaauw commented 6 years ago

Latest RaspBee/ConBee firmware is 0x26190500, delivered with 2.04.75.

Occasionally I see a red node but less often than the Hue bridge would mark a light as unreachable. The node turns yellow again spontaneously after a while.

donnib commented 6 years ago

Hadn't noticed a new firmware was available. I will do an update right away. Right now the noise stays red so no going back to yellow. I'll report back.

manup commented 6 years ago

I'm afraid firmware version 0x26190500 won't improve the problem compared to 0x26160500 it only includes some security capabilities, which will be used for various purposes like individual default password and Wi-Fi PSK.

ebaauw commented 6 years ago

Cool, there's still room for improvement but hopefully it will run stable for now.

deCONZ runs stable for some hours or days, but then the RaspBee "chokes". I think it's related to too many ZigBee requests and/or too many neighbour entries in the routers.

Symptoms:

Here's an excerpt from the log:

20:17:46:187 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:46:187 device state timeout ignored in state 3
20:17:46:252 trigger rule events took 2 ms
20:17:47:251 trigger rule events took 2 ms
20:17:47:748 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:47:748 device state timeout ignored in state 3
20:17:48:242 Current channel 25
20:17:48:256 Current channel 25
20:17:48:265 trigger rule events took 3 ms
20:17:49:212 CTRL got nwk update id 0
20:17:49:253 trigger rule events took 3 ms
20:17:49:274 CTRL got nwk update id 0
20:17:49:340 CTRL ANT_CTRL 0x03
20:17:50:015 CTRL read param resp id: 0x1D, status: 0x04
20:17:50:247 Idle timer triggered
20:17:50:248 Force read attributes for SensorNode Bathroom Temperature
20:17:50:252 trigger rule events took 2 ms
20:17:50:938 CTRL read param resp id: 0x1D, status: 0x04
20:17:51:252 trigger rule events took 2 ms
20:17:51:459 CTRL read param resp id: 0x1D, status: 0x04
20:17:51:934 MASTER kill cmd 0x0A (TIMEOUT)
20:17:51:940 CTRL read param resp id: 0x01, status: 0x03
20:17:52:252 trigger rule events took 2 ms
20:17:52:890 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:52:890 device state timeout (handled)
20:17:52:890 MASTER kill cmd 0x0A (ERROR)
20:17:52:896 CTRL read param resp id: 0x01, status: 0x05
20:17:52:896 MASTER kill cmd 0x0A (ERROR)
20:17:52:902 CTRL read param resp id: 0x02, status: 0x05
20:17:52:903 MASTER kill cmd 0x0A (ERROR)
20:17:52:909 CTRL read param resp id: 0x02, status: 0x05
20:17:52:909 MASTER kill cmd 0x0A (ERROR)
20:17:52:915 CTRL read param resp id: 0x02, status: 0x05
20:17:52:915 MASTER kill cmd 0x0A (ERROR)
20:17:52:920 CTRL read param resp id: 0x01, status: 0x05
20:17:52:920 MASTER kill cmd 0x0A (ERROR)
20:17:52:926 CTRL read param resp id: 0x01, status: 0x05
20:17:52:926 MASTER kill cmd 0x0A (ERROR)
20:17:52:931 CTRL read param resp id: 0x01, status: 0x05
20:17:52:931 MASTER kill cmd 0x0A (ERROR)
20:17:52:937 CTRL read param resp id: 0x01, status: 0x05
20:17:52:937 MASTER kill cmd 0x0A (ERROR)
20:17:52:942 CTRL read param resp id: 0x01, status: 0x05
20:17:52:942 MASTER kill cmd 0x0A (ERROR)
20:17:52:948 CTRL read param resp id: 0x01, status: 0x05
20:17:52:948 MASTER kill cmd 0x0A (ERROR)
20:17:52:954 CTRL read param resp id: 0x01, status: 0x05
20:17:52:954 MASTER kill cmd 0x0A (ERROR)
20:17:52:958 CTRL read param resp id: 0x01, status: 0x05
20:17:52:958 MASTER kill cmd 0x0A (ERROR)
20:17:52:961 CTRL read param resp id: 0x01, status: 0x05
20:17:52:961 MASTER kill cmd 0x0A (ERROR)
20:17:52:964 CTRL read param resp id: 0x01, status: 0x05
20:17:52:964 MASTER kill cmd 0x0A (ERROR)
20:17:52:967 CTRL read param resp id: 0x01, status: 0x05
20:17:52:967 MASTER kill cmd 0x0A (ERROR)
20:17:52:969 CTRL read param resp id: 0x01, status: 0x05
20:17:52:969 MASTER kill cmd 0x0A (ERROR)
20:17:52:972 CTRL read param resp id: 0x01, status: 0x05
20:17:52:972 MASTER kill cmd 0x0A (ERROR)
20:17:52:975 CTRL read param resp id: 0x01, status: 0x05
20:17:52:975 MASTER kill cmd 0x0A (ERROR)
20:17:52:978 CTRL read param resp id: 0x01, status: 0x05
20:17:52:978 MASTER kill cmd 0x0A (ERROR)
20:17:52:980 CTRL read param resp id: 0x01, status: 0x05
20:17:52:980 MASTER kill cmd 0x0A (ERROR)
20:17:52:984 CTRL read param resp id: 0x01, status: 0x05
20:17:52:984 MASTER kill cmd 0x0A (ERROR)
20:17:52:986 CTRL read param resp id: 0x01, status: 0x05
20:17:52:986 MASTER kill cmd 0x0A (ERROR)
20:17:52:989 CTRL read param resp id: 0x01, status: 0x05
20:17:52:990 MASTER kill cmd 0x0A (ERROR)
20:17:52:993 CTRL read param resp id: 0x01, status: 0x05
20:17:52:994 MASTER kill cmd 0x0A (ERROR)
20:17:52:997 CTRL read param resp id: 0x01, status: 0x05
20:17:52:998 MASTER kill cmd 0x0A (ERROR)
20:17:53:000 CTRL read param resp id: 0x02, status: 0x05
20:17:53:001 MASTER kill cmd 0x0A (ERROR)
20:17:53:004 CTRL read param resp id: 0x02, status: 0x05
20:17:53:005 MASTER kill cmd 0x0A (ERROR)
20:17:53:008 CTRL read param resp id: 0x02, status: 0x05
20:17:53:009 MASTER kill cmd 0x0A (ERROR)
20:17:53:011 CTRL read param resp id: 0x01, status: 0x05
20:17:53:012 MASTER kill cmd 0x0A (ERROR)
20:17:53:015 CTRL read param resp id: 0x01, status: 0x05
20:17:53:016 MASTER kill cmd 0x0A (ERROR)
20:17:53:019 CTRL read param resp id: 0x09, status: 0x05
20:17:53:060 CTRL read param resp id: 0x1D, status: 0x04
20:17:53:062 void zmMaster::handleStateIdle(zmMaster::MasterEvent) not connected goto OFF state
20:17:53:062 device state timeout (handled)
20:17:53:064 Serial com disconnected, reason: 1
20:17:53:064 MASTER kill cmd 0x04 (ERROR)
20:17:53:260 COM: /dev/ttyAMA0 : 
20:17:53:260 COM: /dev/ttyS0 : 
20:17:53:260 dev /dev/ttyS0 (/dev/ttyS0)
20:17:53:271 COM: /dev/ttyAMA0 : 
20:17:53:271 COM: /dev/ttyS0 : 
20:17:53:271 dev /dev/ttyS0 (/dev/ttyS0)
20:17:53:275 auto connect com /dev/ttyS0
20:17:53:276 Serial com connected
20:17:53:279 trigger rule events took 1 ms
20:17:53:754 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:53:754 device state timeout ignored in state 2
20:17:54:252 trigger rule events took 3 ms
20:17:54:747 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:54:747 device state timeout ignored in state 2
20:17:55:247 Idle timer triggered
20:17:55:248 Force read attributes for SensorNode Downstairs Light Level
20:17:55:248 skip binding for attribute reporting of cluster 0x0000 (end-device might sleep)
20:17:55:248 Force binding of attribute reporting for node Downstairs Light Level
20:17:55:253 trigger rule events took 3 ms
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x22
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x01
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x05
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x08
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x07
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x0A
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x09
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x0E
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x0D
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x10
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x0B
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x15
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x1C
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x1E
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x21
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x24
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x23
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x1D
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x20
20:17:55:555 int zmMaster::readParameterWithArg(ZM_DataId_t, uint8_t*, uint8_t) 0x18
20:17:55:555 int zmMaster::readParameterWithArg(ZM_DataId_t, uint8_t*, uint8_t) 0x13
20:17:55:555 int zmMaster::readParameterWithArg(ZM_DataId_t, uint8_t*, uint8_t) 0x13
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x16
20:17:55:555 int zmMaster::readParameter(ZM_DataId_t) 0x25
20:17:56:250 trigger rule events took 1 ms
20:17:57:210 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:57:210 device state timeout ignored in state 3
20:17:57:252 trigger rule events took 2 ms
20:17:58:249 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:58:249 device state timeout ignored in state 3
20:17:58:252 trigger rule events took 2 ms
20:17:59:248 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:17:59:248 device state timeout ignored in state 3
20:17:59:252 trigger rule events took 2 ms
20:18:00:248 Idle timer triggered
20:18:00:248 Force read attributes for SensorNode Living Room Remote
20:18:00:248 binding for attribute reporting of cluster 0x0001 seems to be active
20:18:00:248 skip binding for attribute reporting of cluster 0x1000 (end-device might sleep)
20:18:00:249 Force binding of attribute reporting for node Living Room Remote
20:18:00:249 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:00:249 device state timeout ignored in state 3
20:18:00:253 trigger rule events took 2 ms
20:18:01:249 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:01:249 device state timeout ignored in state 3
20:18:01:252 trigger rule events took 2 ms
20:18:01:974 Device firmware version 0x26190500
20:18:02:251 trigger rule events took 2 ms
20:18:03:251 trigger rule events took 2 ms
20:18:03:400 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:03:400 device state timeout ignored in state 3
20:18:03:561 unlocked max nodes: 200
20:18:03:561 has feature: max nodes (0x11), status: 0x00
20:18:04:252 trigger rule events took 2 ms
20:18:04:813 unlocked max nodes: 200
20:18:04:813 has feature: max nodes (0x11), status: 0x00
20:18:04:840 unlocked max nodes: 200
20:18:04:840 has feature: max nodes (0x11), status: 0x00
20:18:05:248 Idle timer triggered
20:18:05:248 binding for attribute reporting SensorNode Kitchen Motion of cluster 0x0406 seems to be active
20:18:05:248 Force read attributes for SensorNode Kitchen Motion
20:18:05:252 trigger rule events took 2 ms
20:18:06:252 trigger rule events took 2 ms
20:18:06:265 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:06:265 device state timeout ignored in state 3
20:18:07:249 read attributes of 0x001788010200D061 cluster: 0x0406: [ 20:18:07:249 0x0010 20:18:07:249 ]
20:18:07:249 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:07:249 device state timeout ignored in state 3
20:18:07:253 trigger rule events took 2 ms
20:18:08:248 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:08:248 device state timeout ignored in state 3
20:18:08:252 trigger rule events took 2 ms
20:18:09:247 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:09:247 device state timeout ignored in state 3
20:18:09:251 trigger rule events took 2 ms
20:18:10:185 has feature: std security (0x01), status: 0x00
20:18:10:247 Idle timer triggered
20:18:10:247 Force read attributes for SensorNode Downstairs Temperature
20:18:10:247 skip binding for attribute reporting of cluster 0x0000 (end-device might sleep)
20:18:10:247 Force binding of attribute reporting for node Downstairs Temperature
20:18:10:251 trigger rule events took 1 ms
20:18:10:300 has feature: std security (0x01), status: 0x00
20:18:10:379 has feature: link security (0x02), status: 0x00
20:18:11:252 trigger rule events took 2 ms
20:18:11:805 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:11:805 device state timeout ignored in state 3
20:18:12:251 trigger rule events took 2 ms
20:18:12:516 has feature: high security (0x03), status: 0x04
20:18:12:530 has feature: high security (0x03), status: 0x04
20:18:12:540 has feature: ethernet (0x04), status: 0x04
20:18:12:762 has feature: mass storage device (MSD) (0x05), status: 0x04
20:18:12:769 has feature: device firmware upgrade (DFU) (0x06), status: 0x04
20:18:13:252 trigger rule events took 2 ms
20:18:14:248 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:14:248 device state timeout ignored in state 3
20:18:14:252 trigger rule events took 2 ms
20:18:15:248 Idle timer triggered
20:18:15:248 Force read attributes for SensorNode Bedroom Light Level
20:18:15:248 skip binding for attribute reporting of cluster 0x0000 (end-device might sleep)
20:18:15:248 Force binding of attribute reporting for node Bedroom Light Level
20:18:15:248 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:15:249 device state timeout ignored in state 3
20:18:15:253 trigger rule events took 2 ms
20:18:16:252 trigger rule events took 2 ms
20:18:17:158 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:17:158 device state timeout ignored in state 3
20:18:17:251 trigger rule events took 2 ms
20:18:18:208 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:18:208 device state timeout ignored in state 3
20:18:18:252 trigger rule events took 2 ms
20:18:18:446 has feature: ZigBee Light Link (ZLL) (0x07), status: 0x04
20:18:19:252 trigger rule events took 2 ms
20:18:19:871 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:19:871 device state timeout ignored in state 3
20:18:20:247 Idle timer triggered
20:18:20:248 binding for attribute reporting SensorNode Downstairs Motion of cluster 0x0406 seems to be active
20:18:20:248 Force read attributes for SensorNode Downstairs Motion
20:18:20:248 skip binding for attribute reporting of cluster 0x0000 (end-device might sleep)
20:18:20:248 Force binding of attribute reporting for node Downstairs Motion
20:18:20:252 trigger rule events took 2 ms
20:18:21:049 has feature: ZigBee Light Link (ZLL) (0x07), status: 0x04
20:18:21:252 trigger rule events took 2 ms
20:18:21:636 has feature: interpan (0x08), status: 0x04
20:18:21:640 has feature: interpan (0x08), status: 0x04
20:18:21:644 has feature: button 1 (0x09), status: 0x04
20:18:21:647 has feature: button 1 (0x0A), status: 0x04
20:18:21:651 has feature: LED 1 (0x0B), status: 0x00
20:18:21:654 has feature: LED 2 (0x0C), status: 0x00
20:18:21:658 has feature: LED 3 (0x0D), status: 0x04
20:18:21:662 has feature: LED 4 (0x0E), status: 0x04
20:18:21:665 has feature: LED 5 (0x0F), status: 0x04
20:18:21:729 has feature: LED RGB (0x10), status: 0x04
20:18:21:732 has feature: max nodes (0x11), status: 0x04
20:18:21:736 Device protocol version: 0x0105
20:18:21:954 void deCONZ::zmNode::setFetched(deCONZ::RequestId, bool) fetched item: 1, node: 0x0000
20:18:22:251 trigger rule events took 1 ms
20:18:22:998 read attributes of 0x001788010200A85B cluster: 0x0406: [ 20:18:22:998 0x0010 20:18:22:998 ]
20:18:23:252 trigger rule events took 3 ms
20:18:24:252 trigger rule events took 2 ms
20:18:24:450 int zmMaster::readParameterWithArg(ZM_DataId_t, uint8_t*, uint8_t) 0x19
20:18:25:247 Wait 1s till query finished
20:18:25:250 trigger rule events took 1 ms
20:18:26:247 Idle timer triggered
20:18:26:247 binding for attribute reporting SensorNode Bedroom Motion of cluster 0x0406 seems to be active
20:18:26:247 Force read attributes for SensorNode Bedroom Motion
20:18:26:247 skip binding for attribute reporting of cluster 0x0000 (end-device might sleep)
20:18:26:248 Force binding of attribute reporting for node Bedroom Motion
20:18:26:248 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:26:248 device state timeout ignored in state 3
20:18:26:252 trigger rule events took 2 ms
20:18:27:247 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:27:247 device state timeout ignored in state 3
20:18:27:252 trigger rule events took 2 ms
20:18:28:248 read attributes of 0x001788010200AC7C cluster: 0x0406: [ 20:18:28:249 0x0010 20:18:28:249 ]
20:18:28:252 trigger rule events took 2 ms
20:18:28:291 Current channel 25
20:18:29:127 CTRL got nwk update id 0
20:18:29:142 CTRL got nwk update id 0
20:18:29:248 CTRL ANT_CTRL 0x03
20:18:29:260 trigger rule events took 1 ms
20:18:29:458 CTRL read param resp id: 0x1D, status: 0x04
20:18:29:461 CTRL read param resp id: 0x20, status: 0x04
20:18:30:251 trigger rule events took 3 ms
20:18:30:886 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:30:886 device state timeout ignored in state 3
20:18:31:247 Wait 1s till query finished
20:18:31:252 trigger rule events took 2 ms
20:18:31:836 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:31:836 device state timeout ignored in state 3
20:18:32:248 Idle timer triggered
20:18:32:248 Force read attributes for SensorNode Kitchen Sink Dimmer
20:18:32:248 binding for attribute reporting of cluster 0x0001 seems to be active
20:18:32:248 skip binding for attribute reporting of cluster 0xFC00 (end-device might sleep)
20:18:32:248 Force binding of attribute reporting for node Kitchen Sink Dimmer
20:18:32:252 trigger rule events took 3 ms
20:18:33:252 trigger rule events took 2 ms
20:18:33:792 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:33:792 device state timeout ignored in state 3
20:18:34:252 trigger rule events took 2 ms
20:18:34:747 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:34:747 device state timeout ignored in state 3
20:18:35:252 trigger rule events took 2 ms
20:18:35:748 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:35:748 device state timeout ignored in state 3
20:18:36:252 trigger rule events took 2 ms
20:18:37:247 Idle timer triggered
20:18:37:247 Force read attributes for SensorNode Bathroom Air Pressure
20:18:37:247 don't create binding for attribute reporting of end-device Bathroom Air Pressure
20:18:37:247 Force binding of attribute reporting for node Bathroom Air Pressure
20:18:37:249 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:37:249 device state timeout ignored in state 3
20:18:37:252 trigger rule events took 2 ms
20:18:38:247 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:38:247 device state timeout ignored in state 3
20:18:38:252 trigger rule events took 2 ms
20:18:39:252 trigger rule events took 2 ms
20:18:39:747 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:39:747 device state timeout ignored in state 3
20:18:40:198 CTRL HMAC: d5c4d12a1d951f08b2911833db9c7e20dfe16df470113cfec299c741eb3220b0
20:18:40:252 trigger rule events took 2 ms
20:18:41:252 trigger rule events took 2 ms
20:18:42:247 Idle timer triggered
20:18:42:247 Force read attributes for SensorNode Bathroom Humidity
20:18:42:247 don't crQGraphicsScene::addItem: item has already been added to this scene
eate binding for attribute reporting of end-device Bathroom Humidity
20:18:42:248 Force binding of attribute reporting for node Bathroom Humidity
20:18:42:252 trigger rule events took 2 ms
20:18:43:252 trigger rule events took 2 ms
20:18:43:396 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:43:396 device state timeout ignored in state 3
20:18:44:252 trigger rule events took 2 ms
20:18:44:346 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:44:346 device state timeout ignored in state 3
20:18:45:104 DeviceAnnce of SensorNode: 0x001fee0000001963
20:18:45:105 DeviceAnnce of SensorNode: 0x001fee0000001963
20:18:45:105 APS-DATA.indication from unknown node 0x3DEB
20:18:45:105 CTRL restore cached node 0x001fee0000001963
20:18:45:120 void deCONZ::zmNode::setFetched(deCONZ::RequestId, bool) fetched item: 2, node: 0xC046
20:18:45:120 void deCONZ::zmNode::setFetched(deCONZ::RequestId, bool) fetched item: 1, node: 0xC046
20:18:45:120 new node - ext: 0x001fee0000001963, nwk: 0xC046
20:18:45:120 LightNode 234: Kitchen Table updated
20:18:45:121 SensorNode id: 238 (Kitchen Switch Left) no longer available
20:18:45:121 SensorNode id: 239 (Kitchen Switch Right) no longer available
20:18:45:123 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:126 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:129 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:132 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:135 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:139 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:142 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:145 SensorNode id: 227 (Living Room Motion 2) available
20:18:45:167 discard sensor config push for config/reachable (already pushed)
20:18:45:168 discard sensor config push for config/reachable (already pushed)
20:18:45:252 trigger rule events took 2 ms
20:18:45:296 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:45:296 device state timeout ignored in state 3
20:18:45:583 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:46:063 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:46:249 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:46:249 device state timeout ignored in state 3
20:18:46:252 trigger rule events took 2 ms
20:18:46:543 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:46:997 read attributes of 0x001FEE0000001963 cluster: 0x0006: [ 20:18:46:997 0x0000 20:18:46:997 ]
20:18:46:998 drop request to zombie
20:18:47:023 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:47:248 Wait 19s till query finished
20:18:47:252 trigger rule events took 2 ms
20:18:47:503 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:47:983 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:48:176 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:48:176 device state timeout ignored in state 3
20:18:48:248 Wait 18s till query finished
20:18:48:252 trigger rule events took 2 ms
20:18:48:463 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:48:497 read attributes of 0x001FEE0000001963 cluster: 0x0008: [ 20:18:48:497 0x0000 20:18:48:497 ]
20:18:48:498 drop request to zombie
20:18:48:547 drop request to zombie
20:18:48:943 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:49:226 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:49:226 device state timeout ignored in state 3
20:18:49:247 Wait 17s till query finished
20:18:49:252 trigger rule events took 2 ms
20:18:49:423 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:49:437 DeviceAnnce of SensorNode: 0x001fee0000001963
20:18:49:438 DeviceAnnce of SensorNode: 0x001fee0000001963
20:18:49:438 APS-DATA.indication from unknown node 0xF1B7
20:18:49:472 CTRL restore cached node 0x0017880102031475
20:18:49:479 void deCONZ::zmNode::setFetched(deCONZ::RequestId, bool) fetched item: 2, node: 0x6229
20:18:49:480 void deCONZ::zmNode::setFetched(deCONZ::RequestId, bool) fetched item: 1, node: 0x6229
20:18:49:480 new node - ext: 0x0017880102031475, nwk: 0x6229
20:18:49:481 SensorNode id: 238 (Kitchen Switch Left) no longer available
20:18:49:481 SensorNode id: 239 (Kitchen Switch Right) no longer available
20:18:49:482 SensorNode 263 set node 0x0017880102031475
20:18:49:482 SensorNode 262 set node 0x0017880102031475
20:18:49:483 SensorNode id: 227 (Living Room Motion 2) available
20:18:49:483 SensorNode 264 set node 0x0017880102031475
20:18:49:485 SensorNode id: 263 (Toilet Light Level) available
20:18:49:485 SensorNode id: 262 (Toilet Motion) available
20:18:49:486 SensorNode id: 227 (Living Room Motion 2) available
20:18:49:486 SensorNode id: 264 (Toilet Temperature) available
20:18:49:488 SensorNode id: 263 (Toilet Light Level) available
20:18:49:489 SensorNode id: 262 (Toilet Motion) available
20:18:49:489 SensorNode id: 227 (Living Room Motion 2) available
20:18:49:489 SensorNode id: 264 (Toilet Temperature) available
20:18:49:903 Node 0x0017880102031475 is known by 0 neighbors, last seen 0 s
20:18:49:997 drop request to zombie
20:18:50:047 drop request to zombie
20:18:50:248 Wait 19s till query finished
20:18:50:252 trigger rule events took 2 ms
20:18:50:382 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:50:863 Node 0x0017880102031475 is known by 0 neighbors, last seen 1 s
20:18:51:247 Wait 18s till query finished
20:18:51:252 trigger rule events took 2 ms
20:18:51:343 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:51:823 Node 0x0017880102031475 is known by 0 neighbors, last seen 2 s
20:18:52:247 Wait 17s till query finished
20:18:52:249 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:52:249 device state timeout ignored in state 3
20:18:52:252 trigger rule events took 2 ms
20:18:52:303 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:52:783 Node 0x0017880102031475 is known by 0 neighbors, last seen 3 s
20:18:53:248 Wait 16s till query finished
20:18:53:248 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:53:248 device state timeout ignored in state 3
20:18:53:252 trigger rule events took 2 ms
20:18:53:262 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:53:743 Node 0x0017880102031475 is known by 0 neighbors, last seen 4 s
20:18:54:223 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:54:247 Wait 15s till query finished
20:18:54:252 trigger rule events took 2 ms
20:18:54:703 Node 0x0017880102031475 is known by 0 neighbors, last seen 5 s
20:18:54:950 Erase task req-id: 203, type: 19 zcl seqno: 191 send time 47, profileId: 0x0104, clusterId: 0x0406
20:18:55:183 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:55:248 Wait 14s till query finished
20:18:55:252 trigger rule events took 2 ms
20:18:55:663 Node 0x0017880102031475 is known by 0 neighbors, last seen 6 s
20:18:55:743 node 0001FEE0000001963 leave wait state
20:18:56:143 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:56:248 Wait 13s till query finished
20:18:56:252 trigger rule events took 2 ms
20:18:56:623 Node 0x0017880102031475 is known by 0 neighbors, last seen 7 s
20:18:57:103 Node 0x001FEE0000001963 is known by 0 neighbors, last seen 0 s
20:18:57:248 Wait 12s till query finished
20:18:57:252 trigger rule events took 2 ms
20:18:57:252 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:57:252 device state timeout (handled)
20:18:57:252 MASTER kill cmd 0x17 (ERROR)
20:18:57:272 Serial com disconnected, reason: 1
20:18:57:272 hide node: 0x6229
20:18:57:272 hide node: 0xC046
20:18:57:272 LightNode removed 0x001fee0000001963
20:18:57:388 error APSDE-DATA.confirm: 0xA7 on task
20:18:57:388 Erase task req-id: 12, type: 19 zcl seqno: 192 send time 35, profileId: 0x0104, clusterId: 0x0406
20:18:57:466 error APSDE-DATA.confirm: 0xA7 on task
20:18:57:467 Erase task req-id: 34, type: 19 zcl seqno: 193 send time 29, profileId: 0x0104, clusterId: 0x0406
20:18:57:747 void zmMaster::handleStateIdle(zmMaster::MasterEvent) not connected goto OFF state
20:18:57:747 device state timeout ignored in state 1
20:18:58:269 COM: /dev/ttyAMA0 : 
20:18:58:269 COM: /dev/ttyS0 : 
20:18:58:269 dev /dev/ttyS0 (/dev/ttyS0)
20:18:58:289 auto connect com /dev/ttyS0
20:18:58:292 Serial com connected
20:18:58:295 trigger rule events took 2 ms
20:18:58:771 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:58:771 device state timeout ignored in state 2
20:18:59:247 Wait 10s till query finished
20:18:59:251 trigger rule events took 2 ms
20:18:59:749 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:18:59:749 device state timeout ignored in state 2
20:19:00:248 Wait 9s till query finished
20:19:00:252 trigger rule events took 2 ms
20:19:00:747 void zmMaster::handleStateStatus(zmMaster::MasterEvent) EVENT_TIMEOUT
20:19:00:747 device state timeout ignored in state 2
20:19:01:234 int zmMaster::readParameter(ZM_DataId_t) 0x22
20:19:01:234 int zmMaster::readParameter(ZM_DataId_t) 0x01
20:19:01:234 int zmMaster::readParameter(ZM_DataId_t) 0x05
20:19:01:234 int zmMaster::readParameter(ZM_DataId_t) 0x08
20:19:01:234 int zmMaster::readParameter(ZM_DataId_t) 0x07
20:19:01:235 int zmMaster::readParameter(ZM_DataId_t) 0x0A
Kane610 commented 6 years ago

I had a 3 device setup working stable for a week, all Ikea devices, 1 gui10 white, 1 motion detector and 1 wireless dimmer. When I added a Hue wireless remote the light immediately stopped working. I am not sure if this is the exact same issue as in this thread, but wanted to mention it. The light started working again as soon as I removed the Hue remote.

Part of error log: 14:50:51:156 CTRL ANT_CTRL 0x02 14:50:57:462 device announce 0x000B57FFFEBDBE89 (0x758D) mac capabilities 0x8E 14:50:57:464 CTRL restore cached node 0x000b57fffebdbe89 14:50:57:474 new node - ext: 0x000b57fffebdbe89, nwk: 0x758D 14:50:57:482 LightNode 1: Light 1 added 14:51:05:250 Content not completely loaded (got 0 of 0), wait 20ms 14:51:05:251 add task 33 type 14 to group 0xD833 cluster 0x0006 req.id 139 14:51:05:281 error APSDE-DATA.confirm: 0xAE on task 14:51:05:282 Erase task req-id: 139, type: 14 zcl seqno: 0 send time 0, profileId: 0x0104, clusterId: 0x0006 14:51:07:000 scan finished 14:51:12:218 DeviceAnnce of SensorNode: 0x000b57fffe92cced 14:51:12:219 device announce 0x000B57FFFE92CCED (0xE02E) mac capabilities 0x80 14:51:12:220 set fast probe address to 0x000B57FFFE92CCED (0xE02E) 14:51:12:221 CTRL restore cached node 0x000b57fffe92cced 14:51:12:225 new node - ext: 0x000b57fffe92cced, nwk: 0xE02E 14:51:12:226 SensorNode id: 2 (TRADFRI motion sensor ) no longer available 14:51:12:227 SensorNode 2 set node 0x000b57fffe92cced 14:51:12:240

manup commented 6 years ago

The light started working again as soon as I removed the Hue remote.

Often it should be enough to power cycle the lights.

Do the lights have the latest firmware? Note IKEA lights can be updated with deCONZ.

Kane610 commented 6 years ago

@manup probably not. Have issues getting deconz to run without sudo so I can do upgrades. Gonna look some more in the different issues to see if I can find any ideas, else I'll create a new issue to request help.

joch commented 6 years ago

Seems I have a similar issue with an Osram lightstrip. The first pairing went fine and it worked great. After having been disconnected for a month because of moving, I plugged it back in, but it never became available in deconz again (all other lights and sensors worked immediately). I removed it from deconz and reset the lightstrip, but now it never wants to pair again.

The only thing I’m seeing in the log is right when I turn on the lightstrip and open the network:

15:06:36:104 device announce 0x7CB03EAA00A5B1C6 (0xDA60) mac capabilities 0x8E

But then nothing else related to that light is seen in the log.

Edit: So what I had to do to get the light working was to stop deconz, remove the entry in session.default and zll.db. When starting deconz, the light just appeared and is working! Guess there could be a bug somewhere.

Edit 2: I enabled zdp debugging and saw that the light was actually in the network even though there was no indication of that in the UI.