Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Device offline but can still control via zigbee group #11650

Closed millionsofjeffries closed 2 years ago

millionsofjeffries commented 2 years ago

What happened?

I have a few devices that randomly start to show as offline and can't be controlled. For example, an offline bulb is part of a zigbee group of 3 identical bulbs and doesn't respond to any issued commands. It works just fine if I control the group instead. However, this doesn't bring it back to online and it still can't be controlled individually. After a random period of hours, it will come back online on its own and be reachable individually again for a while, but will go offline again at some point.

Here is an example of the debug log for a state request and it doesn't seem to receive a response from the bulb:

zigbee2mqtt | Zigbee2MQTT:error 2022-02-28 13:18:20: Publish 'get' 'state' to 'Bulb_BarSpot2' failed: 'Error: Read 0x84fd27fffe92a61a/11 genOnOff(["onOff"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (no response received)'
zigbee2mqtt | Zigbee2MQTT:debug 2022-02-28 13:18:20: Error: Read 0x84fd27fffe92a61a/11 genOnOff(["onOff"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (no response received)
zigbee2mqtt | at DeconzAdapter.sendZclFrameToEndpoint (/app/node_modules/zigbee-herdsman/src/adapter/deconz/adapter/deconzAdapter.ts:649:23)
zigbee2mqtt | at runMicrotasks (<anonymous>)
zigbee2mqtt | at runNextTicks (internal/process/task_queues.js:60:5)
zigbee2mqtt | at processTimers (internal/timers.js:497:9)
zigbee2mqtt | at Endpoint.sendRequest (/app/node_modules/zigbee-herdsman/src/controller/model/endpoint.ts:299:20)
zigbee2mqtt | at Endpoint.read (/app/node_modules/zigbee-herdsman/src/controller/model/endpoint.ts:472:28)
zigbee2mqtt | at Object.convertGet (/app/node_modules/zigbee-herdsman-converters/converters/toZigbee.js:293:13)
zigbee2mqtt | at Object.convertGet (/app/node_modules/zigbee-herdsman-converters/converters/toZigbee.js:898:17)
zigbee2mqtt | at Publish.onMQTTMessage (/app/lib/extension/publish.ts:272:21)

And here is an example of a group command that completed successfully (notice that the last_seen for the problem bulb 'Bulb_BarSpot2' is the time it went offline):

zigbee2mqtt | Zigbee2MQTT:debug 2022-02-28 13:24:13: Received MQTT message on 'zigbee2mqtt/BarSpots/set' with data '{"state": "ON", "color_temp": 352}'
zigbee2mqtt | Zigbee2MQTT:debug 2022-02-28 13:24:13: Skipping state because of Home Assistant
zigbee2mqtt | Zigbee2MQTT:debug 2022-02-28 13:24:13: Publishing 'set' 'color_temp' to 'BarSpots'
zigbee2mqtt | Zigbee2MQTT:debug 2022-02-28 13:24:13: Received Zigbee message from 'Coordinator', type 'commandMoveToColorTemp', cluster 'lightingColorCtrl', data '{"colortemp":352,"transtime":0}' from endpoint 1 with groupID 1, ignoring since it is from coordinator
zigbee2mqtt | Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/Bulb_BarSpot1', payload '{"brightness":254,"color":{"h":33,"hue":33,"s":79,"saturation":79,"x":0.4488,"y":0.4078},"color_mode":"color_temp","color_temp":352,"color_temp_startup":65535,"last_seen":"2022-02-28T13:23:03+00:00","linkquality":255,"state":"ON"}'
zigbee2mqtt | Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/BarSpots', payload '{"brightness":254,"color":{"h":33,"hue":33,"s":79,"saturation":79,"x":0.4488,"y":0.4078},"color_mode":"color_temp","color_temp":352,"state":"ON"}'
zigbee2mqtt | Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/Bulb_BarSpot2', payload '{"brightness":254,"color":{"h":33,"hue":33,"s":79,"saturation":79,"x":0.4488,"y":0.4078},"color_mode":"color_temp","color_temp":352,"color_temp_startup":65535,"last_seen":"2022-02-28T09:38:34+00:00","linkquality":240,"state":"ON"}'
zigbee2mqtt | Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/Bulb_BarSpot3', payload '{"brightness":254,"color":{"h":33,"hue":33,"s":79,"saturation":79,"x":0.4488,"y":0.4078},"color_mode":"color_temp","color_temp":352,"color_temp_startup":65535,"last_seen":"2022-02-28T13:21:49+00:00","linkquality":240,"state":"ON"}'
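For anyone who wants to spot this pattern without reading the log by eye, here is a minimal Python sketch that pulls last_seen out of 'MQTT publish' lines like the ones above and lists topics that have not been heard from recently. The helper names, the one-entry-per-line assumption, the trimmed sample payloads and the one-hour threshold are all illustrative choices, not part of Zigbee2MQTT, and the log timestamp is assumed to be UTC.

```python
import json
import re
from datetime import datetime, timedelta

# Matches the "MQTT publish" info lines; assumes one log entry per line.
PUBLISH_RE = re.compile(r"MQTT publish: topic '([^']+)', payload '(\{.*\})'")

# Payloads trimmed to the fields used here (illustrative sample only).
SAMPLE_LOG = """\
Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/Bulb_BarSpot1', payload '{"last_seen":"2022-02-28T13:23:03+00:00","state":"ON"}'
Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/BarSpots', payload '{"color_temp":352,"state":"ON"}'
Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/Bulb_BarSpot2', payload '{"last_seen":"2022-02-28T09:38:34+00:00","state":"ON"}'
Zigbee2MQTT:info 2022-02-28 13:24:13: MQTT publish: topic 'zigbee2mqtt/Bulb_BarSpot3', payload '{"last_seen":"2022-02-28T13:21:49+00:00","state":"ON"}'
""".splitlines()


def last_seen_by_topic(log_lines):
    """Map each published topic to the last_seen timestamp in its payload."""
    seen = {}
    for line in log_lines:
        match = PUBLISH_RE.search(line)
        if not match:
            continue
        topic, payload = match.group(1), json.loads(match.group(2))
        if "last_seen" in payload:  # group topics carry no last_seen
            seen[topic] = datetime.fromisoformat(payload["last_seen"])
    return seen


def stale_topics(seen, now, max_age):
    """Return topics whose last_seen is older than max_age, sorted by name."""
    return sorted(t for t, ts in seen.items() if now - ts > max_age)


if __name__ == "__main__":
    now = datetime.fromisoformat("2022-02-28T13:24:13+00:00")  # assumed UTC
    print(stale_topics(last_seen_by_topic(SAMPLE_LOG), now, timedelta(hours=1)))
```

Note that the state published for Bulb_BarSpot2 after the group command is what Zigbee2MQTT assumes from the command, while its last_seen still dates from when it dropped off.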

Bulbs are Ajax Online Ltd AJ_ZB120_GU10. However, I've had the same thing happen occasionally with Ikea control outlets. Mesh is good (48 routers) and the 3 bulbs in the group are only 30 cm away from each other. Have tried the latest dev version but results are the same. zigbee2mqtt is running on a rpi4 in a docker container with a conbee2 usb on an extension cord. All devices are on the latest firmware I could find available. Thanks in advance for your assistance!

What did you expect to happen?

If a bulb can be controlled as part of a group, it should be reachable individually too.

How to reproduce it (minimal and precise)

Via the zigbee2mqtt frontend, try to control a bulb that is showing as offline: there is no response from the bulb, and eventually a timeout message is logged. Then try to control the same bulb as part of a group: the bulb responds as instructed.
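The same two requests can also be issued over MQTT rather than through the frontend. The sketch below only builds the topic and payload for each; the function names are made up for illustration, and the base topic is assumed to be the default 'zigbee2mqtt'. Publishing them is left to whatever MQTT client you already use.

```python
import json

BASE_TOPIC = "zigbee2mqtt"  # assumption: default base topic


def get_state_request(friendly_name):
    """Return (topic, payload) asking Zigbee2MQTT to read a device's state."""
    return f"{BASE_TOPIC}/{friendly_name}/get", json.dumps({"state": ""})


def set_request(friendly_name, **attributes):
    """Return (topic, payload) for a set command to a device or a group."""
    return f"{BASE_TOPIC}/{friendly_name}/set", json.dumps(attributes)


if __name__ == "__main__":
    # Individual read that times out while the bulb shows as offline.
    print(get_state_request("Bulb_BarSpot2"))
    # Group command that the same bulb still reacts to.
    print(set_request("BarSpots", state="ON", color_temp=352))
```

The second call reproduces the group message seen in the log above ('zigbee2mqtt/BarSpots/set' with state and color_temp).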

Zigbee2MQTT version

1.23.0-dev commit: afe94a7

Adapter firmware version

266e0700

Adapter

Conbee 2

Debug log

No response

sjorge commented 2 years ago

Might be a routing issue: groups use broadcast traffic, so all routers just repeat the message.

When controlling the device directly, the message goes to just a single device, and it might be that nothing has a route to it.

Aside from maybe power cycling the device I’m not sure how to fix this though. Normally z2m will try to do route discovery if one is not available.

@Koenkk the discovery bit is not unique to TI firmware, right? Conbee should do the same.

millionsofjeffries commented 2 years ago

Hi @sjorge and thanks for your post. Indeed, power cycling the device brings it straight back online, but it does tend to go offline hours later.

Forgive my ignorance of how it works but I assume the coordinator holds a routing table of the route to each device. Is there a way of forcing a refresh / route discovery of this table manually or does the adapter firmware handle all this? I guess power cycling the device updates the routing table on the coordinator.

Perhaps there's an automated refresh that is happening from time to time which is why it comes back online on its own sometimes but I haven't logged the exact timings yet. Thanks for your help!

sjorge commented 2 years ago

Sort of, the coordinator does keep a table, but every device does not need to be in it.

If it needs one that it's missing it will try to discover it. For battery powered devices usually the router it is using will have a route to it and reply to those discovery requests... although my exact knowledge of how the mesh/discovery/routing works is a bit hazy as I mostly deal with adding new devices and not the lower layers that interact with the mesh.

Power cycling a device will make it send a new Device Announcement, so if it somehow dropped from the mesh it will re-announce its presence.
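The difference between the two delivery paths described in this thread can be illustrated with a toy model. This is only a sketch with a made-up four-node mesh and hand-written next-hop tables; it leaves out route discovery, retries and everything else a real Zigbee stack does, and is meant purely to show why a flooded group message can arrive where a routed unicast cannot.

```python
from collections import deque

# Hypothetical toy mesh: node -> neighbours within radio range.
MESH = {
    "coordinator": {"router_a"},
    "router_a": {"coordinator", "bulb_1", "bulb_2"},
    "bulb_1": {"router_a", "bulb_2"},
    "bulb_2": {"router_a", "bulb_1"},
}

# Hypothetical next-hop tables: node -> {destination: next hop}.
# Nobody holds a route to bulb_2 here.
ROUTES = {
    "coordinator": {"bulb_1": "router_a"},
    "router_a": {"bulb_1": "bulb_1"},
}


def broadcast(mesh, source):
    """Flood: every node that hears the message repeats it once."""
    reached, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in mesh[node]:
            if neighbour not in reached:
                reached.add(neighbour)
                queue.append(neighbour)
    return reached


def unicast(routes, source, destination, max_hops=10):
    """Follow next-hop tables; fail as soon as a hop has no route."""
    node = source
    for _ in range(max_hops):
        if node == destination:
            return True
        next_hop = routes.get(node, {}).get(destination)
        if next_hop is None:
            return False  # a real stack would attempt route discovery here
        node = next_hop
    return node == destination


if __name__ == "__main__":
    print("bulb_2" in broadcast(MESH, "coordinator"))  # group-style delivery
    print(unicast(ROUTES, "coordinator", "bulb_2"))    # direct delivery
```

In this model the flood reaches bulb_2 through router_a, while the unicast stops at the coordinator because no table holds an entry for bulb_2.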

okastl commented 2 years ago

@Koenkk the discovery bit is not unique to TI firmware, right? Conbee should do the same.

I frequently see this problem after a restart of z2m. Usually 1-3 out of 85 devices are no longer controllable. If they are members of a group, they react to group commands, and I use a TI stick zStack3x0. So I am not sure, if this is Conbee related. Power cycling a non responsive device brings it back, but this can be a major pain sometimes. E.g. for fixed light panels or blinds I need to cut power with the circuit breaker / fuse. The devices which get "lost" are different. Sometimes a Paulmann light panel, sometimes a Hue bulb, a Hue Bloom light... They are not located nearby, so it is unlikely there is one "bad" router in the way. I am still trying to see a pattern...

Koenkk commented 2 years ago

@chrishae is this a known issue with the Conbee II adapter? Ideally the conbee should do a route request when it doesn't have a network route.

sehraf commented 2 years ago

Not sure if this is exactly the same or something different, but here you go:

I'm seeing something similar, too - from time to time, not within hours, though.

Mostly affects my IKEA LED1732G11 (i have two). Seemingly random, one turns "unavailable". Controlling via a group works fine. As @millionsofjeffries said, power cycling always helps. They do not come back online on their own (at least i haven't observed this yet, they simply stay unavailable). I've also seen this with a Philips 8718696485880 but only a few times (i have three). So this might just be a fluke. Also, this only happens to the above mentioned devices, other lamps stay "available" as expected.

I'm using a TI CC2531 with version 20211115 (and whatever was the last official version before). And always the latest release of Zigbee2MQTT.

ChrisHae commented 2 years ago

The ConBee2 FW takes care of route discovery automatically. I would recommend installing the newest FW to get the latest routing fixes, using GCFFlasher either as shipped with deCONZ (https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Update-deCONZ-manually) or installed independently (https://github.com/dresden-elektronik/gcfflasher).

millionsofjeffries commented 2 years ago

Hi @ChrisHae I'm already using 0x26720700.bin on my conbee2, which I believe is the latest. Are there any newer or beta versions to try?

ChrisHae commented 2 years ago

26720700 is the latest. And you say you have a good mesh with high LQI values? Is it possible for you to sniff the Zigbee traffic with Wireshark?

millionsofjeffries commented 2 years ago

@ChrisHae is that possible with my configuration? I have a conbee2 running as the coordinator - I've seen there's a sniffing firmware for it, but if I flash that then I won't have a zigbee network running to sniff, would I? Yes, I have a good mesh with high LQI values on all devices, including the ones that keep dropping off.

ChrisHae commented 2 years ago

You would need a second usb stick - ConBee (https://phoscon.de/en/conbee/software#zshark) or CC2531 (https://www.zigbee2mqtt.io/advanced/zigbee/04_sniff_zigbee_traffic.html#_3-sniffing-traffic).

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

Klikini commented 2 years ago

Can we reopen this? I am having the same problem with my Conbee 2 running the latest firmware (0x26720700) and Zigbee2MQTT 1.25.1. I have IKEA, Hue, Xiaomi, Stelpro, and some other devices, and it's only the Hue devices (bulbs & Bloom) that drop off. Sending commands to their group still works.

I still have my CC2652R stick (which worked for everything but the Stelpro thermostats, so I switched to the Conbee). Would this work for sniffing?

ezfrag2021 commented 2 years ago

Same issue with me and only for a single Hue Iris. I don't want to remove and re-pair it because the Hue Iris is really difficult to get into pairing mode. It can only be done by pairing it with a Philips Hue hub and then unpairing it, which puts it into pairing mode. This is a huge pain as I would have to go and dig the Hue hub out of the cupboard and set it up with the app again, so I may just leave it as I only ever use the bulb as part of a group.

radionurd commented 1 year ago

I'm having the same issue with a single Hue Iris. Other Hue light bulbs and Hue motion sensors are working fine. Recently removed my hue bridge and connected all hue lights to Zigbee2mqtt with a Conbee II stick. I already had Tradfri lights and switches on z2m, all working fine. Just this one Hue Iris disconnecting after a few minutes.

Klikini commented 1 year ago

@ezfrag2021 do you have a Hue dimmer switch? I use it to put bulbs into pairing mode regardless of what they're already paired to, if anything. Hold it really close to the bulb (nearly touching works best) and then press and hold the power and hue buttons at the same time until the bulb starts blinking. If it doesn't work, try holding it in a different spot on the bulb.

radionurd commented 1 year ago

@ezfrag2021 do you have a Hue dimmer switch? I use it to put bulbs into pairing mode regardless of what they're already paired to, if anything. Hold it really close to the bulb (nearly touching works best) and then press and hold the power and hue buttons at the same time until the bulb starts blinking. If it doesn't work, try holding it in a different spot on the bulb.

I have a dimmer switch and tried your solution several times. Iris starts blinking and you can pair it again to z2m. All seems to work fine but after a while Iris becomes unresponsive again.

Klikini commented 1 year ago

@radionurd I was just responding to "the Hue Iris is really difficult to get into pairing mode". I still haven't found a solution to the devices becoming unresponsive in Z2M (aside from controlling groups) so I moved mine back to the Hue bridge for now :/

I still think this issue should be reopened.

TheJulianJES commented 1 year ago

After having moved my TRÅDFRI lights to a separate Zigbee network, I never again had any issues with my Hues going unavailable for hours at a time (where I could still control them through groups).

formatBCE commented 1 year ago

I got this issue with Sengled light bulbs. Power cycling doesn't help. Re-pairing is weird: i have 6 on one physical switch, so had to delete them and re-pair together - and one didn't pair at all, two are shown as unavailable - however, group on/off works for all of them. It's super weird. I use ZZH Electrolama stick, and it worked flawlessly for almost 2 years.

ezfrag2021 commented 1 year ago

I solved the issue by further deconflicting the Zigbee network from Wi-Fi by changing the channel.

I think some Hue bulbs are really sensitive to 2.4GHz interference even when it's only overlapping at the tail.
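To check how a given Zigbee channel sits relative to nearby Wi-Fi channels, a rough calculation from the nominal centre frequencies helps. The sketch below assumes a 22 MHz Wi-Fi channel width and a 2 MHz Zigbee channel width as parameters; real transmitters also have spectral tails beyond the nominal width, which this simple model ignores, so a channel reported as clear here can still see some interference.

```python
ZIGBEE_CHANNELS = range(11, 27)  # 2.4 GHz Zigbee channels 11..26


def zigbee_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Zigbee channel (11..26)."""
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Wi-Fi channel (1..13)."""
    return 2407 + 5 * channel


def overlaps(zigbee_channel, wifi_channel, wifi_width=22.0, zigbee_width=2.0):
    """True if the nominal channel bands overlap (widths are assumptions)."""
    distance = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return distance < (wifi_width + zigbee_width) / 2


def clear_zigbee_channels(wifi_channels):
    """Zigbee channels that overlap none of the given Wi-Fi channels."""
    return [z for z in ZIGBEE_CHANNELS
            if not any(overlaps(z, w) for w in wifi_channels)]


if __name__ == "__main__":
    print(clear_zigbee_channels([1, 6, 11]))
```

Passing the Wi-Fi channels actually visible at your location (from a Wi-Fi scanner) gives a shortlist of Zigbee channels to try.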

formatBCE commented 1 year ago

I live in an apartment building, so it's impossible to get away from all the interference, unfortunately. But I will check the bands today - probably some neighbour recently put 2.4 GHz WiFi on my channel.

jhrath commented 1 year ago

My setup worked for a few weeks, but now ALL powered devices (7) go offline, while the battery ones stay connected. I keep resetting and re-adding some of them, but they fall off again immediately. It's driving me crazy, and I don't understand it. It's a combination of Hue (LED strip & bulb) and Ikea (light and repeaters).

radionurd commented 1 year ago

Two days ago I downgraded my Conbee II to firmware 0x26580700 and now all my devices are still online. I think I'm gonna stick to this firmware. (Using Hue light bulbs, switches and motion sensors, also Tradfri light bulbs, switches and motion sensors and also Aqara switches and curtain drivers)

ThatMishakov commented 8 months ago

I've been having the same issue with two 1st gen Hue Bloom lights: they go offline and cause havoc around them in general. They are surrounded by a lot of routers (61 in the network, mostly TuYa and newer Philips lights, also Innr, Ikea, Osram). This was with a ZBDongle-P (CC2652P). I've recently switched to a ZBDongle-E (since zstack was crashing every other day, even on 20231112, but that's a separate problem), and these lights are even more finicky with EFR32.

I did some sniffing on the EFR32 dongle before flashing it, and it looked like these lights were flipping the routing address endianness in routing packets. I don't know much about the protocol, but that seems very weird to me. Could that be the issue, or is that a normal thing for older devices? [Attached screenshots: image, image-1, image-2]
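For readers unfamiliar with what a flipped address looks like in a capture: multi-byte fields in Zigbee frames are normally sent least-significant byte first, so a 16-bit network address with its bytes swapped shows up as a different address altogether. The sketch below uses a made-up address (0x1A2B) and hypothetical helper names purely to illustrate the comparison; it says nothing about what the Bloom lights actually send.

```python
def nwk_address_bytes(address, byteorder="little"):
    """Encode a 16-bit network address as two bytes (little-endian by default)."""
    return address.to_bytes(2, byteorder)


def byte_swapped(address):
    """Return the 16-bit address with its two bytes exchanged."""
    return int.from_bytes(address.to_bytes(2, "little"), "big")


def looks_byte_swapped(observed, expected):
    """True if observed equals expected with its bytes swapped (and differs)."""
    return observed != expected and observed == byte_swapped(expected)


if __name__ == "__main__":
    expected = 0x1A2B  # made-up address for illustration
    print(nwk_address_bytes(expected).hex())         # bytes as sent on the wire
    print(hex(byte_swapped(expected)))               # what a flipped field decodes to
    print(looks_byte_swapped(0x2B1A, expected))
```

Comparing addresses seen in routing frames against the known addresses of your devices this way would show whether a mismatch is a plain byte swap or something else.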