Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

CC2652 - Gledopto GL-C-002P (Mini 5in1) unable to pair after factory reset / mode switch #11202

Closed MartB closed 2 years ago

MartB commented 2 years ago

TL;DR

This issue is now fixed, read https://github.com/Koenkk/zigbee2mqtt/issues/11202#issuecomment-1128910661

The device is unable to pair again once it has been paired with a z2m instance using a CC2652 or any other adapter running Z-Stack 3.x (Zigbee stack revision >= 0x15). Do not expect this product to work reliably in your projects at this time.

~~The firmware/mode change of the device does not correctly wipe the Zigbee network data. Gledopto is aware but did not confirm nor deny the issue yet.~~

Manual flash reset (requires hardware modification): the manual reset restores connectivity to z2m, and the first pairing attempt succeeds instantly after the reset. See: https://github.com/Koenkk/zigbee2mqtt/issues/11202#issuecomment-1053144581

Timeline

~~Update: My working hypothesis based on the analysis made further down this thread is that there is some firmware issue with the trust center link key handling within Z-stack 3.x. The device never gets the proper replies and drops off the network.~~

Update (27.02.22): Only recent Zigbee stacks are affected (stack revision >= 0x15, i.e. Z-Stack 3.x). Part of the issue has been found in the GL-C-002P device reset / type switching logic: not all persistent Zigbee network parameters are cleared correctly. Manufacturer notified, waiting for replies.

Current workaround: Drop zigbee stack revision to 0, test firmware available here: https://github.com/Koenkk/zigbee2mqtt/issues/11202#issuecomment-1052137967 (ty @lorenz)

Todo (personal)


What happened?

Trying to pair the GL-C-002P, aka the Mini version of the 5in1 controller (GL-C-001P). It initially pairs once but never works after a power cycle; when trying to re-pair the device, it never gets paired correctly. It constantly leaves the network and re-joins as long as joining is enabled in z2m. The device is correctly detected as GL-C-008P, but it doesn't "stick" to the network.

What did you expect to happen?

Expected a normal pairing process to happen, just like with the older controllers.

How to reproduce it (minimal and precise)

I guess you just have to buy the mini 5in1, pair it once, then power cycle or reset it, and it never joins your network again. Until someone confirms this device works, I assume this is the case for all of these new controllers.

Zigbee2MQTT version

1.23.0

Adapter firmware version

20220103

Adapter

SLAESH CC2652RB

Debug log

Zigbee2mqtt: log.txt
Herdsman: debug_glc002p.log
Sniffs:

Additional infos

SoC used by the device: TELINK TLSR8258F512ET32
Video of the failing pairing process: https://www.youtube.com/watch?v=JEhOAXJ_n3k

Koenkk commented 2 years ago

Could you provide the herdsman debug logging when pairing this device?

See https://www.zigbee2mqtt.io/guide/usage/debug.html on how to enable the herdsman debug logging. Note that this is only logged to STDOUT and not to log files.

MartB commented 2 years ago

@Koenkk Thanks for your help, I appreciate it.

I kept everything in there as I'm going to change secrets anyway. This is the full Docker log from zigbee2mqtt with about 3-4 pairing attempts. Started from a completely clean state besides the coordinator firmware flash.

debug_glc002p.log

Koenkk commented 2 years ago

@MartB it seems the device just leaves the network. Since Gledopto advertises that their devices work with the Hue bridge, do you have one? If yes, please check if it works by pairing it to the Hue bridge.

MartB commented 2 years ago

@Koenkk I tried with a borrowed Echo Show and it seems to work there; it did not work on the first pairing attempt, but it got picked up. Z2M doesn't work though, sadly, so I guess there's nothing you can do?

Gledopto does not seem to be super keen on trying to fix it as well.

Koenkk commented 2 years ago

@MartB do you maybe have a CC2531 stick? If yes, try sniffing the traffic when pairing the device with the echo. https://www.zigbee2mqtt.io/advanced/zigbee/04_sniff_zigbee_traffic.html#with-cc2531

MartB commented 2 years ago

@Koenkk Sadly I don't. Can you recommend an Amazon listing / source to obtain this from within Europe? I guess I need the debugger + stick if I want to try?

Koenkk commented 2 years ago

@MartB I usually buy from aliexpress, you indeed need the debugger + stick. But another sonoff dongle with the sniffer firmware should also do it

MartB commented 2 years ago

> @MartB I usually buy from aliexpress, you indeed need the debugger + stick. But another sonoff dongle with the sniffer firmware should also do it

AliExpress would have been viable, but the holidays are delaying things. I got one from Amazon and I'm going to see if it works, haha. It should be here next week, and then I'm going to capture both pairing processes.

Thanks for your help so far, maybe we can figure something out. Now that a second user has reported the same, I'm sure it's not a "me" problem.

basmeyer commented 2 years ago

Would the Sonoff ZBDongle Plus (CC2652) work to flash with a sniffer firmware? @Koenkk Found this https://software-dl.ti.com/simplelink/esd/simplelink_cc13x2_26x2_sdk/3.20.00.68/exports/docs/zigbee/html/zigbee/packet_sniffer.html at Texas Instruments, but PC... the thing is, all Mac here. But plenty of Pi's. :) Any links?

basmeyer commented 2 years ago

As a summary of my experience with these fine Zigbee low-voltage DC LED controllers, of which we will probably hear more soon because the features they possess at this purchase price are almost mind-blowing: I currently have 5, of which I was able to pair 2 completely successfully with a tiny hiccup. But 3 of them are "expelled" from proper pairing. They are recognised but, depending on the integration, either do not work or are kicked out of the network almost straight away.

The hiccup is that the main entity shows as unavailable, but after enabling it and waiting a couple of minutes they work well.

ZHA works perfectly with the 2 (out of 5) properly paired devices. Zigbee2MQTT currently works with none of them, not even the same 2 mentioned above. After pairing they last 10-15 seconds and then leave the network.

I hope I can assist with the Zigbee sniffing.

Bysmyyr commented 2 years ago

The same happens to me: it connects and then disconnects after about a minute. I have a Tubes CC2652 LAN-connected adapter.

Koenkk commented 2 years ago

> Would the Sonoff ZBDongle Plus (CC2652) work to flash with a sniffer firmware? @Koenkk Found this https://software-dl.ti.com/simplelink/esd/simplelink_cc13x2_26x2_sdk/3.20.00.68/exports/docs/zigbee/html/zigbee/packet_sniffer.html at Texas Instruments, but PC... the thing is, all Mac here. But plenty of Pi's. :) Any links?

Yes, that should do it.

wtip commented 2 years ago

@MartB any luck with the sniffing?

basmeyer commented 2 years ago

Has anyone contacted the manufacturer to ask what they have changed compared to the 001P?

Bas

On 19 Feb 2022, at 21:59, William @.***> wrote:

 @MartB any luck with the sniffing?


wtip commented 2 years ago

@basmeyer I just sent an email to service@gledopto.com but I'm doubtful that I'll get a useful response.

Response received 2022-2-20: "Because this product uses ultra-thin design, we use different chips. At present, there are some compatibility problems with zigbee2mqtt. We have feedback to the engineers who are dealing with it. We recommend that you use these gateways for the time being, like Philips hue bridge, smartthings, amazon echo studio, etc."

basmeyer commented 2 years ago

Thanks. I hope they realize that making this product usable in open source systems will always contribute to a sales boost.

kirovilya commented 2 years ago

I received such a device today. It paired successfully in z2m on a CC2531 stick. When the operating mode is changed, the device leaves the network and pairs again with a new behavior model:

RGBCCT - GL-C-008P
RGBW - GL-C-007P
RGB - GL-C-003P
CCT - GL-C-006P
Dimmer - GL-C-009P
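The mode-to-model mapping listed above can be written down as a small lookup table. This is only an illustrative sketch: the names `MODE_TO_MODEL` and `expected_model` are hypothetical and not part of zigbee2mqtt; only the mode and model strings come from this comment.

```python
# Mode-to-model mapping as reported in this thread for the GL-C-002P.
# The dictionary and helper names are illustrative, not a zigbee2mqtt API.
MODE_TO_MODEL = {
    "RGBCCT": "GL-C-008P",
    "RGBW": "GL-C-007P",
    "RGB": "GL-C-003P",
    "CCT": "GL-C-006P",
    "Dimmer": "GL-C-009P",
}


def expected_model(mode: str) -> str:
    """Return the model ID the controller is expected to report in a mode."""
    return MODE_TO_MODEL[mode]


if __name__ == "__main__":
    for mode in MODE_TO_MODEL:
        print(f"{mode:7s} -> {expected_model(mode)}")
```

This is handy when checking a pairing log: if the controller is switched to RGBCCT mode but the interview reports something other than GL-C-008P, the mode switch probably did not take effect.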

kirovilya commented 2 years ago

As for pairing problems, you may need to pay attention to RF noise, since this device can work with RF remotes, maybe this somehow interferes with the process and throws it out of the network.

I did not find such problems with my device - everything works fine

wtip commented 2 years ago

@kirovilya Can you share your Gledopto device firmware version?

Mine is:

firmware build date:20211223    
firmware version: 10251203

and if it matters I'm also using a SLAESH CC2652RB usb adapter as the coordinator.

kirovilya commented 2 years ago

the same image

MartB commented 2 years ago

> @MartB any luck with the sniffing?

Yes, I'm going to sniff the Amazon Echo device tomorrow, but I sniffed the pairing process with my network, and the device leaves after sending some APS: Command that's encrypted. I provided the dumps above. I hope @Koenkk can find some time to look at it, as this is way over my head currently.

> Has anyone contacted the manufacturer to ask what they have changed compared to the 001P?

I did, and they do not offer support beyond the listed hubs. I'm afraid this one is going to be on us.

> As for pairing problems, you may need to pay attention to RF noise, since this device can work with RF remotes, maybe this somehow interferes with the process and throws it out of the network.
>
> I did not find such problems with my device - everything works fine

I doubt RF noise is related. I do not own any of these Mi-Boxer RF protocol 2.4 GHz remotes, and if you check the video I made, it does not get any better than this RF-wise. There is some software incompatibility involving this Telink stuff.

If you want to risk "soft-bricking" the device, kick it from your network, reset it, and then try to re-pair it. And if you have a working sniffing setup, please check if it also sends the Update Device, Device Left packet straight after sending the same APS: Command.

image

Assumptions:

- Maybe it's related to the ZCL Green Power support this Telink SoC has.

Update: I ran and quickly borrowed the echo anyway hehe

image Left: Echo, Right: Z2M

~~Its pretty clear that no green power stuff is happening with the echo, so that would be my first attempt. How can this be disabled in software / firmware?~~

kirovilya commented 2 years ago

I confirm! The problem is observed on the CC2652 stick (Z-Stack 3) and is absent on the CC2531 (Z-Stack 1.2).

gl-c-002p.zip

kirovilya commented 2 years ago

One more test: it works with an EFR32 EZSP stick.

gl-c-002p_ezsp.zip

MartB commented 2 years ago

> I confirm! the problem is observed on the stick cc2652 (zstack 3) and is absent on cc2531 (zstack 1.2) gl-c-002p.zip

Thank you for giving this a shot! I think this is going to be the "missing link", because you have the keys to decrypt the actual traffic for 1) failure and 2) success here. Let's hope you or someone else can figure this out. If you are willing to share your network key in private with Koenkk or anyone else trusted and capable enough, this could shed some more light on the issue.

lorenz commented 2 years ago

@MartB The key is in the capture, just decrypt with the default link key and then grab it from the transport key message.
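For anyone following along: the "default link key" mentioned here is the well-known default Trust Center link key, the ASCII string `ZigBeeAlliance09`. Below is a minimal sketch that prints it in colon-separated hex, which is one form that can be pasted into Wireshark's ZigBee pre-configured keys. The helper name `wireshark_key` is made up for this example.

```python
# The well-known default Trust Center link key is the 16-byte ASCII string
# "ZigBeeAlliance09". The helper name below is hypothetical.
DEFAULT_TC_LINK_KEY = b"ZigBeeAlliance09"


def wireshark_key(key: bytes) -> str:
    """Format a 16-byte Zigbee key as colon-separated hex."""
    if len(key) != 16:
        raise ValueError("Zigbee keys are 128 bits (16 bytes) long")
    return key.hex(":")  # bytes.hex(sep) needs Python 3.8+


if __name__ == "__main__":
    print(wireshark_key(DEFAULT_TC_LINK_KEY))
    # prints 5a:69:67:42:65:65:41:6c:6c:69:61:6e:63:65:30:39
```

Once this key is configured, the Transport Key message of a fresh join decrypts and exposes the network key, which can then be added as a second key to decrypt the rest of the capture.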

MartB commented 2 years ago

> @MartB The key is in the capture, just decrypt with the default link key and then grab it from the transport key message.

Thanks for that, much appreciated!

lorenz commented 2 years ago

I've started digging around the capture and I think packet 122 in the 2652 capture is interesting. It's extremely short, not decryptable and doesn't occur at all on the 2531 capture. Also the captures start to heavily diverge afterwards. Directly preceding it is the node descriptor info from the coordinator where the only significant difference is the stack compliance revision increasing from 0 to 22. The device then keeps issuing node descriptor requests and these weird undecryptable command frames every 5 seconds for 4 cycles before leaving exactly 20s after it has issued the first node descriptor request.
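The stack compliance revision mentioned above is carried in the upper bits of the node descriptor's 16-bit server mask. Here is a minimal decoding sketch, assuming the standard layout (bit 0: primary trust center, bit 6: network manager, bits 9-15: stack compliance revision). The function names are hypothetical, and the 0x15 threshold is the one reported in the opening post of this issue, not something taken from a specification.

```python
# Decode the Zigbee node descriptor "server mask" field.
# Assumed layout: bit 0 = primary trust center, bit 6 = network manager,
# bits 9-15 = stack compliance revision.
PRIMARY_TRUST_CENTER_BIT = 1 << 0
NETWORK_MANAGER_BIT = 1 << 6
STACK_REVISION_SHIFT = 9
STACK_REVISION_MASK = 0x7F

# Threshold reported in this thread: revisions >= 0x15 trigger the problem.
AFFECTED_MIN_REVISION = 0x15


def decode_server_mask(mask: int) -> dict:
    """Split a 16-bit server mask into the fields discussed in this thread."""
    return {
        "primary_trust_center": bool(mask & PRIMARY_TRUST_CENTER_BIT),
        "network_manager": bool(mask & NETWORK_MANAGER_BIT),
        "stack_revision": (mask >> STACK_REVISION_SHIFT) & STACK_REVISION_MASK,
    }


def triggers_issue(mask: int) -> bool:
    """True if the advertised stack revision is in the affected range."""
    return decode_server_mask(mask)["stack_revision"] >= AFFECTED_MIN_REVISION


if __name__ == "__main__":
    cc2652_mask = (22 << STACK_REVISION_SHIFT) | PRIMARY_TRUST_CENTER_BIT
    cc2531_mask = PRIMARY_TRUST_CENTER_BIT
    print(hex(cc2652_mask), decode_server_mask(cc2652_mask), triggers_issue(cc2652_mask))
    print(hex(cc2531_mask), decode_server_mask(cc2531_mask), triggers_issue(cc2531_mask))
```

With revision 22 (as seen from the CC2652) the device takes the failing path, while revision 0 (as seen from the CC2531) does not, which matches the divergence between the two captures.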

MartB commented 2 years ago

@lorenz This might explain it: Source: https://e2e.ti.com/blogs_/b/process/posts/the-key-to-security-zigbee-3-0-s-security-features

I guess we are seeing the attempted (or working) key update, as the stack compat level is high enough?

image This is what the echo sends, then the APS sent by the device is a few acks, the "encrypted" data and then we get a transport key in the echo dump.

image

Filter dump for the pcaps:

# Get all the transport key messages
zbee_aps.cmd.id == 0x05

# Get only the "non standard" trust center key updates.
zbee_aps.cmd.key_type == 0x04

# Get encrypted data that does not have proper decryption available
zbee_sec.encrypted_payload
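The same display filters can be run from the command line against a capture file. A minimal sketch, assuming `tshark` (Wireshark's command-line tool) is installed; it only builds the command and does not execute it. The capture file name and the dictionary keys are placeholders chosen for this example.

```python
import shlex

# Display filters copied from the filter dump above; the short names on the
# left are made up for this example.
FILTERS = {
    "transport_key": "zbee_aps.cmd.id == 0x05",
    "tc_link_key_update": "zbee_aps.cmd.key_type == 0x04",
    "undecrypted": "zbee_sec.encrypted_payload",
}


def tshark_command(capture_path: str, filter_name: str) -> list:
    """Build a tshark argv: -r reads a capture file, -Y applies a display filter."""
    return ["tshark", "-r", capture_path, "-Y", FILTERS[filter_name]]


if __name__ == "__main__":
    # "capture.pcapng" is a placeholder file name.
    for name in FILTERS:
        print(shlex.join(tshark_command("capture.pcapng", name)))
```

Pass the returned list to `subprocess.run` to actually run the filter. Decryption keys still have to be configured in Wireshark/tshark beforehand, otherwise only the `undecrypted` filter will match anything useful.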

lorenz commented 2 years ago

But TI wrote that article so I'd assume that their Z-Stack implementation does not have a broken key update function. Maybe this is something Zigbee2MQTT needs to handle? I'm not very familiar with the interface between Zigbee2MQTT and Z-Stack.

lorenz commented 2 years ago

I got it to work by hacking the coordinator firmware (just zeroed out the Zigbee stack version for GLEDOPTO OIDs), but that's not really a good solution. If you want to try, you can grab the hacked firmware at https://blob.dolansoft.org/public/cc13_2652p-coordinator-with-manager.hex.

The issue is really weird as for the EZSP trace posted above the device behaves normally, but for the TI trace it uses an unknown link key to encrypt the key request. For the Echo it does the same but the Echo is somehow able and willing to decrypt the response with the weird link key. Since pairing worked for the first time my best guess is that the stack does not delete the link key upon leaving the network and reuses that link key when requesting a new one. But Zigbee2MQTT has already deleted that link key and thus Z-Stack can't decrypt the request. Maybe the Echo keeps the link key around indefinitely and is thus able to decrypt and just ignores the protocol violation?
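The hypothesis above can be illustrated with a toy model. This is not real Zigbee code and performs no cryptography: "decryption" is reduced to checking whether the coordinator still knows the key the device used. All class and function names are invented for illustration, and the model only encodes the guess described in this comment.

```python
import secrets

DEFAULT_KEY = "default-link-key"  # stands in for the well-known default key


class Coordinator:
    """Toy trust center that forgets a device's link key on removal."""

    def __init__(self):
        self.link_keys = {}  # device address -> current link key

    def handle_key_request(self, addr, key_used):
        """Return a fresh link key, or None if the request is undecryptable."""
        known = {DEFAULT_KEY, self.link_keys.get(addr)}
        if key_used not in known:
            return None
        self.link_keys[addr] = secrets.token_hex(16)
        return self.link_keys[addr]

    def remove_device(self, addr):
        self.link_keys.pop(addr, None)


class Device:
    """Toy end device; reset_clears_key=False models the suspected bug."""

    def __init__(self, addr, reset_clears_key):
        self.addr = addr
        self.reset_clears_key = reset_clears_key
        self.stored_key = None

    def join(self, coordinator):
        # A stale stored key is reused to encrypt the key request.
        key_used = self.stored_key or DEFAULT_KEY
        new_key = coordinator.handle_key_request(self.addr, key_used)
        if new_key is None:
            return False  # no usable reply: the device leaves the network
        self.stored_key = new_key
        return True

    def factory_reset(self):
        if self.reset_clears_key:
            self.stored_key = None


def pair_reset_repair(reset_clears_key):
    """Pair, remove from the coordinator, factory reset, then pair again."""
    coordinator = Coordinator()
    device = Device("0x1234", reset_clears_key)
    first = device.join(coordinator)
    coordinator.remove_device(device.addr)
    device.factory_reset()
    second = device.join(coordinator)
    return first, second


if __name__ == "__main__":
    print("buggy reset:  ", pair_reset_repair(reset_clears_key=False))
    print("correct reset:", pair_reset_repair(reset_clears_key=True))
```

In this model the first pairing always succeeds and only the re-pair after a faulty reset fails, which is the pattern reported throughout this issue.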

MartB commented 2 years ago

> I got it to work by hacking the coordinator firmware (just zeroed out the Zigbee stack version for GLEDOPTO OIDs), but that's not really a good solution. If you want to try, you can grab the hacked firmware at https://blob.dolansoft.org/public/cc13_2652p-coordinator-with-manager.hex.
>
> The issue is really weird as for the EZSP trace posted above the device behaves normally, but for the TI trace it uses an unknown link key to encrypt the key request. For the Echo it does the same but the Echo is somehow able and willing to decrypt the response with the weird link key. Since pairing worked for the first time my best guess is that the stack does not delete the link key upon leaving the network and reuses that link key when requesting a new one. But Zigbee2MQTT has already deleted that link key and thus Z-Stack can't decrypt the request. Maybe the Echo keeps the link key around indefinitely and is thus able to decrypt and just ignores the protocol violation?

Great job @lorenz! I wanted to try this as well but ran out of time last week. I was 100% positive dropping the stack rev to 0 would work, as the key exchange was absolutely broken.

I tried to decrypt this frame with all the keys I found in my dump ... Hypothesis: if the mini controller is reset, it never clears its trust center link keys correctly, and that's what we are seeing here? I remember that the first pairing always worked fine and then the device refused to pair, so it has to be related somehow.

We just need to figure out a way to patch the firmware in order to handle this protocol violation in a more "secure" way. I don't think Z-Stack is doing something wrong after digging through the source code.

IIRC the Echo also needed a few pairing attempts and did not work on the first try. I saw quite some traffic during the Alexa device search. I will try to dump the entire pairing process with all retries the Echo does tomorrow; maybe that helps us somehow.

lorenz commented 2 years ago

@MartB Yeah, I'm pretty sure that whatever Zigbee stack the device is using is not clearing the link key when it's leaving. But without a dump from a fresh pairing process that's hard to confirm and I currently don't have two USB sticks so I can't sniff. I've been looking at what the Echo does and it looks like it just answers with the default link key (which we could patch into the firmware), but the device does not want to verify that key so this key exchange doesn't seem to actually work. The one that's working is using unknown link keys. So this probably needs further sniffing with a coordinator that was not previously paired (or in case that key is not per-coordinator even a new device).

MartB commented 2 years ago

> @MartB Yeah, I'm pretty sure that whatever Zigbee stack the device is using is not clearing the link key when it's leaving. But without a dump from a fresh pairing process that's hard to confirm and I currently don't have two USB sticks so I can't sniff. I've been looking at what the Echo does and it looks like it just answers with the default link key (which we could patch into the firmware), but the device does not want to verify that key so this key exchange doesn't seem to actually work. The one that's working is using unknown link keys. So this probably needs further sniffing with a coordinator that was not previously paired (or in case that key is not per-coordinator even a new device).

Trying once more to get Gledopto involved. I'm pretty sure I can "factory new" pair one more of these until I run out of attempts. So the following is on my to-do list now:

General

Sniffing to do list (postponed)

Flash analysis

As a last resort I will dump the firmware off the device and see if it contains keys and which params change before and after a reset. Continued in a second comment, as this is getting too long.

MartB commented 2 years ago

Minor victory for people fond of hardware tinkering that want to learn some new stuff. tldr: The device does not fully reset its persistent zigbee network data on factory reset, somehow breaking the key exchange in the process, manually full resetting the flash area solves this.

Preface

I could not keep my hands off this device. I dumped its flash and discovered a region that stores information about the Zigbee networks it was connected to.

Flash analysis

image (Top contains my zeroed memory, bottom is a leftover taken straight after a firmware reset)

It seems like the deletion on factory reset is off by one, as it never gets fully reset. I cleared 24 KiB of flash area at offset 0x7A000 and the device instantly paired with z2m again and stays paired.

I later found a manual that mentions this too (see page 23): http://wiki.telink-semi.cn/tools_and_sdk/Demo/B91_Zigbee/Telink_Zigbee_Overview_CN.pdf

So I can at least revive / full reset the module at will by using https://github.com/pvvx/TlsrComSwireWriter

  1. Building a small board and grabbing a UART serial adapter
  2. Opening the case (a pain)
  3. Soldering 3 wires to the exposed GND, SWS and VCC pads.
  4. Hooking it up to the pc and running python ./TLSR825xComFlasher.py -p /dev/ttyUSB1 --tact 70 es 0x7A000 0x6000
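As a sanity check on the erase arguments in the last step, the numbers can be worked through in a few lines. This is a sketch under two assumptions that do not come from this thread: that "F512" in the chip name TLSR8258F512ET32 means 512 KiB of flash, and that the flash erases in 4 KiB sectors. The function name `erase_plan` is hypothetical.

```python
# Work out what "es 0x7A000 0x6000" covers.
FLASH_SIZE = 512 * 1024   # assumption: "F512" denotes 512 KiB of flash
SECTOR_SIZE = 4 * 1024    # assumption: 4 KiB erase sectors
ERASE_OFFSET = 0x7A000    # from the command above
ERASE_LENGTH = 0x6000     # from the command above


def erase_plan(offset: int, length: int, sector_size: int = SECTOR_SIZE) -> dict:
    """Describe a sector-aligned erase range."""
    if offset % sector_size or length % sector_size:
        raise ValueError("erase range must be sector-aligned")
    return {
        "first_byte": offset,
        "last_byte": offset + length - 1,
        "sectors": length // sector_size,
        "kib": length // 1024,
    }


if __name__ == "__main__":
    plan = erase_plan(ERASE_OFFSET, ERASE_LENGTH)
    print({k: hex(v) if k.endswith("byte") else v for k, v in plan.items()})
    print("ends at flash end:", ERASE_OFFSET + ERASE_LENGTH == FLASH_SIZE)
```

Under these assumptions, 0x6000 bytes is exactly the 24 KiB mentioned above (6 sectors), and the range 0x7A000 to 0x7FFFF is the last 24 KiB of the flash, so the command wipes the tail region without touching the firmware image below it.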

How is this relevant?

Well, @lorenz and I were right: the device indeed remembers parameters it should not, and I would not be surprised if these remembered bytes are the missing keys (not sure if in the clear) or other parameters that the device uses to encrypt the APS Request Key message.

I hope Gledopto will take a look at this and at least fix this bug. Now we only need to figure out how the Echo "overrides" this faulty behavior so that the device still pairs. It does at least explain why pairing also fails there a few times.

patrikulus commented 2 years ago


I'm confused about the schematic: should I connect only TX to SWS, or also RX? Sorry for the dumb question, but I'm new to flashing stuff; so far I have only "tasmotized" a few ESP devices.

MartB commented 2 years ago

@patrikulus You should have success when connecting it like this: https://github.com/pvvx/TlsrComSwireWriter/raw/master/schematicc.gif

Basically, the SWS pad on the controller needs to be run to RX of the UART board and to the TX port, with a 1 - 1.8k resistor in between.

Update for relevancy: No news from Gledopto, I assume we won't get a vendor fix for this and I'm in no position to decide what to do with this device. Forcing stack rev to 0 seems almost impossible as that would affect all GL devices.

@Koenkk Can you give this a quick glance and check what we can do in order to support the device?

Koenkk commented 2 years ago

@lorenz could you try hacking the manufacturer code in the coordinator firmware? The echo sends 0x1217, maybe the default 0x0000 z2m manufacturer code triggers this bug on the gledopto device.

MartB commented 2 years ago

@Koenkk I did (albeit flipped to 0x1712) and it did not work, same behavior.

0x1712 manufacturer ID: znp_LP_CC2652RB_tirtos_ccs.zip

(Based on the latest develop patch, compiled with compiler speed optimizations enabled; that's why it's bigger)

Koenkk commented 2 years ago

@MartB and what about the Network Manager: True? In your post it is true for Echo and false for z2m. In the EZSP sniff from @kirovilya it is also true:

Screenshot 2022-03-27 at 10 02 50

lorenz commented 2 years ago

I hacked in network manager support already, doesn't do anything

Koenkk commented 2 years ago

If I understand correctly, the GL-C-002P doesn't show this behaviour with EZSP right? (sniff: https://github.com/Koenkk/zigbee2mqtt/issues/11202#issuecomment-1046303717)

lorenz commented 2 years ago

Me and @MartB think since it's related to the device retaining link keys even after disassociation it would happen to the EZSP too if reconnected. It also happens with the Echo, but it responds differently and after a few tries it recovers.

czdaniel commented 2 years ago

Gledopto was a universally recommended brand and I bought 2 of these, facing all the same issues I've read about here. It seems that after connecting it to my HomeAssistant Zigbee2MQTT network I can't even connect it to my Amazon Echo Plus (1st gen). It must still be trying to connect to the Zigbee network, even though I disabled joining and blocked the device. Any tips? I've already written to Gledopto, but I don't expect any help there.

lorenz commented 2 years ago

@czdaniel If you have a 3.3V serial adapter you can try the physical reset procedure which resets the flash of the unit. After that switch the unit to the right mode and then pair it. It will work the first time.

basmeyer commented 2 years ago

Will this also work with like 5 of them? After the first successful pairing I had the feeling it did not want to pair with more than 1 device. But I could be wrong, as things got messy when I also tried other modes at the same time. In the end none of them was working any more with either z2m or ZHA. Yeah, perhaps a bit brave to order 5 of these already. ;)

Bas

czdaniel commented 2 years ago

I got a Hue bridge just to test if it works with that, and it does. My Amazon Echo might have other problems. My power supply cables are too beefy for these "easy to use" push-in connectors, which do not inspire confidence, and I accidentally fried one already. I was under the impression from the barebones manual online that these could separately dim 5 LED strips, but this is not the case either. I have a good 140W Meanwell power supply, but when I dim the light the Gledopto emits a high-pitched whine, possibly due to that bad connection on the 12V input. I've ordered a few Miboxer FUT036Zs and we'll see if they perform better.

Koenkk commented 2 years ago

@ptvoinfo any chance your firmware could be flashed on these devices?

ptvoinfo commented 2 years ago

@Koenkk Sorry, but the firmware does not support TELINK chips.

Koenkk commented 2 years ago

@ptvoinfo ah sorry, I incorrectly assumed this device was using a TI CC2652 chip

Gudde660 commented 2 years ago

I'm also experiencing this issue with the GL-S-004P from Gledopto. They paired successfully the first time, and after a factory reset they join the network and disconnect after a few seconds.

Is there something I can do to help?