Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

PTM 215Z EnOcean slow response randomly #22897

Open samuele2723 opened 3 months ago

samuele2723 commented 3 months ago

What happened?

Hello, I'm writing here to get some support for my home switches, because my wife is going crazy with me!

I have a large setup with a Sonoff CC2531 coordinator; my network consists of 190 devices across 4 levels (garage, ground, 1st, 2nd). Overall the network works well, meaning I don't see delays of any kind when I control the lights from the Home Assistant app or from the Z2M interface.

The only random delays I have are when using the switches around the house, which are all PTM 215Z EnOcean. And my wife mainly uses these physical buttons to control the lights.

[https://zigbee2mqtt.io/devices/PTM_215Z.html#enocean-ptm%2520215z]

Sometimes, when I click, the light switches on/off instantly (I could not even say there is a delay). Other times it takes 1 to 3 seconds. This happens even on the same switch at the same moment, meaning that if I keep pushing on/off, the delay changes from press to press.

I can't say it's a hardware issue, because I have 25 of them and all are affected (especially the ones far from the coordinator, which is at garage level), and also because I've used these with a Philips Hue Bridge and they seemed to be more responsive in that setup.

What did you expect to happen?

I expect there to be no delay, or a minimal and repeatable one that is always similar, when I use the switches. They should be almost instant in their responsiveness. I also don't want to see a difference between the ground floor and the second floor. I would like to understand how to prioritize message responsiveness from these switches.

How to reproduce it (minimal and precise)

I can reproduce it every time, and it is not related to which light I turn on, or how (Home Assistant or MQTT). It seems related to the delay of the button transmission of the PTM 215Z EnOcean itself.

It can also be noticed by watching the state change of click actions of the PTM 215Z EnOcean in the dashboard: you can see that the message from the switch is received with a delay.

Zigbee2MQTT version

1.38.0

Adapter firmware version

20210708

Adapter

ZDONGLE P 2652 sonoff zStack3x0

Setup

Home assistant OS with MQTT and Z2m Addon

Debug log

No response

burmistrzak commented 3 months ago

@samuele2723 We have a similar setup, also about the same size, and observed similar things. Mainly delayed responses from EnOcean switches. I believe this started only in the last 3–4 weeks or so.

Have you paired your switches in Unicast mode? Otherwise they might overwhelm your network with broadcast traffic.

samuele2723 commented 3 months ago

Hello there, I didn't know about that. I guess I paired all of them with "enable join" active on all devices, because I always leave the network open when Z2M starts.

So should I re-pair them, selecting the closest router for each? Could you explain more?

burmistrzak commented 3 months ago

Hello there, I didn't know about that. I guess I paired all of them with "enable join" active on all devices, because I always leave the network open when Z2M starts.

I'd suggest you first set permit_join: false, then remove all your EnOcean switches from Z2M. Sorry! 😅 Restart Z2M when you're done, just to be safe.

So should I re-pair them, selecting the closest router for each? Could you explain more?

Yes! Reset every single switch (procedure described here), then begin re-pairing by always selecting the "nearest" Zigbee router (permit join dropdown) for each EnOcean switch.

I always use the main ceiling-mounted light in every room for this job, because it cannot accidentally get unplugged. 😉
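
For anyone who prefers to script the per-router pairing instead of using the permit join dropdown, here is a minimal sketch of the MQTT request involved. It assumes the Zigbee2MQTT 1.x MQTT API (topic `bridge/request/permit_join` with `value`, `time` and `device` fields) and the default `zigbee2mqtt` base topic; the router name `hallway_ceiling` and the helper `buildPermitJoinVia` are made up for illustration. It only builds the message, publishing it is left to whatever MQTT client you use.

```typescript
// Sketch only. Topic and payload follow the Zigbee2MQTT 1.x MQTT API as
// I understand it; check the docs for the version you are running.
interface PermitJoinRequest {
    value: boolean; // true opens the network for joining
    time: number;   // seconds until joining closes again
    device: string; // friendly name of the router that should accept the join
}

interface MqttMessage {
    topic: string;
    payload: string;
}

// "zigbee2mqtt" is the default base topic; adjust if yours differs.
function buildPermitJoinVia(
    router: string,
    seconds: number,
    baseTopic: string = "zigbee2mqtt",
): MqttMessage {
    const request: PermitJoinRequest = { value: true, time: seconds, device: router };
    return {
        topic: `${baseTopic}/bridge/request/permit_join`,
        payload: JSON.stringify(request),
    };
}

// "hallway_ceiling" is a hypothetical friendly name of the nearest router.
const permitJoinMessage = buildPermitJoinVia("hallway_ceiling", 120);
```

Limiting the join window to one named router is what makes the switch commission through that specific GP proxy rather than through whichever router hears it first.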

burmistrzak commented 3 months ago

@Koenkk Did we change something with regards to GreenPower devices recently? 🤔 Otherwise, it might all just be a big coincidence.

@samuele2723 Do your EnOcean switches control individual lights or entire Zigbee groups?

Koenkk commented 3 months ago

No nothing has changed regarding GP devices

burmistrzak commented 3 months ago

No nothing has changed regarding GP devices

@Koenkk That's what I thought... 🤔 Unfortunately, I still see random delays (up to multiple seconds) in some parts of our network (especially for TOGGLE on lights). I first thought this might be caused by running OTA updates on some Hue lights, but they're all finished now... And it apparently also doesn't matter how the delayed command is issued (GP via Node-RED, Homebridge, or Z2M UI). Have you seen something like this before?

LaurentChardin commented 3 months ago

GP devices are tricky because they don't really connect to the Zigbee network: their messages need to be translated to Zigbee by a Green Power Proxy, which is a capability that all routers have if they are Zigbee 3.0 (if I am not mistaken). They are therefore not regular devices, and the way routers translate GP messages into Zigbee messages can vary across manufacturers.

As @burmistrzak states, you want to check your mesh quality too (printing the map and checking the LQI of your routers), and also be sure you have routers with good GP proxy support (difficult to know, actually, but in the end the Philips ones have very good support for GP).

burmistrzak commented 3 months ago

Unfortunately, I still see random delays (up to multiple seconds) in some parts of our network (especially for TOGGLE on lights). […] Have you seen something like this before?

@Koenkk Made an interesting discovery: As described in https://github.com/itavero/homebridge-z2m/issues/882, the Homebridge plugin regularly asks Z2M for all gettable keys for every single device, causing a slight surge in Zigbee messages. Already looking into a potential fix for this.

However, that's only part of the problem. The other issue seems to be the plugin trying to get power_on_behavior from the wrong endpoint for multiple Bosch BMCT-SLZ, resulting in at least seven simultaneous UNSUPPORTED_ATTRIBUTE errors.

Publish 'get' 'power_on_behavior' to '0xdeadbeef' failed: 'Error: ZCL command 0xdeadbeef/1 genOnOff.read(["startUpOnOff"], {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Status 'UNSUPPORTED_ATTRIBUTE')'

Can such invalid requests maybe temporarily clog the network, thus causing other devices to respond slower?

burmistrzak commented 3 months ago

@Koenkk @samuele2723 Lol, guess I've figured out my problem... Some Hue lights (acting as GP proxy) are blasting 46 (!) broadcast messages per second for each EnOcean action they receive. Remember, I explicitly paired all PTM 215Z in unicast mode. No wonder the Zigbee network gets totally overwhelmed.

Just look at this photog..., uhh, screenshot:

Screenshot 2024-06-05 at 20 21 12

Highlighted in pink is a PTM 215Z being translated correctly to unicast, selected (in blue) is one of the 46 broadcast messages of a misbehaving switch.

burmistrzak commented 3 months ago

Huh, that's interesting. @Koenkk I tried to re-pair one of the misbehaving GP switches, but I seemingly can't get it to use unicast!?

Screenshot 2024-06-05 at 21 33 39

The correct Commissioning Mode has been requested by the coordinator, but as soon as GP Pairing begins, the mode somehow changes to Groupcast to pre-commissioned GroupID (0x2), for no obvious reason.

Screenshot 2024-06-05 at 21 36 30

burmistrzak commented 3 months ago

Oh man... This rabbit hole goes deep!

I believe I have tracked down the root cause of the problem. It seems to be the model of Hue light selected to be the GP proxy! While our ceiling lights look the same, they're from different generations. ☝️

This truly seems to be a firmware issue all along. So I assumed all our GP switches were correctly paired as unicast, but in reality, only a few of them actually were! 🤯

burmistrzak commented 3 months ago

This truly seems to be a firmware issue all along.

I was able to confirm that (at least) the following devices support commissioning GP devices in unicast mode:

Hue lights from the same generation as the ones above should work as expected. Older generations, despite latest firmware, likely use Groupcast instead.

burmistrzak commented 3 months ago

@samuele2723 Alright, after I made my discoveries outlined above, I also had to re-pair almost all of our EnOcean switches. 😅

Here's some important information for anyone trying to pair their Zigbee Green Power devices in unicast mode:

Make sure the GP proxy you choose (e.g. a Hue light) is actually using unicast commissioning when told to do so! One good indicator I've found is the presence of an input cluster on the Green Power endpoint. So far, all GP proxies with an input cluster on that endpoint will use Groupcast and not Unicast to pair ZGP devices. This is what it looks like in the Z2M web UI, for example:

IMG_1103

IMG_1102

It might be possible to force/change the commissioning mode (on GP proxies with an input cluster) using some command. We'll see. I'm currently working through the ZGP specification. 😇

burmistrzak commented 3 months ago

A few more details: It appears that Signify (Philips Hue) decided at some point to upgrade their hardwired fixtures to also include a Green Power server, alongside the client cluster available on all of their modern lights.

So the type of light (removable or not) seems to be the key here, not really product generation.

LaurentChardin commented 3 months ago

I always thought that Input and Output GP clusters were part of the Zigbee 3.0 specifications, and needed for the certification. I see that @Hedda answered you in Koenkk/zigbee-herdsman#902

Anyway very nice drilldown @burmistrzak

burmistrzak commented 3 months ago

I always thought that Input and Output GP clusters were part of the Zigbee 3.0 specifications, and needed for the certification.

@LaurentChardin Apparently not, because the E27 Hue bulbs are definitely Zigbee 3.0 certified. It really seems to be a policy choice by Signify to only include a Green Power Sink (server) with their hardwired fixtures.

I believe what we're seeing here is the difference between a Green Power Proxy Basic, and a Green Power Combo Basic device. The latter being an interesting combination of a GP Proxy Basic, and a GP Sink Basic in a single device.

However, one big mystery still remains: Why does the proxy of such a GP Combo Basic ignore the GP Commissioning Mode requested by the coordinator?

Btw. am I the only one finding the terminology of input/output cluster a bit confusing? Using server/client, same as in the specification, would make much more sense IMHO. 🙈

burmistrzak commented 3 months ago

Just realized we absolutely need to improve the dev console with more advanced features (read/write undefined clusters/attributes), so poking around gets a bit more comfortable.

Isn't there some sort of CLI for the CC2652P to interact with the Zigbee network directly?

Hedda commented 3 months ago

I always thought that Input and Output GP clusters were part of the Zigbee 3.0 specifications, and needed for the certification.

@LaurentChardin Apparently not, because the E27 Hue bulbs are definitely Zigbee 3.0 certified. It really seems to be a policy choice by Signify to only include a Green Power Sink (server) with their hardwired fixtures.

As I understand it, all Zigbee Router (ZR) devices need to, at a minimum, include support for ZGPB/GPB (Zigbee Green Power Proxy Basic) to be compliant with Zigbee 3.0 certification requirements.

burmistrzak commented 3 months ago

Ok, so we're handling a GP Commissioning Notification that's coming in as broadcast differently.

                    // Communication mode:
                    //  Broadcast: Groupcast to precommissioned ID (0b10)
                    // !Broadcast: Lightweight unicast (0b11)
                    let opt = 0b1110010101101000;
                    if (dataPayload.wasBroadcast) {
                        opt = 0b1110010101001000;
                    }

This specific condition was introduced with https://github.com/Koenkk/zigbee-herdsman/pull/518, and seems to be in line with the GP specification.
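
To make the difference between the two option values easier to see, here is a small sketch that extracts the communication mode from them. It assumes the mode sits in bits 5 and 6 of the options field, which is what the two constants in the excerpt imply (they differ only in bit 5); the helper names `communicationMode` and `describeMode` are illustrative and not part of zigbee-herdsman.

```typescript
// The two option values from the excerpt above.
const OPT_NOT_BROADCAST = 0b1110010101101000;
const OPT_WAS_BROADCAST = 0b1110010101001000;

// Assumption: communication mode occupies bits 5-6 of the options field.
const COMMUNICATION_MODE_SHIFT = 5;
const COMMUNICATION_MODE_MASK = 0b11;

function communicationMode(opt: number): number {
    return (opt >> COMMUNICATION_MODE_SHIFT) & COMMUNICATION_MODE_MASK;
}

// Only the two modes named in the excerpt's comment are labelled here.
function describeMode(opt: number): string {
    switch (communicationMode(opt)) {
        case 0b10:
            return "Groupcast to precommissioned ID";
        case 0b11:
            return "Lightweight unicast";
        default:
            return "other mode (not covered by the excerpt)";
    }
}

const notBroadcastMode = describeMode(OPT_NOT_BROADCAST); // "Lightweight unicast"
const wasBroadcastMode = describeMode(OPT_WAS_BROADCAST); // "Groupcast to precommissioned ID"
```

So a commissioning notification that arrives as broadcast flips a single bit and moves the pairing from lightweight unicast to groupcast, which matches the behaviour seen in the captures.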

burmistrzak commented 3 months ago

The way I read the specification, a GP Proxy has to send GP Commissioning Notification commands using the mode requested in the initial GP Proxy Commissioning Mode frame.

The Unicast communication sub-field of the Options field, if set to 0b0, indicates that the receiving proxies SHALL send the GP Commissioning Notification commands in broadcast. If set to 0b1, it indicates that the receiving proxies SHALL send the GP Commissioning Notification commands in unicast to the originator of the GP Proxy Commissioning Mode command.

However, Hue lights that implement a GP Combo Basic device (mainly hardwired fixtures) are simply ignoring it, and use broadcast instead! Thus far, I haven't found any reason in the spec for this particular behavior...

@Koenkk Am I missing something, or is this indeed a bug in the Hue firmware? 🤔

chris-1243 commented 3 months ago

@burmistrzak

One good indicator I've found is the presence of an input cluster on the Green Power endpoint. So far, all GP proxies with an input cluster on that endpoint will use Groupcast and not Unicast to pair ZGP devices.

Your sentence is a bit confusing for me. Shall I use a router with an input/output cluster on endpoint 242 to get unicast ?

burmistrzak commented 3 months ago

One good indicator I've found is the presence of an input cluster on the Green Power endpoint. So far, all GP proxies with an input cluster on that endpoint will use Groupcast and not Unicast to pair ZGP devices.

Your sentence is a bit confusing for me. Shall I use a router with an input/output cluster on endpoint 242 to get unicast ?

If it's a hardwired Philips Hue fixture, then (at least for now) no. All other routers that implement only a GP Proxy Basic (i.e. only output cluster) from any manufacturer should generally be fine though. I don't know about non-Hue GP Combo Basic (i.e. input/output on 242) devices. They might work. @chris-1243 If you have one, give it a try and report back!
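
The rule of thumb from this reply can be written down as a small decision helper. This is only a sketch of the heuristic as stated at this point in the thread, not a verified compatibility check. It assumes the Green Power cluster is listed under the name `greenPower` in the input/output cluster lists of endpoint 242, and that you already know whether the device is a hardwired Hue fixture; `judgeGpProxy` and the verdict strings are made up for illustration.

```typescript
// Input/output cluster names of endpoint 242, e.g. as shown in the Z2M UI.
interface EndpointClusters {
    input: string[];
    output: string[];
}

type ProxyVerdict = "not a GP proxy" | "likely fine" | "avoid for now" | "untested";

// Assumption: this is the cluster name used for Green Power.
const GP_CLUSTER = "greenPower";

function judgeGpProxy(ep242: EndpointClusters, isHardwiredHue: boolean): ProxyVerdict {
    const hasOutput = ep242.output.indexOf(GP_CLUSTER) !== -1; // client, i.e. proxy
    const hasInput = ep242.input.indexOf(GP_CLUSTER) !== -1;   // server, i.e. sink
    if (!hasOutput) return "not a GP proxy";
    if (!hasInput) return "likely fine";                       // GP Proxy Basic only
    // GP Combo Basic: only hardwired Hue fixtures are known to misbehave so far.
    return isHardwiredHue ? "avoid for now" : "untested";
}

const proxyOnly = judgeGpProxy({ input: [], output: ["greenPower"] }, false);
const hueCombo = judgeGpProxy({ input: ["greenPower"], output: ["greenPower"] }, true);
```

The "untested" verdict is deliberate: non-Hue combo devices need a packet capture before they can be judged either way.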

chris-1243 commented 3 months ago

Now I understand your explanation better. I have 7 PTM215Z and none of them is paired to a Hue router, even though I have some in my network.

I have a PTM 216Z which is paired via a Hue 4034031P7 with output cluster on endpoint 242 only and it seems to work fine.

All my PTM 215Z work fine without any delays with this configuration. My network is quite small compared to yours, as I have only 35 devices.

I used to get some delays months ago, as my PTMs were not paired directly to a specific router.

burmistrzak commented 3 months ago

I have 7 PTM215Z and none of them is paired to a Hue router, even though I have some in my network.

Phew, good for you! 😇

  • 2 other via an Ikea ICPSH24-10EU-IL-2 with input/output cluster on endpoint 242
  • 1 via an Ubisys S2 with input/output cluster on endpoint 242.

@chris-1243 Would you be able to provide a packet capture when interacting with these switches? I'm particularly interested to see how GP Combo devices from other manufacturers behave. We currently can't really be sure whether Hue's implementation is an outlier/bug, or generally more common.

I have a PTM 216Z which is paired via a Hue 4034031P7 with output cluster on endpoint 242 only and it seems to work fine.

That's to be expected. 👌

chris-1243 commented 3 months ago

I just had a look at how to sniff traffic, and it seems possible to use an EZSP adapter for this (I have a Dongle-E lying around for testing). You will have to give me some time to get everything running under Windows... (not so easy🫣) I will try to send you some data.

Anything special I will have to look for?

burmistrzak commented 3 months ago

I just had a look at how to sniff traffic, and it seems possible to use an EZSP adapter for this (I have a Dongle-E lying around for testing). You will have to give me some time to get everything running under Windows... (not so easy🫣) I will try to send you some data.

Oh, that would be fantastic!

Anything special I will have to look for?

Yes! When pressing a button, you should see exactly two GP Notification (press & release) to your coordinator. This means your GP Proxy is handling unicast correctly. Especially interesting to me are EnOcean switches paired to your GP Combo devices (i.e. input/output on 242).
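
If you export the relevant frames from the capture, the check described here can be automated roughly as follows. This is a sketch under assumptions: the `CapturedFrame` shape is hypothetical (it is not a Wireshark export format), frames are assumed to be pre-filtered to a single press-and-release, and the command label `"GP Notification"` is assumed to match how your capture names it. The coordinator address `0x0000` and the broadcast range `0xfff8` to `0xffff` are standard Zigbee network addresses.

```typescript
// Hypothetical, simplified view of one captured frame.
interface CapturedFrame {
    command: string;     // e.g. "GP Notification", as labelled in the capture
    destination: number; // 16-bit network destination address
}

interface PressReport {
    unicastToCoordinator: number;
    broadcast: number;
    looksHealthy: boolean;
}

const COORDINATOR_ADDRESS = 0x0000;

// Zigbee reserves 0xfff8-0xffff for broadcast addressing.
function isBroadcast(address: number): boolean {
    return address >= 0xfff8 && address <= 0xffff;
}

// Expects the frames belonging to one press-and-release of a single button.
function inspectButtonPress(frames: CapturedFrame[]): PressReport {
    const notifications = frames.filter((f) => f.command === "GP Notification");
    const unicast = notifications.filter((f) => f.destination === COORDINATOR_ADDRESS).length;
    const broadcast = notifications.filter((f) => isBroadcast(f.destination)).length;
    return {
        unicastToCoordinator: unicast,
        broadcast,
        // Healthy unicast pairing: one frame for press, one for release, no broadcasts.
        looksHealthy: unicast === 2 && broadcast === 0,
    };
}
```

A proxy that falls back to groupcast would show up here with a large `broadcast` count and `looksHealthy` set to false.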

burmistrzak commented 3 months ago

@chris-1243 Btw. here's a guide for EZSP adapter on Windows. Once that's all done, follow this guide to convert, and add your network key to Wireshark.

You should be ready to go in less than ten minutes, if everything goes according to plan. I had my setup (admittedly with a TI chip) up and running in under five minutes, even on macOS. 😉

chris-1243 commented 3 months ago

It is running, but it took more than 10 minutes...

Now, you will have to help me a bit with what I should check and extract to help you. There is so much different information that I am lost. I can see this: Sniff_001 Sniff_002

burmistrzak commented 3 months ago

Now, you will have to help me a bit with what I should check and extract to help you. There is so much different information that I am lost.

@chris-1243 First, I would filter for zbee_zcl, so you don't see lower level traffic. But you've already provided a bunch of useful info! 🙏

Which GP Proxy exactly did you use? IKEA or Ubisys? Also, would it be possible for you to re-pair one of your EnOcean switches? I would love to see the commissioning process.

You just have to remove the PTM 215Z in Z2M, then enable "Permit join" for your GP Proxy of choice, and follow the super quick pairing guide. In Wireshark, please look for the following packets:

chris-1243 commented 3 months ago

Which GP Proxy exactly did you use? IKEA or Ubisys?

The PTM used in the screenshots is paired via an IKEA router (LED2101G4).

chris-1243 commented 3 months ago

So, I used the same IKEA router (LED2101G4), and here are the results of a re-pairing of a PTM215Z.

Sniff_pairing_001 Sniff_pairing_002 Sniff_pairing_003

If you need more or other fields to look at, just let me know

burmistrzak commented 3 months ago

@chris-1243 Thanks! ☺️ Please also share the following packets:

chris-1243 commented 3 months ago

Here we go

Sniff_pairing_004 Sniff_pairing_005 Sniff_pairing_006

I have a *.pcapng backup if required

burmistrzak commented 3 months ago

@chris-1243 Alright, your GP Proxy is behaving as expected. Fantastic!

Packet No. 1747 is unicast, so IKEA is actually following the specification. This means Philips Hue likely messed up something in firmware. Not good...

I'll open a ticket with Signify, but if anyone has a contact there, let me know. 🤞

chris-1243 commented 3 months ago

@burmistrzak Thanks for your effort. Much appreciated.

I am aware this is not the best topic for it, but I have a question about the PTM216Z. Do you think there is any possibility to better support them, in terms of routers able to translate the frames sent by this device? It is quite hard to find a Hue device able to do the translation. I have 4 different Hue lamps and only one is able to do the job... shame. I may open a new issue if required 😉

burmistrzak commented 3 months ago

Thanks for your effort. Much appreciated.

Thx! Mail to Hue support is on its way. 🤞🤞🤞

I am aware this is not the best topic for it, but I have a question about the PTM216Z. Do you think there is any possibility to better support them, in terms of routers able to translate the frames sent by this device? It is quite hard to find a Hue device able to do the translation. I have 4 different Hue lamps and only one is able to do the job... shame. I may open a new issue if required 😉

@chris-1243 Well, we unfortunately can't really change the firmware on Zigbee devices... Which Hue lamps exactly did you try? 🤔

chris-1243 commented 3 months ago

@burmistrzak

With no success I tried those models:

I successfully paired it via this model:

If it is related to firmware only... indeed, it is going to be hard to better support this device

burmistrzak commented 3 months ago

@chris-1243 Hmm, could be because these three are seemingly from an older generation? Do they have a Green Power cluster on endpoint 242? I've got a Hue Fair myself, nice fixture, but it's unfortunately also affected by the unicast bug... 🙄

chris-1243 commented 3 months ago

They all have a Green Power on endpoint 242 (at least the output cluster). 8718696548738 has even an input cluster... I guess they might be too old.

burmistrzak commented 2 months ago

They all have a Green Power on endpoint 242 (at least the output cluster). 8718696548738 has even an input cluster... I guess they might be too old.

@chris-1243 Hmm, very strange. Might be interesting to see a packet capture from an unsuccessful pairing. Also, can a PTM215Z be paired without error?

chris-1243 commented 2 months ago

@burmistrzak PTM215Z switches pair without any issues, except maybe the one you just found.

I will try to make a packet capture regarding PTM216Z. 8718696548738 just had a firmware update (1.116..3) and it now seems to accept pairing a PTM216Z...

I need to take more time and do some tests. What kind of packet capture would you like to have?

burmistrzak commented 2 months ago

I will try to make a packet capture regarding PTM216Z. 8718696548738 just had a firmware update (1.116..3) and it now seems to accept pairing a PTM216Z...

Uhm, interesting. 👀

I need to take more time and do some tests. What kind of packet capture would you like to have?

@chris-1243 Sure. 👍 Two sessions should probably be enough to figure out what's going on:

For both sessions, I need to see the first occurrence of each of these packet types:

chris-1243 commented 2 months ago

I hope I provided the right sequences. First, when the pairing is not working

Sniff_ptm216z_01 Sniff_ptm216z_02

Then when it successfully paired via 8718696548738

Sniff_ptm216z_03 Sniff_ptm216z_04 Sniff_ptm216z_05

I also have two *.pcapng backups, one for each try, if that makes it easier for you to analyse

burmistrzak commented 2 months ago

@chris-1243 Thanks! I'll have to look a bit more into the nuances of the various PTM21x models. Are you comfortable with sharing your *.pcapng files? Would certainly make things easier.

Also, two more questions:

Btw. here's the user guide for the PTM216Z.

chris-1243 commented 2 months ago

@burmistrzak Sharing my *.pcapng directly via GitHub doesn't sound like a good idea. I would be sharing my whole network on the internet, and maybe other info as well.

I did not apply any specific configuration to my PTM216Z. I have only one for testing. I do prefer the older 215Z.

I will give it another try. I suspect I did not reset the PTM216Z correctly.

chris-1243 commented 2 months ago

I used two different bulbs

PTM215Z is a GreenPower_2 model and PTM216Z a GreenPower_7. Could this make a difference ?

burmistrzak commented 2 months ago

Sharing my *.pcapng directly via GitHub doesn't sound like a good idea. I would be sharing my whole network on the internet, and maybe other info as well.

Yea, not a great idea. Didn't know if you have a test setup. 😅

I will give it another try. I suspect I did not reset the PTM216Z correctly.

You probably need to switch the PTM216Z to one of the non-default, command-based modes. It's in the user guide.

burmistrzak commented 2 months ago

PTM215Z is a GreenPower_2 model and PTM216Z a GreenPower_7. Could this make a difference ?

Certainly. Try configuring a command-based device model. NFC should be the easiest option to do that.

chris-1243 commented 2 months ago

I have one based on an EZSP adapter running the ember driver. The main problem is that the PTM is able to pair directly to the coordinator on ember.

I will find a way to create a test network on zstack and remove two devices of my main network...

burmistrzak commented 2 months ago

@chris-1243 Alternatively, you can safely share your packet captures using a single-use link. E.g. https://onetimedrop.app