Koenkk / zigbee-herdsman

A Node.js Zigbee library
MIT License
456 stars 277 forks source link

[EZSP] ZGP / greenPower messages are not relayed by HUE routers #823

Open rsegers opened 5 months ago

rsegers commented 5 months ago

(moved from https://github.com/Koenkk/zigbee2mqtt/issues/19853)

What happened?

Zigbee Green Power (ZGP) messages are only broadcasted by the coordinator, so they only work in close proximity to it.

What did you expect to happen?

I have several HUE routers in range of the ZGP switches (EnOcean PTM 215Z). I expect those device to also pick up and relay ZGP messages.

How to reproduce it (minimal and precise)

1.) Create a Zigbee network with z2m and an azsp coordinator (e.g. sonoff dongle e) on channel 11 2.) Configure several ZGP switches for channel 11 3.) Add HUE routers to the network 4.) Position the ZGP switches far from the coordinator but in close range to HUE routers

Zigbee2MQTT version

1.33.2-dev commit: 746066c

Adapter firmware version

6.10.3.0 build 297

Adapter

EZSP Sonoff Dongle Plus-E EFR32MG21

Debug log

No logging. But after applying https://github.com/Koenkk/zigbee-herdsman/compare/master...rsegers:zigbee-herdsman:master pairing a PTM 215Z through a Hue router now shows up in the Herdsman logging. Pairing does not occur, because Zigbee2MQTT thinks it is the coordinator?

2023-12-06T21:07:35.648Z zigbee-herdsman:adapter:ezsp:uart <-- [0132b1a90d2a92abdaf0b12eabdd8549f0f925b0e4320e7ff17e]
2023-12-06T21:07:35.648Z zigbee-herdsman:adapter:ezsp:uart <-- DATA (0,1,0): 0132b1a90d2a92abdaf0b12eabdd8549f0f925b0e4320e7ff17e
2023-12-06T21:07:35.649Z zigbee-herdsman:adapter:ezsp:uart --> ACK  (1)
2023-12-06T21:07:35.649Z zigbee-herdsman:adapter:ezsp:uart --> [8160597e]
2023-12-06T21:07:35.649Z zigbee-herdsman:adapter:ezsp:ezsp <== Frame: 709001590087198364fb0b018817006cb7021b09fc69
2023-12-06T21:07:35.650Z zigbee-herdsman:adapter:ezsp:ezsp <== 0x59: {"cls":"incomingRouteRecordHandler","id":89,"isRequest":false,"source":6535,"longId":{"type":"Buffer","data":[0,23,136,1,11,251,100,131]},"lastHopLqi":108,"lastHopRssi":-73,"relay":{"type":"Buffer","data":[27,9]}}
2023-12-06T21:07:35.650Z zigbee-herdsman:adapter:ezsp:driv handleRouteRecord: nwk=6535, ieee=
2023-12-06T21:07:35.705Z zigbee-herdsman:adapter:ezsp:uart <-- [7d3132b1a97d312a1552f8b54ad7585593499ccc431e6ad79874ebd7778ddc75b2429dcc596f8fffd2db52cbbd6750fb73ea81407e]
2023-12-06T21:07:35.705Z zigbee-herdsman:adapter:ezsp:uart <-- DATA (1,1,0): 1132b1a9112a1552f8b54ad7585593499ccc431e6ad79874ebd7778ddc75b2429dcc596f8fffd2db52cbbd6750fb73ea81407e
2023-12-06T21:07:35.705Z zigbee-herdsman:adapter:ezsp:uart --> ACK  (2)
2023-12-06T21:07:35.706Z zigbee-herdsman:adapter:ezsp:uart --> [82503a7e]
2023-12-06T21:07:35.706Z zigbee-herdsman:adapter:ezsp:ezsp <== Frame: 709001450000e0a12100f2f2000100008264b58719ffff16111404200b8de576018700000015008719d4eb16d8da06
2023-12-06T21:07:35.707Z zigbee-herdsman:adapter:ezsp:ezsp <== 0x45: {"cls":"incomingMessageHandler","id":69,"isRequest":false,"type":0,"apsFrame":{"profileId":41440,"clusterId":33,"sourceEndpoint":242,"destinationEndpoint":242,"options":256,"groupId":0,"sequence":130},"lastHopLqi":100,"lastHopRssi":-75,"sender":6535,"bindingIndex":255,"addressIndex":255,"message":{"type":"Buffer","data":[17,20,4,32,11,141,229,118,1,135,0,0,0,21,0,135,25,212,235,22,216,218]}}
2023-12-06T21:07:35.707Z zigbee-herdsman:adapter:ezsp:debg processMessage: {"messageType":0,"apsFrame":{"profileId":41440,"clusterId":33,"sourceEndpoint":242,"destinationEndpoint":242,"options":256,"groupId":0,"sequence":130},"lqi":100,"rssi":-75,"sender":6535,"bindingIndex":255,"addressIndex":255,"message":{"type":"Buffer","data":[17,20,4,32,11,141,229,118,1,135,0,0,0,21,0,135,25,212,235,22,216,218]}}
2023-12-06T21:07:35.709Z zigbee-herdsman:controller:log Received 'zcl' data '{"frame":{"Header":{"frameControl":{"reservedBits":0,"frameType":1,"direction":0,"disableDefaultResponse":true,"manufacturerSpecific":false},"transactionSequenceNumber":130,"manufacturerCode":null,"commandIdentifier":0},"Payload":{"type":"Buffer","data":[17,20,4,32,11,141,229,118,1,135,0,0,0,21,0,135,25,212,235,22,216,218]},"Command":{"ID":0,"parameters":[{"name":"options","type":33},{"name":"srcID","type":35,"conditions":[{"type":"bitFieldEnum","param":"options","offset":0,"size":3,"value":0}]},{"name":"gpdIEEEAddr","type":240,"conditions":[{"type":"bitFieldEnum","param":"options","offset":0,"size":3,"value":2}]},{"name":"gpdEndpoint","type":32,"conditions":[{"type":"bitFieldEnum","param":"options","offset":0,"size":3,"value":2}]},{"name":"frameCounter","type":35},{"name":"commandID","type":32},{"name":"payloadSize","type":32},{"name":"commandFrame","type":1009},{"name":"gppNwkAddr","type":33,"conditions":[{"type":"bitMaskSet","param":"options","mask":16384}]},{"name":"gppGddLink","type":32,"conditions":[{"type":"bitMaskSet","param":"options","mask":16384}]}],"name":"notification"}},"address":6535,"endpoint":242,"linkquality":100,"groupID":null,"wasBroadcast":true,"destinationEndpoint":242}'
2023-12-06T21:07:35.709Z zigbee-herdsman:controller:greenpower GP Undefined Command
Zigbee2MQTT:debug 2023-12-06 22:07:35: Received Zigbee message from 'Coordinator', type 'commandNotification', cluster 'greenPower', data '{"data":[17,20,4,32,11,141,229,118,1,135,0,0,0,21,0,135,25,212,235,22,216,218],"type":"Buffer"}' from endpoint 242 with groupID null, ignoring since it is from coordinator
Hedda commented 5 months ago

@rsegers have you tried with newer NCP firmware versions? Suggest maybe doing so just to rule out this being a firmware bug?

There are no newer official firmware releases by ITead (Sonoff) but there are many third-party community firmware builders, ex:

@kirovilya ping for https://github.com/Koenkk/zigbee2mqtt/issues/19853 and https://github.com/Koenkk/zigbee-herdsman/issues/319

rsegers commented 5 months ago

@Hedda I just updated from the stock 6.10.3 to 7.3.1 (NCP) and got the same results.

I didn’t try the RCP/multi-pan firmware, as it requires repairing all devices. Don’t think that could solve the issue?

Hedda commented 5 months ago

I just updated from the stock 6.10.3 to 7.3.1 (NCP) and got the same results.

I didn’t try the RCP/multi-pan firmware, as it requires repairing all devices. Don’t think that could solve the issue?

Good, then you have at least ruled out firmware versions up to 7.3.1 (RCP works the same)

sdals-VS commented 4 months ago

Any known workaround on this? I’m having what appears like the same problem.

Nerivec commented 2 months ago

@dduransseau @chris-1243 Could you check this behavior with ember when you have some time?

dduransseau commented 2 months ago

@dduransseau @chris-1243 Could you check this behavior with ember when you have some time?

I just did this test and it worked, however I'm not to sure that the GP endpoint was "far" from the coordinator, the hue router was definitely closer to the GP then the coordinator but enough. I might be able to do other test tomorrow further than what I just did.

chris-1243 commented 2 months ago

@dduransseau @chris-1243 Could you check this behavior with ember when you have some time?

I don't have any spare Hue bulb/plug at the time for a try unfortunately and I don't want to remove one from my main network as it is really stable and I don't want to break anything.

I used a Sunricher module (which relayed correctly ZG frames in zstack for info) and tried to be as far as possible of the coordinator (lots of wall as well). It worked. As the PTM21x/Z seems able to be paired without permit join turned ON as I posted here, I am just wondering if I was away enough of the coordinator or if the ZG frames were sent via such specific router.

On zstack, it is definitely an other behaviour

chris-1243 commented 2 months ago

@Nerivec Sorry to bother you. I would like to share an observation since my last comment.

On zstack, when an end device is joined via a specific router, the lqi showed in Z2M is the same as the router used. This is how my PTM215Z, Ikea remotes are shown.

On ember, at least for PTM215Z/216Z, the behaviour is not the same. In my test, the lqi shown was not the one shown by the router used to join the network. I am wondering if the router was really used.

Nerivec commented 2 months ago

I don't know much about zStack, but as far as I know, EmberZNet brought a lot of improvements on LQI calculations, so I'm not surprised at all that it doesn't match what zStack shows. In EmberZNet, it should be the actual LQI, even deep in the mesh and GP devices have a dedicated LQI calculation (to account for the differences). As you know, I don't have any GP device, so I can't confirm how precise it is (but I can confirm it seems pretty good with regular devices). Is the LQI shown for the GP device not realistic based on the distance to the router in your opinion?

chris-1243 commented 2 months ago

The lqi shown was a bit strange. I did an other test. Finally, I decided to remove a Hue bulb (8718696548738) from my network (one translating ZG frames in zStack). It was right between the coordinator and my PTM216Z and two concrete walls. I paired the ptm via this bulb and it seemed to work. Then, I removed the bulb from the network which should remove the route for the ptm as well. The ZG device was still working and the lqi was sometimes at 200 and a click after at 10...

No idea if it would help you. I guess and as you said, there is something to check with ZG and pairing

Nerivec commented 2 months ago

Welcome to the club of "break your house network to test something" 😅

Since the Hue would only be a proxy in this scenario, it would make sense if the GP device behaves like others, and links up to the coordinator when the Hue is removed. What does not make sense, is such fluctuations between reported LQI. When the new route is established, it should go back to a stable-ish value. Is the LQI stable in "regular" usage? Since it's GP, fluctuations of up to 30-40 would not surprise me, but not 200...

PS: With herdsman-debug ON, you should be able to check if the device is indeed finding a new route. You should see route handlers being triggered (like ezspIncomingRouteRecordHandler).

chris-1243 commented 2 months ago

I am on zStack in regular usage... My journey with Z2M began with an EZSP coordinator but the driver was not really stable for me and I had crash once a day. That was a year ago. I then moved to zStack and the network is stable.

Recently, I moved some devices to EZSP just for trying (curious I am...). Those devices (2 hues bulbs, 1 tradfri driver, 2 ptm215z and 1 sunricher remote) were quite far from the coordinator. The lqi was between 150-200 for all. One ZG PTM215Z had huge lqi drop (from 150 to 5) and was not working all the time. I revert to zStack those devices due to the ptm215z which was not doing correctly its job and my family which was not happy...

PS: With herdsman-debug ON, you should be able to check if the device is indeed finding a new route. You should see route handlers being triggered (like ezspIncomingRouteRecordHandler).

Good to know. Does it work even with those battery less ZG device ?

Nerivec commented 2 months ago

Does it work even with those battery less ZG device ?

That is mostly why I mentioned it... I don't see why not (since you said it was still working), but I can't be sure. I'd expect an error at first (on first wake up, though you might only see it in the logs), but then I assume it is capable of changing route to the coordinator, or whatever GP-capable device on the same network is nearest... Then again, the GP spec... 😓 That definitely would need to be tested with actual devices.

chris-1243 commented 2 months ago

I need to find a way to be out of range of the coordinator and to have a router in the middle. Then I need to pair my ZG device, remove the "middle" device and come closer to the coordinator and check if my ZG device find a new route... That's sound possible

rsegers commented 2 months ago

For reference: I tried the experimental ‘ember’ adapter setting, but that resulted in the same behavior.

Nerivec commented 2 months ago

Could you post the herdsman-debug with ember like you did for ezsp (from the moment you "permit join", to several seconds after the pairing should have completed)? The ember GP handler should currently show plenty of details to figure this out.

Note: The ezsp herdsman debug you posted (1st post) doesn't show a pairing command, it's just a notification from the coordinator. Although the issue might be related to this.

chris-1243 commented 2 months ago

@Nerivec I finally made some test. I removed the antenna of my Sonoff Dongle-E in order to reduce the range of my coordinator (and it reduces a lot the range...). I the add two routers able to translate ZG frames (both from Sunricher actually) and a PTM215Z as this ZG device is easier to pair than PTM216Z. I was not able to pair my ZG device while using ember.

Below the network in ember...

Screenshot 2024-03-05 173139

I decided to remove the devices (I left them at the same place) and to exchange the Sonoff Dongle-E by a Rf-Star CC2652P (my backup dongle). I did remove the antenna as well in order to reproduce the "same" environment. The pairing went fine for the routers and the PTM215Z as well...

It might need more digging and test. I doubt the pairing via a router is working for ZG devices in ember. I will try to reproduce such network and test conditions to see if this behavior occurs again.

Nerivec commented 2 months ago

Please tell me you had herdsman-debug ON and kept the logs? 😁

At first glance, I'd say this might be a case where the coordinator processes the GP message of the proxy (Sunricher), but then sends it up to the host with itself as the source, and Z2M is currently ignoring all messages coming from the coordinator. I'm guessing the zStack coordinator (not able to process GP) just passes the message along, and thus skips that step entirely.

I'd even be curious if the device isn't actually paired in your first test, just Z2M higher logic is not "seeing it". Can you try again, and after a few seconds (the time it takes to usually complete the pairing procedure), try to use the device and check if you see your actions in the logs.

The behavior is also different based on if the GP device is using encryption or not, the logs will tell you (gpdfSecurityLevel in ezspGpepIncomingMessageHandler log lines).

GP sure is fun 😅

chris-1243 commented 2 months ago

Herdsman-debug should have been ON. As I am on Docker, I guess it was erased after stoping and removing the container... To be honest, the logs I have are full of restart of the container and errors as I was looking for the best spot for the Dongle-E. Not really usefull.

Anyway, it was a first try and I plan to make another one. This time, I am going to keep herdsman-debug logs. Shall I clean them or would like to have a full log from start ?

Nerivec commented 2 months ago

Full logs is fine. I've gotten pretty good at reading them fast. You should see the size of my ember report logs folder 😆 Just make sure to start from a clean network, no unnecessary devices, it'll be easier to track things.

chris-1243 commented 2 months ago

So, it is going to be a full log. I guess, I will proceed like this:

Like that, you should not have too much to read and it should be focused on the PTM215Z only and the differents pairing scenarios

chris-1243 commented 2 months ago

Below my different tries of the day... I kept the same routers as yesterday but, at the end, I change the position of the routers (the one closer to the coordinator was then the further...). I have been able to pair once the PTM215Z. In my opinion, the PTM215Z paired directly to the coordinator as lqi was below 10... When I tried to pair it via routers (both of them in different network configuration) nothing happened.

I kept all the herdsman-debug logs... Sorry, you will have tons of lines to read. herdsman-debug.txt (creation of the network) herdsman-debug_ptm_joining.txt (first try) herdsman-debug_ptm_joining_2.txt (second try when the PTM215Z paired....)

I hope you may find something interesting in those logs.

For the PTM216Z and as I need a Hue device, it is going hard to reproduce such test conditions.