Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Can only pair via specific routers #8803

Closed nickw444 closed 2 years ago

nickw444 commented 2 years ago

What happened

Can only pair end devices via specific routers on the network

What did you expect to happen

Can pair end devices via any routers on the network

How to reproduce it (minimal and precise)

  1. Pair a number of router devices (>5 routers) to the network.
  2. For each router, enable pairing for it only, attempt to pair an end device.
  3. For some routers, pairing will not take place, even with multiple attempts. For other routers pairing works almost immediately.
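Step 2 can be scripted. The following is a minimal sketch, not the reporter's actual procedure: it assumes the `zigbee2mqtt/bridge/request/permit_join` topic with a `value` / `device` / `time` JSON payload as documented for Zigbee2MQTT 1.x, and the default `zigbee2mqtt` base topic. The helper name `permit_join_message` is hypothetical, and publishing is left to whichever MQTT client is in use.

```python
import json


def permit_join_message(router, seconds=120, base_topic="zigbee2mqtt"):
    """Build (topic, payload) that permits joining via a single router.

    Assumption: the Zigbee2MQTT 1.x bridge request API; check the docs
    for the version you run, since the payload shape may differ.
    """
    topic = f"{base_topic}/bridge/request/permit_join"
    payload = json.dumps({"value": True, "device": router, "time": seconds})
    return topic, payload


# Friendly names taken from the router table in this issue.
for name in ["Study Repeater", "Lounge Repeater", "Nicks Bedside"]:
    topic, payload = permit_join_message(name)
    print(topic, payload)
```

Enabling one router at a time and noting which attempts fail gives a per-router result that can be compared against the sniffer traces.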

Debug info

Zigbee2MQTT version: 1.21.1-1 (commit: unknown) in hassio supervisor, running on Rpi4 4GB
Adapter hardware: CC1352P-2 (based on RF-Star CC2652P2)
Adapter firmware version: zStack3x0, 20210708

Further details

Initially I found https://github.com/Koenkk/zigbee2mqtt/issues/7762, which I believed was the root cause of my problem. I had an older network originally running from a CC2530-based coordinator.

However, I have completely nuked my existing network, including re-flashing my CC2652P2 stick and nuking all z2m data (including the coordinator backup), and have re-paired all existing devices multiple times. I still appear to have the same issue.

It does not appear to matter what channel, Pan ID, or network key is used.

I tried reproducing initially on a smaller test network (2 repeaters - 1x Tradfri repeater, 1x DIY CC2530, 2 end devices, CC2652P2 based coordinator), but couldn't reproduce the problem in this configuration.

Instead, I have needed to re-create my entire production network (including re-pairing some annoying-to-access end devices). The issue seems to present itself as the network gains complexity. After adding 5 router devices, it appears to become a problem. End devices will only pair through certain routers. You can see this in the map:

[Image: network map]

I did some further digging this weekend using a CC2531 with sniffer firmware + wireshark. I have captured the following traces:

Archive.zip

(yes I know I've exposed my network key, I will roll it when/if this reaches a resolution)

To explain these traces, there are 5 routers on the network:

| IEEE Address | Network Address | Model | Nickname |
| --- | --- | --- | --- |
| 0xec1bbdfffe9cf5a2 | 0xE63A | IKEA LED1732G11 | Nicks Bedside |
| 0x588e81fffe021892 | 0x6CFA | IKEA E1746 | Living Room Repeater |
| 0x588e81fffe0205b1 | 0x049F | IKEA E1746 | Lounge Repeater |
| 0x00124b001fc64014 | 0xF82B | Custom devices (DiY) CC2530.ROUTER | Living Room Noise Repeater |
| 0x00124b000add55c6 | 0xE671 | Custom devices (DiY) CC2530.ROUTER | Study Repeater |
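When reading the traces, it can help to narrow the capture to frames sent to or from one router's short address. The sketch below builds a Wireshark display filter from the network addresses listed above. It assumes the IEEE 802.15.4 dissector fields `wpan.src16` and `wpan.dst16`; the `router_filter` helper is hypothetical and is not part of the original captures.

```python
# Network (short) addresses copied from the router list in this issue.
ROUTERS = {
    "Nicks Bedside": 0xE63A,
    "Living Room Repeater": 0x6CFA,
    "Lounge Repeater": 0x049F,
    "Living Room Noise Repeater": 0xF82B,
    "Study Repeater": 0xE671,
}


def router_filter(name):
    """Return a display filter matching MAC frames to or from one router."""
    addr = ROUTERS[name]
    return f"wpan.src16 == 0x{addr:04x} || wpan.dst16 == 0x{addr:04x}"


print(router_filter("Study Repeater"))
```

Note that short addresses change when a router is re-paired, so the mapping is only valid for the captures attached here.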

failed-join-bathroom-motion-via-study-lounge.pcapng

Attempt to join "Bathroom Motion" (0x00158d00063372bf (0x5FB2)) via "Study Repeater" (0x00124b000add55c6 (0xE671)).

Successfully joins via "Lounge Repeater" (0x588e81fffe0205b1 (0x049F))

failed-join-bathroom-via-study-lounge-bedside.pcapng

Attempt to join "Bathroom Door" (0x00158d000451d91a (0x8DFC)) via "Study Repeater" (0x00124b000add55c6 (0xE671)).

Attempt to join via "Lounge Repeater" (0x588e81fffe0205b1 (0x049F))

Successfully joins via "Nicks Bedside" (0xec1bbdfffe9cf5a2 (0xE63A))

failed-join-front-door-button-via-living-room-noise-living-room-lounge.pcapng

Attempt to join "Front Door Button" (0x00158d00045075e9 (0xFD40)) via "Living Room Noise Repeater" (0x00124b001fc64014 (0xF82B))

Attempt to join via "Living Room Repeater" (0x588e81fffe021892 (0x6CFA))

Successfully joins via "Lounge Repeater" (0x588e81fffe0205b1 (0x049F))

failed-join-lemon-tree-via-bedside-via-louge.pcapng

Attempt to join "Lemon Tree" (0x00124b000611b998 (0x997B)) via "Nicks Bedside" (0xec1bbdfffe9cf5a2 (0xE63A))

Successfully joins via "Lounge Repeater" (0x588e81fffe0205b1 (0x049F))

failed-join-living room-button-via-living-room-bedside-lounge.pcapng

Attempt to join "Living Room Button" (0x00158d00028f5873 (0x3B45)) via "Living Room Repeater" (0x588e81fffe021892 (0x6CFA)).

Attempt to join via "Nicks Bedside" (0xec1bbdfffe9cf5a2 (0xE63A))

Successfully joins via "Lounge Repeater" (0x588e81fffe0205b1 (0x049F))

failed-join-living-room-motion-via-many-final-through-nicks-bedside.pcapng

Attempt to join "Living Room Motion" (0x00158d000421a072 (0x5B83)) via many different repeaters. Finally succeeds via "Nicks Bedside" (0xec1bbdfffe9cf5a2 (0xE63A))

francisp2 commented 2 years ago

@castorw Can't pair devices through certain routers, and can't pair unless I just restarted Zigbee2mqtt. I'm using 1.22.1 with a Tube Ethernet coordinator.

Koenkk commented 2 years ago

TI released another SDK update and it looks like they added more joining fixes. For those still experiencing this issue, please try with 20211207: https://github.com/Koenkk/Z-Stack-firmware/tree/develop/coordinator/Z-Stack_3.x.0/bin

hnykda commented 2 years ago

I wanted to try it, but first I wanted to replicate the issue I had here, and I can't: the device pairs just fine now (not the TRVs, but these are probably somehow broken). Starting to drive me crazy :joy:. Also @castorw haven't found anything suspicious in my dumps.

jcastro commented 2 years ago

Using the latest version of zigbee2mqtt (1.22.1-1) with the latest firmware (Z-Stack 3.x.0 20211207) for my device (Zzh! CC2652R Multiprotocol RF Stick) and I can still pair new devices. Just wanted to give an update on it.

nickw444 commented 2 years ago

Have upgraded my coordinator to 20211207 over the weekend, and re-paired routers. Looks like the issue is resolved (from what I can tell), managed to pair via a router. Will see if this stays working for the next few days.

realalexandergeorgiev commented 2 years ago

I updated the firmware. Paired a button, worked, moved it inside house, lost connection 10 minutes ago. Unable to find the network. The room it is in has 3 routers. I kept pressing the button while moving there.

Something is broken with routing somehow. I will go back to CC2531 once I find the time. Then I can tell if it's related to the chipset (firmware).

nickw444 commented 2 years ago

Dang! Just tried pairing a new Aqara button away from the coordinator but unfortunately it seems to not be able to pair via any router again 🤨

Update: After restarting z2m it seemed to pair (but I think it paired by chance directly to the coordinator). Didn't take any debug traces so don't know for sure though.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

francisp2 commented 2 years ago

I still don't know if it is normal that coordinator_backup.json contains fewer devices than configuration.yaml and database.db.
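One way to see exactly which devices differ is to compare the addresses in the two files. This is a rough sketch under stated assumptions: that `coordinator_backup.json` holds a `devices` array whose entries carry an `ieee_address` field, and that `database.db` is newline-delimited JSON whose entries carry `type` and `ieeeAddr`. Those field names and the helper names are not confirmed anywhere in this thread, so verify them against your own files first.

```python
import json


def normalize(ieee):
    """Lower-case an IEEE address and drop any leading 0x."""
    ieee = ieee.lower()
    return ieee[2:] if ieee.startswith("0x") else ieee


def backup_addresses(backup_text):
    """Addresses listed in the coordinator backup (assumed layout)."""
    data = json.loads(backup_text)
    return {normalize(d["ieee_address"]) for d in data.get("devices", [])}


def database_addresses(db_text):
    """Addresses of routers and end devices in database.db (assumed layout)."""
    found = set()
    for line in db_text.splitlines():
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Skip the coordinator and non-device entries such as groups.
        if entry.get("type") in ("Router", "EndDevice") and "ieeeAddr" in entry:
            found.add(normalize(entry["ieeeAddr"]))
    return found


def missing_from_backup(backup_text, db_text):
    """Devices known to the database but absent from the backup."""
    return sorted(database_addresses(db_text) - backup_addresses(backup_text))
```

Reading both files with `open(...).read()` and passing the text to `missing_from_backup` lists the addresses present in the database but not in the backup. Whether such a difference is expected is the open question here.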

nickw444 commented 2 years ago

Agreed, still have some funny things going on in my network, particularly with some devices dropping off the network a few days after being re-paired.

RaveGun commented 2 years ago

Switched from CC2531 to Slaesh's stick on Z2M Edge. I only swapped the sticks and configured the new serial path. The network worked fine, no issues. Some weeks later I tried to add a new light bulb. Although it is supported, it paired as an unknown device. I deleted it and tried again. It did not work at all anymore. Later I added the light bulb to the ZHA network that I also have running with a Deconz stick, no issues.

Several days later I tried to pair a Philips Hue wall switch. Nothing. Just now I tried to add a TRV. Nothing. It is like the network does not switch into pairing mode.

Both just paired fine to the ZHA.

I really wanted to have the Z2M as the main network and the ZHA for testing purposes but I keep adding more and more devices there.

If I start from scratch (which I'm not so keen on), are the device names somehow kept so that I don't break anything else in Home Assistant?

Thank you

castorw commented 2 years ago

I believe this is related to #9117. I will be investigating further into this issue as per https://github.com/Koenkk/zigbee2mqtt/issues/9117#issuecomment-1016890665.

hnykda commented 2 years ago

I don't want to derail the conversation, but want to follow up on this in case it's relevant (but it might not!).

Basically, I managed to somehow pair those few missing TRVs by pairing them close to the coordinator. It just worked after trying it like 20 times... More about it here: https://github.com/Koenkk/zigbee2mqtt/discussions/10774 . What surprised me a bit was that they got the same name, but that could be because I think I copied the friendly name somewhere in the z2m data folder. So that's my story about successful pairings :shrug: .

The second one is weirder. I accidentally hard-switched off (like physically) one of my routers (IKEA Floalt) for a few hours and the whole network started to go crazy. Many devices dropped off the network, even though they were closer to the coordinator or were routers themselves, like other IKEA Floalt. My network is heavily "over centered": almost every device would reach the coordinator, and routers wouldn't be needed to extend the range in my case. Z2M simply showed "No route to network (205)". After I hard-restarted these devices (by flipping the fuses off and on), the network works fine again. But it's pretty scary for me, definitely not what I would expect from a mesh.

castorw commented 2 years ago

@francisp2 It could be. We will be investigating further in https://github.com/Koenkk/zigbee2mqtt/issues/12150 and https://github.com/Koenkk/zigbee2mqtt/issues/12016.
