Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0
12.07k stars 1.67k forks

[Problem]: After switching to CC2652 coordinator, my network works perfectly, but no device will pair with it. #10339

Closed convicte closed 2 years ago

convicte commented 2 years ago

What happened?

I transitioned from an old and weak CC2531, which worked OK but, with 40 devices connected, started to misbehave, lose connection to devices, etc. Unfortunately, since switching to the CC2652 there is no way to pair any new device (IKEA, Xiaomi, Tuya, etc.). The network rebuilt perfectly fine after the coordinator change and is updating sensors just fine, but it will not execute a pairing process.

I've changed back and forth between different coordinator firmwares, to no effect. Unplugging the coordinator, restarting the VM in which HA is running, and restarting HA had no effect either.

What did you expect to happen?

When pairing is initiated, the interview process begins and the device joins the network, as it did with the old CC2531.

How to reproduce it (minimal and precise)

  1. Connect the new adapter to the network
  2. Enable permit_join
  3. Put the device in pairing mode
  4. Wait until the device times out, because nothing will happen

Zigbee2MQTT version

1.22.1

Adapter firmware version

20211217, 20210319, 20210708

Adapter

Ebyte CC2652P

Debug log

How do I record a debug log of something that doesn't happen? I've been monitoring the logs, and while permit_join is enabled, no interview process is initiated. The device sits idle until it times out.

Fabiancrg commented 2 years ago

@convicte So I deleted and paired all routers not present in the backup file. After that I restarted Z2M to see the content of this file and found that 10 end devices are still missing. After a few hours I can tell that these 10 devices are not reporting anything anymore so I will pair them back.

Concerning the EPID, I also have dddd..., so I guess I will have to sniff my network to see if I have a PAN conflict.

cracrama commented 2 years ago

@convicte Well, that is actually good to know, because for a while I thought that I had caught something very mysterious. Thanks for the clarification. Now I know where to start. Many thanks.

Fabiancrg commented 2 years ago

@convicte Is it really a problem having an EPID equal to dddddddddddddddd ?

Mine is like that, but I have just run a trace and I can see this:

[Image: EPID]

It looks OK to me as it's the same EPID on the coordinator and on the other devices on my network so I don't see any PAN id conflict either.

LCerebo commented 2 years ago

I've upgraded from a CC2531 to a Sonoff 2652P, and everything worked fine except for pairing devices. I sniffed the traffic and noticed that the coordinator is using a new EPID, while the already paired devices use the "empty" EPID: dddddddddddddddd. Turning off Z2M, re-flashing the coordinator, changing "extended_pan_id" to dddddddddddddddd, and starting Z2M solved the problem, for the moment. Now I can re-pair the devices, but some of them are using the other pan_id and I'm not able to pair them. Any idea how to solve this conflict?

convicte commented 2 years ago

@Fabiancrg, if you don't have a conflict, I would assume everything is fine, as long as all your devices are on the same EPID. Outside that I must defer to @Koenkk since it's beyond my expertise.

@LCerebo I am glad the troubleshooting steps worked for you! In my case, all end device and router issues were resolved after forcibly removing them from the network and re-pairing. If these cannot be re-paired, are you also having the other problem, where the router you try to pair to is not recognized in the backup file?

Fabiancrg commented 2 years ago

@convicte All my issues were fixed when I applied your procedure and removed/paired all routers that were not in the coordinator_backup.json file.

So I would like to help others and create a script that will scan three files:

  1. configuration.yaml
  2. database.db
  3. coordinator_backup.json

The output will list:

  1. Missing routers, with their friendly names. These will need to be removed and paired again.
  2. Missing end devices, with their friendly names. The same actions apply, but as a second step, since some might come back after pairing the missing routers.
  3. Extra devices. I currently don't know what to do with these; I can't force-remove them.
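The comparison described above boils down to a few set operations. A minimal sketch, assuming the device types have already been read from database.db and the IEEE addresses from coordinator_backup.json; the function name `classify_devices` and the sample addresses are hypothetical:

```python
def classify_devices(db_devices, backup_ieees):
    """Compare the devices known to Z2M with those in the coordinator backup.

    db_devices:   dict mapping IEEE address -> type ("Router" or "EndDevice")
    backup_ieees: iterable of IEEE addresses found in the backup
    Returns (missing_routers, missing_end_devices, extra_devices) as sorted lists.
    """
    in_backup = set(backup_ieees)
    missing_routers = sorted(
        ieee for ieee, kind in db_devices.items()
        if kind == "Router" and ieee not in in_backup
    )
    missing_end_devices = sorted(
        ieee for ieee, kind in db_devices.items()
        if kind == "EndDevice" and ieee not in in_backup
    )
    # Present in the backup but unknown to Z2M
    extra_devices = sorted(in_backup - set(db_devices))
    return missing_routers, missing_end_devices, extra_devices


# Hypothetical sample data, not taken from a real network
db = {"0x01": "Router", "0x02": "Router", "0x03": "EndDevice"}
backup = ["0x01", "0x03", "0x04"]
routers, end_devices, extra = classify_devices(db, backup)
print("Missing routers:", routers)
print("Missing end devices:", end_devices)
print("Extra devices:", extra)
```

Keeping the comparison in a pure function means it can be tried on sample data without touching the live Z2M files.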

@Koenkk is there a way to force the generation of the coordinator_backup.json without restarting Z2M ? I think it will be better if the script can run without causing any downtime of Z2M.

Any idea whether an EPID conflict could be detected without sniffing the Zigbee network?

LCerebo commented 2 years ago

> @LCerebo I am glad the troubleshooting steps worked for you! In my case, all endpoint or router issues were resolved after forcibly removing them from the network and repairing. If these are not able to repair, are you also having the other problem where the router you try to pair to is not recognized in the backup file?

Yes, all the routers that use the empty pan_id "dddddddddddddddd" aren't listed in the coordinator_backup.json. I tried to re-pair them, but they are listed as "unsupported devices". I don't know what to do to fix the pan_id conflict.

convicte commented 2 years ago

Sorry to sound like a broken record, but you have to follow the steps I outlined above. If the routers are not listed, they are not considered in the pairing process, and thus your network will be handicapped until this is resolved.

Review the routers that need to be re-paired, starting from the one closest to the coordinator, and take it from there. In the meantime, please reflash the coordinator as indicated above.

If you have any specific questions or concerns, please let us know. Hopefully someone will be able to help if I fail to do so.

fuglphoenix commented 2 years ago

I can confirm that I had the same problem with the EPID. In my configuration it wasn't set to anything, so it defaulted to dddddd..., which I confirmed in Settings->RAW, but for some reason it was changed in my coordinator_backup.json to the same value as my coordinator_ieee?! I stopped Z2M, changed the EPID in my coordinator_backup.json, reflashed with CC2652RB_coordinator_20211217 and started Z2M again. Now it works and I can pair my devices through a router, which was my initial problem.
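The backup edit described above can be scripted. A minimal sketch, assuming `extended_pan_id` is a top-level key in coordinator_backup.json holding a 16-character hex string, and that Z2M is stopped before the file is touched; the function name `set_backup_epid` and the `.bak` copy are illustrative choices, not part of Z2M:

```python
import json
import re
import shutil


def set_backup_epid(path, new_epid):
    """Rewrite extended_pan_id in a coordinator backup, keeping a .bak copy.

    Assumes the key sits at the top level of the JSON document.
    Returns the previous value, or None if the key was absent.
    """
    if not re.fullmatch(r"[0-9a-fA-F]{16}", new_epid):
        raise ValueError("EPID must be 16 hexadecimal characters")
    shutil.copyfile(path, path + ".bak")  # keep the original for rollback
    with open(path, "r") as f:
        backup = json.load(f)
    old_epid = backup.get("extended_pan_id")
    backup["extended_pan_id"] = new_epid.lower()
    with open(path, "w") as f:
        json.dump(backup, f, indent=2)
    return old_epid
```

It would be called with the path to the backup file and the EPID the existing devices actually use. Which value is correct depends on the network, so treat this as a convenience for the manual edit rather than a fix in itself.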

MartB commented 2 years ago

Edit: I did it again and now it sticks correctly.

Can someone please tell me in simple terms what my EPID is supposed to be and how I get it to stick?

Config:

"ext_pan_id": [
                221,
                221,
                221,
                221,
                221,
                221,
                221,
                221
            ],

Coordinator backup:

"extended_pan_id": "00124RESTOFIEEE",

I did what @fuglphoenix did, but reflashing does nothing; extended_pan_id keeps being my coordinator IEEE even if I change the backup and re-flash.
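The two snippets above store the EPID in different notations: a list of eight byte values in one and a hex string in the other. A small sketch of how they could be compared, assuming both use the same byte order (for the all-0xDD default this makes no difference); the function names are hypothetical:

```python
def epid_from_config(ext_pan_id):
    """Convert the list-of-bytes notation to a lowercase hex string."""
    if len(ext_pan_id) != 8 or not all(0 <= b <= 255 for b in ext_pan_id):
        raise ValueError("ext_pan_id must be a list of 8 byte values")
    return bytes(ext_pan_id).hex()


def epid_matches(config_ext_pan_id, backup):
    """True when the config EPID equals the one stored in the backup dict."""
    backup_epid = str(backup.get("extended_pan_id", "")).lower()
    return epid_from_config(config_ext_pan_id) == backup_epid


config_epid = [221] * 8  # 221 decimal is 0xDD
print(epid_from_config(config_epid))
print(epid_matches(config_epid, {"extended_pan_id": "dddddddddddddddd"}))
```

Since 221 is 0xDD, eight of them correspond to the dddddddddddddddd value discussed in this thread.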

arjanvanleent commented 2 years ago

Hello, my "extended_pan_id" is "dddddddddd", and not all end devices are in the backup. How do I find the correct "extended_pan_id"? I don't have another stick to sniff with. Is it possible to use another installation?

drogfild commented 2 years ago

My story here: I could not pair new devices reliably; the Ikea E1743 in particular kept failing. Some router devices behaved strangely during joining: they removed themselves from the network after the interview.

My coordinator is Smartlight CC2652P, don't have original fw version info anymore but it was stock from manufacturer.

I changed the EPID to dddddddddddddddd using this guide: https://github.com/Koenkk/zigbee2mqtt/issues/10339#issuecomment-1017521526 and also did the two reflashes with 20211217. The dddd... value is a guess, as I don't have a sniffing device, but I was previously on a CC2531, so I took a wild guess.

That didn't seem to resolve any of my pairing issues. I then flashed dev version CC1352P2_CC2652P_other_coordinator_20220103

At least for now, that seems to have resolved my issues. I was able to pair the Ikea E1743s, and all other devices now seem to behave correctly. I'm posting this as information for everyone else.

convicte commented 2 years ago

@drogfild Not sure if I understand you correctly, but changing from something TO dddd... is the exact opposite of my instructions. If I was not clear, my apologies. Since it seems to work, you may have done something that helped, but it doesn't seem to be similar to my situation. If I were you, I would look into the backup file to see if the routers are there, but that's just me.

@arjanvanleent I am sorry, but I cannot help in this regard. Sniffing to look for the EPID was the only route I was informed of.

drogfild commented 2 years ago

Thanks @convicte . In my case I had routers in the backup file before all this and they are there also afterwards. Haven't checked if all of my routers are there. Maybe I should check that.

It might be that all the jumping through hoops with the EPID was not needed for me, and that the new FW alone would have made the needed corrections. But as I don't have a sniffer, I'm not able to tell. It was a wild guess, and maybe it was correct or maybe it wasn't, but I'm happy now that everything seems to be working.

There's a pretty similar case about an EPID of dddd... at https://github.com/Koenkk/zigbee2mqtt/issues/9117#issuecomment-1017479712

It would be really cool to have some other way to see EPID conflicts than sniffing them. But that's most probably a limitation of zigbee network structure.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

mihaimdinca commented 2 years ago

I have the same issue with the CC2652P. Everything pairs with it but not with any other router in my ZigBee network. I found the discussion here pretty confusing and I was hoping that someone who actually managed to fix the problem could wrap things up in a sort of step by step tutorial for us noobs. Thank you in advance.

Fabiancrg commented 2 years ago

@mihaimdinca, if you have the same problem as I had, it's due to some router devices that are not known correctly by the coordinator. They might appear to work properly, e.g. a bulb might be working but not forwarding messages to the coordinator. You can try to remove and pair every router device one by one, or look for the one(s) not defined correctly. For this, I wrote the Python script below; you can execute it and it will list the routers that you will need to remove and pair again.

Copy the following code in Z2McheckDevices.py and execute it using python3 Z2McheckDevices.py

import json
import yaml  # provided by the PyYAML package

# ieeeAddr -> [device type, friendly name, present in coordinator backup]
devices = {}
missingRouters = {}
missingEndDevices = {}
additionalDevices = {}

# Parse database.db (one JSON object per line) to retrieve the IEEE address
# and type of every device defined in Z2M

print('Parsing database.db file')

with open('/opt/zigbee2mqtt/data/database.db', 'r') as DBfile:
    for line in DBfile:
        if not line.strip():
            continue  # skip blank lines, which json.loads would reject
        DBdata = json.loads(line)
        if "ieeeAddr" in DBdata and "type" in DBdata:
            if DBdata["type"] != 'Coordinator':
                devices[DBdata["ieeeAddr"]] = [DBdata["type"], "", False]

print('Parsing database.db file... Done')

# Parse coordinator_backup.json to retrieve all IEEE addresses present in the coordinator

with open('/opt/zigbee2mqtt/data/coordinator_backup.json', 'r') as backupfile:
    data = json.load(backupfile)

for device in data['devices']:
    ieee = "0x" + device['ieee_address']
    if ieee in devices:
        devices[ieee][2] = True
    else:
        # Present in the backup but not in database.db
        devices[ieee] = ["Unknown", "", True]

print('Parsing backup file... Done')

# Retrieve friendly names from configuration.yaml

with open('/opt/zigbee2mqtt/data/configuration.yaml', 'r') as configfile:
    configData = yaml.safe_load(configfile)

configDevices = configData.get("devices")
if not isinstance(configDevices, dict):
    configDevices = {}  # section absent, empty, or kept in a separate file

for key, value in devices.items():
    dev = configDevices.get(key)
    if dev:
        value[1] = dev.get("friendly_name", "Unknown")
    else:
        value[1] = "Unknown"

# Build the summary
for key, (devType, name, inBackup) in devices.items():
    if devType == "Router" and not inBackup:
        missingRouters[key] = name
    elif devType == "EndDevice" and not inBackup:
        missingEndDevices[key] = name
    elif devType == "Unknown":
        additionalDevices[key] = name

if missingRouters:
    print("The following routers are not correctly set up, please remove them from Z2M config and execute the pairing operation again")
    for key, value in missingRouters.items():
        print("\t" + key + " - " + value)
    if missingEndDevices:
        print("Some end devices are missing too but you must first fix the routers")
elif missingEndDevices:
    print("The following devices are not correctly set up, please remove them from Z2M config and execute the pairing operation again")
    for key, value in missingEndDevices.items():
        print("\t" + key + " - " + value)
else:
    print("Routers and end devices are correctly set up on the controller")

if additionalDevices:
    print("Some devices are known by the controller but not present on your Zigbee network")
    for key in additionalDevices:
        print("\t" + key)

PS: you will need Python 3 with the yaml module (PyYAML) installed; the json module is part of the standard library. I wrote this script for Z2M running on Linux; if you are using Windows, you will have to update the paths.

PPS: make sure you have a recent coordinator_backup.json file. If you have the latest Z2M version, this file is created every few hours; if you are running an older version, please restart Z2M before starting the script to trigger a backup (or upgrade to the latest Z2M version).
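Whether the backup is recent enough can be checked before running the script. A minimal sketch using the file's modification time; the six-hour threshold and the function names are assumptions for illustration, not values defined by Z2M:

```python
import os
import time


def backup_age_hours(path, now=None):
    """Return the hours elapsed since the file was last modified."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) / 3600.0


def is_backup_fresh(path, max_age_hours=6.0, now=None):
    """True if the file exists and is at most max_age_hours old."""
    return os.path.exists(path) and backup_age_hours(path, now) <= max_age_hours


# Path used by the script above; adjust it for your installation
backup_path = os.path.join("/opt/zigbee2mqtt/data", "coordinator_backup.json")
if is_backup_fresh(backup_path):
    print("Backup is recent, safe to run the check")
else:
    print("Backup is missing or old, restart Z2M to trigger a new one")
```

Building the path with `os.path.join` also makes it easier to point the check at a different data directory, for example on Windows.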

mihaimdinca commented 2 years ago

Thank you very much. I'll give it a try in the following days.

On Sun, Jul 10, 2022, 14:33 Fabian @.***> wrote:

@mihaimdinca https://github.com/mihaimdinca, if you have the same problem as I had, it's due to some router devices that are not defined correctly in Z2M. They might appear to work properly, e.g. a bulb might be working but not forwarding messages to the coordinator. You can try to remove and pair every router device one by one, or look for the one(s) not defined correctly. For this, I wrote the Python script below; you can execute it and it will list the routers that you will need to remove and pair again.


RuinedOne commented 2 years ago

@Fabiancrg Thanks for your script. I was going nuts trying to figure out why nothing would route/pair w/anything but the coordinator (Sonoff CC2652) and this solved my issue.

odelma commented 1 year ago

Thanks for the script 😃 How should I fix: "Some devices are known by the controller but not present on your Zigbee network" - I guess they are only in the database, but I am not sure how I should remove them....?

Fabiancrg commented 1 year ago

Hi, I included this for information, but I don't know how to clean up these devices. I am not sure it is causing any issue, so I did not investigate further.

rhvs commented 1 year ago

@Fabiancrg

I tried to edit your code to use it directly in HA OS.... but no luck (I have to admit it was a long shot :) ). Would be great if you could have a look!

import json
import yaml

devices = {}     # Dictionnary with all devices
missingRouters = {}
missingEndDevices = {}
additionalDevices = {}

backupfile = open('/config/zigbee2mqtt/coordinator_backup.json',"r")

DBfile = open('/config/zigbee2mqtt/database.db', 'r')
Lines = DBfile.readlines()

# Parse database.db to retrive all devices IEEE address and type define in Z2M

print('Parsing database.db file')

for line in Lines:
    DBdata = json.loads(line)
    if "ieeeAddr" in DBdata and "type" in DBdata:
        if DBdata ["type"] != 'Coordinator':
            devices[DBdata["ieeeAddr"]] = [DBdata ["type"], "", False]
            #print(DBdata["ieeeAddr"] + " is a " + DBdata ["type"])

DBfile.close()
print('Parsing database.db file... Done')

# Parse coordinator_backup.json to retreive all IEEE present in coordinator

data = json.loads(backupfile.read())

for device in data['devices']:
     dev = devices.get("0x" + device['ieee_address'])
     if dev:
         devices["0x" + device['ieee_address']][2] = True
     else:
         devices["0x" + device['ieee_address']]=["Unknown","", True]

backupfile.close()
print('Parsing backup file... Done')

# Retreive friendly name from configuration.yaml

configfile = open('/config/zigbee2mqtt/configuration.yaml',"r")

configData = yaml.safe_load(configfile)
configDevices = configData["devices"]

for key,value in devices.items():
    dev = configDevices.get(key)
    if dev:
        devices[key][1]=configDevices[key]["friendly_name"]
    else:
        devices[key][1]="Unknown"
    #print (key , '-->', value)

configfile.close()

# Print summary
for key,value in devices.items():
    if devices[key][0] == "Router" and not devices[key][2]:
        missingRouters[key] = devices[key][1]
    elif devices[key][0] == "EndDevice" and not devices[key][2]:
        missingEndDevices[key] = devices[key][1]
    if devices[key][0] == "Unknown":
        additionalDevices[key] = devices[key][1]

if len(missingRouters) > 0:
    print ("The following routers are not correctly set up, please remove them from Z2M config and execute the pairing op>
    for key, value in missingRouters.items():
        print ("\t" + key + " - " + value)
    if len(missingEndDevices) > 0:
        print ("Some end devices are missing too but you must first fix the routers")
elif len(missingEndDevices) > 0:
    print ("The following devices are not correctly set up, please remove them from Z2M config and execute the pairing op>
    for key, value in missingEndDevices.items():
        print ("\t" + key + " - " + value)
else :
   print ("Routers and end devices are correctly setup on the controller")

if len(additionalDevices) > 0:
   print ("Some devices are knwown by the controller but not present on your Zigbee network")
   for key, value in additionalDevices.items():
       print ("\t" + key)

MnM001 commented 1 year ago

Hi,

@Fabiancrg I have tried this script with my Sonoff ZB Dongle-E.

It gave me a list of routers that needed to be "fixed". That list looked suspiciously like all the routers I have in my environment. So I took one router out (held the button until it left my Z2M environment), requested a new backup, and then ran the script again. That router was gone from the list of routers that needed to be "fixed".

I waited a few minutes and then added the router back.

After I requested a new backup I ran the script again, and the router that I removed and re-added is again in the list of routers that need to be "fixed".

How can that be? I just removed it (by button, so the network key was deleted) and then added it back. It should have been fixed, I assume.

Is the script not working as it should with the Sonoff ZB Dongle-E? Or am I doing something wrong?

Koenkk commented 1 year ago

The check is now supported directly from Zigbee2MQTT: https://github.com/Koenkk/zigbee2mqtt/pull/18599