home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

ZHA pairing mode looping, devices unavailable, devices not responding - SkyConnect coordinator. #98624

Closed fabiobaldoibs closed 9 months ago

fabiobaldoibs commented 1 year ago

The problem

I migrated from Zigbee2MQTT two weeks ago when I changed my coordinator to a SkyConnect (I'm new to ZHA). Since then I've been having a lot of problems.

It is important to note that I have not changed anything in my infrastructure since then, such as device locations, network, or Wi-Fi router. Everything is the same as two weeks ago, when Zigbee2MQTT was working.

The first issue is the one I opened a couple of days ago: https://github.com/home-assistant/core/issues/98402

I had a problem with 6 devices in total (my network has 40 devices): 2 light bulbs (same manufacturer), 2 routers (Sonoff Dongle Plus with router firmware), 1 Sonoff Mini, and 1 LED strip controller.

All of them behave the same way: ZHA finds and adds the device, but the device stays in pairing mode, so ZHA finds it again and again. This loop goes on forever, or until the device leaves pairing mode (maybe a timeout), after which the device stays in an unavailable state.

The second issue

Last night I had a power problem with my electricity provider (the power went on and off several times). As a result, some devices were reset and needed to be re-paired.

During this process, one device got stuck looping in pairing mode. ZHA finds the device, the device stays in pairing mode, and ZHA finds it again and again, several times, until the device stops (timeout). Inside ZHA the device looks fine, but it does not respond to any command.

I tried a lot of things, like moving the problem devices closer to the coordinator, factory resetting them, pairing them many times in a row, etc.

I really need help understanding this.

Thanks

What version of Home Assistant Core has the issue?

core-2023.8.2

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ZHA

Link to integration documentation on our website

https://www.home-assistant.io/integrations/zha/

Diagnostics information

zha-2a22fab467807a6c197677d971184ff0-_TZ3000_bvrlqyj7 TS0002-4804119551430e4f37fa042778c94783.json.txt config_entry-zha-2a22fab467807a6c197677d971184ff0.json.txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

I'm using a SkyConnect coordinator with the latest firmware.

home-assistant[bot] commented 1 year ago

Hey there @dmulcahey, @adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `zha` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign zha` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


zha documentation zha source (message by IssueLinks)

puddly commented 1 year ago

Can you enable debug logging for the ZHA integration (https://www.home-assistant.io/integrations/zha/#debug-logging), restart HA Core, and reproduce the problem with these devices? You can email the entire home-assistant.log file to me if you don't want to ZIP it and post it in a comment here.

fabiobaldoibs commented 1 year ago

@puddly

I'm not home now, so I'll do it tomorrow. Would you like a single file with the debug log for all devices, or do you prefer a separate debug file for each device?

Thanks!

puddly commented 1 year ago

One single file is fine. Thank you!

fabiobaldoibs commented 1 year ago

logs.zip

fabiobaldoibs commented 1 year ago

Today I bought two more Zigbee devices: a Sonoff Mini, which paired fine, and a TS110E by _TZ3210_ysfo0wla that I can't pair. It has the same issue as the other ones. Debug log attached.

home-assistant_zha_2023-08-19T21-12-40.614Z.log.zip

MattWestb commented 1 year ago

If I remember correctly, the Sonoff Mini has some strange bugs, and we haven't gotten any sniffs of it, so we don't know what is going wrong with it. What are the models/brands of your 2 lamp bulbs (same manufacturer) and your LED strip controller? Also, if they are not well-known devices, post their IEEE addresses (or just the first half if you don't want to post them in full).

fabiobaldoibs commented 1 year ago

Lamp bulb: IEEE: a4:c1:38:5d:a7:6c:c8:67, Nwk: 0xd34a, TS0505B by _TZ3210_jd3z4yig

Light dimmer: IEEE: a4:c1:38:ef:32:62:0e:71, Nwk: 0x49b2, TS110E by _TZ3210_ysfo0wla

LED controller: IEEE: a4:c1:38:31:ed:be:63:50, Nwk: 0xca82, TS0504B by _TZ3210_onejz0gt

LED controller (another that I tested today): IEEE: a4:c1:38:19:04:6b:74:c5, Nwk: 0x3e05, TS0504B by _TZ3210_onejz0gt

Switch (2 gang): IEEE: a4:c1:38:87:68:fb:ca:09, Nwk: 0xc923, TS0002 by _TZ3000_bvrlqyj7. P.S. This device was working well for 2 weeks.

MattWestb commented 1 year ago

All of them start with a4:c1:38, which is the new Telink chip. Tuya makes Zigbee modules with the same pins/pads, so they can be swapped into all of their devices. I have some dimmers/switches with them and don't have any major problems, and I think Puddly has the same experience. I hope he sees something in your debug logs; otherwise we need to sniff the pairing, but that requires one more Zigbee stick/module.
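
As a quick check of the shared prefix, the first three bytes of each IEEE address from the device list can be counted. This is only an illustrative sketch: the addresses are copied from the list posted in this thread, and the helper name `oui_prefix` is made up for the example.

```python
from collections import Counter

# IEEE addresses copied from the device list posted earlier in this thread.
DEVICES = [
    ("TS0505B by _TZ3210_jd3z4yig", "a4:c1:38:5d:a7:6c:c8:67"),
    ("TS110E by _TZ3210_ysfo0wla", "a4:c1:38:ef:32:62:0e:71"),
    ("TS0504B by _TZ3210_onejz0gt", "a4:c1:38:31:ed:be:63:50"),
    ("TS0504B by _TZ3210_onejz0gt", "a4:c1:38:19:04:6b:74:c5"),
    ("TS0002 by _TZ3000_bvrlqyj7", "a4:c1:38:87:68:fb:ca:09"),
]


def oui_prefix(ieee: str) -> str:
    """Return the first three bytes of a colon-separated IEEE address.

    Hypothetical helper for this example only.
    """
    return ":".join(ieee.lower().split(":")[:3])


prefix_counts = Counter(oui_prefix(ieee) for _, ieee in DEVICES)
print(prefix_counts)  # Counter({'a4:c1:38': 5})
```

All five problem devices share the same prefix, which matches the observation above.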

fabiobaldoibs commented 1 year ago

Let me know if you want more information, log, tests, etc. I'm here to help.

Just in case it helps: before ZHA I was using Zigbee2MQTT. Some of these devices were bought over 2 years ago, and everything was fine. With Zigbee2MQTT, I used the Sonoff dongle, model P.

puddly commented 1 year ago

The Tuya device is just deciding to leave:

# Device joined
18:09:34.723  <==   trustCenterJoinHandler: [0xf5b9, a4:c1:38:ef:32:62:0e:71, <EmberDeviceUpdate.STANDARD_SECURITY_UNSECURED_JOIN: 1>, <EmberJoinDecision.USE_PRECONFIGURED_KEY: 0>, 0x3a71]
18:09:34.773  <==   incomingMessageHandler: [<EmberIncomingMessageType.INCOMING_BROADCAST: 4>, EmberApsFrame(profileId=0, clusterId=19, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_NONE: 0>, groupId=0, sequence=132), 108, -73, 0xf5b9, 255, 255, b'\x00\xb9\xf5q\x0eb2\xef8\xc1\xa4\x8e']

# ZHA initialization
18:09:37.992   ==>  sendUnicast: (<EmberOutgoingMessageType.OUTGOING_DIRECT: 0>, 0xf5b9, EmberApsFrame(profileId=260, clusterId=8, sourceEndpoint=1, destinationEndpoint=1, options=<EmberApsOption.APS_OPTION_ENABLE_ROUTE_DISCOVERY: 256>, groupId=0, sequence=163), 164, b'\x00\xa3\x00\x00\x00')
18:09:38.017  <==   messageSentHandler: [<EmberOutgoingMessageType.OUTGOING_DIRECT: 0>, 62905, EmberApsFrame(profileId=260, clusterId=8, sourceEndpoint=1, destinationEndpoint=1, options=<EmberApsOption.APS_OPTION_ENABLE_ROUTE_DISCOVERY: 256>, groupId=0, sequence=80), 164, <EmberStatus.SUCCESS: 0>, b'']
18:09:38.040  <==   incomingMessageHandler: [<EmberIncomingMessageType.INCOMING_UNICAST: 0>, EmberApsFrame(profileId=260, clusterId=8, sourceEndpoint=1, destinationEndpoint=1, options=<EmberApsOption.APS_OPTION_RETRY|APS_OPTION_ENABLE_ROUTE_DISCOVERY: 320>, groupId=0, sequence=153), 140, -65, 0xf5b9, 255, 255, b'\x18\xa3\x01\x00\x00\x00 Y']
18:09:40.964  <==   incomingMessageHandler: [<EmberIncomingMessageType.INCOMING_UNICAST: 0>, EmberApsFrame(profileId=0, clusterId=2, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_RETRY|APS_OPTION_ENABLE_ROUTE_DISCOVERY: 320>, groupId=0, sequence=154), 140, -65, 0xf5b9, 255, 255, b'\x03\x00\x00']
18:09:45.957  <==   incomingMessageHandler: [<EmberIncomingMessageType.INCOMING_UNICAST: 0>, EmberApsFrame(profileId=0, clusterId=2, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_RETRY|APS_OPTION_ENABLE_ROUTE_DISCOVERY: 320>, groupId=0, sequence=156), 104, -74, 0xf5b9, 255, 255, b'\x04\x00\x00']
18:09:45.985  <==   incomingMessageHandler: [<EmberIncomingMessageType.INCOMING_BROADCAST: 4>, EmberApsFrame(profileId=0, clusterId=0, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_NONE: 0>, groupId=0, sequence=158), 108, -73, 0xf5b9, 255, 255, b'\x05\xcf\xc9\xef\xfe\xff\x81v(\x00\x00']
18:09:50.958  <==   incomingMessageHandler: [<EmberIncomingMessageType.INCOMING_UNICAST: 0>, EmberApsFrame(profileId=0, clusterId=2, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_RETRY|APS_OPTION_ENABLE_ROUTE_DISCOVERY: 320>, groupId=0, sequence=159), 108, -73, 0xf5b9, 255, 255, b'\x06\x00\x00']
...

# Device leaves
18:09:55.952  <==   trustCenterJoinHandler: [0xf5b9, a4:c1:38:ef:32:62:0e:71, <EmberDeviceUpdate.DEVICE_LEFT: 2>, <EmberJoinDecision.NO_ACTION: 3>, 0xffff]

There's no quirk for _TZ3210_ysfo0wla so it's probably just yet another broken Tuya device. This is unrelated.
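
For readers following the excerpt, here is a small sketch of how its lines can be read programmatically. It makes two assumptions: the regular expression is written only for the `trustCenterJoinHandler` lines as they appear above (it is not a bellows API), and the broadcast with `profileId=0, clusterId=19` is decoded using the standard ZDO device announce layout (sequence number, NWK address, IEEE address, capability byte). The function names are invented for the example.

```python
import re
import struct
from datetime import datetime

# The join and leave lines, copied from the excerpt above.
LOG = """\
18:09:34.723  <==   trustCenterJoinHandler: [0xf5b9, a4:c1:38:ef:32:62:0e:71, <EmberDeviceUpdate.STANDARD_SECURITY_UNSECURED_JOIN: 1>, <EmberJoinDecision.USE_PRECONFIGURED_KEY: 0>, 0x3a71]
18:09:55.952  <==   trustCenterJoinHandler: [0xf5b9, a4:c1:38:ef:32:62:0e:71, <EmberDeviceUpdate.DEVICE_LEFT: 2>, <EmberJoinDecision.NO_ACTION: 3>, 0xffff]
"""

# Pattern for this excerpt only (an assumption, not a general log parser).
PATTERN = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+<==\s+trustCenterJoinHandler: "
    r"\[(?P<nwk>0x[0-9a-f]{4}), (?P<ieee>[0-9a-f:]{23}), "
    r"<EmberDeviceUpdate\.(?P<status>\w+): \d+>"
)


def trust_center_events(log: str) -> list:
    """Return (timestamp, ieee, status) for each trust center line."""
    events = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match:
            stamp = datetime.strptime(match["time"], "%H:%M:%S.%f")
            events.append((stamp, match["ieee"], match["status"]))
    return events


def decode_device_announce(payload: bytes) -> tuple:
    """Split a 12-byte ZDO device announce into its little-endian fields."""
    tsn, nwk, ieee_le, capability = struct.unpack("<BH8sB", payload)
    ieee = ":".join(f"{byte:02x}" for byte in reversed(ieee_le))
    return tsn, nwk, ieee, capability


events = trust_center_events(LOG)
joined = next(t for t, _, status in events if status.endswith("JOIN"))
left = next(t for t, _, status in events if status == "DEVICE_LEFT")
seconds_on_network = (left - joined).total_seconds()
print(f"Device stayed on the network for {seconds_on_network:.3f} s")

# Payload of the broadcast logged at 18:09:34.773.
ANNOUNCE = b"\x00\xb9\xf5q\x0eb2\xef8\xc1\xa4\x8e"
tsn, nwk, ieee, capability = decode_device_announce(ANNOUNCE)
print(f"nwk=0x{nwk:04x} ieee={ieee} capability=0x{capability:02x}")
```

In this excerpt the device announces itself with the same NWK and IEEE addresses that the trust center reported, and leaves roughly 21 seconds after joining.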

fabiobaldoibs commented 1 year ago

So, do I need to give up and go back to Zigbee2MQTT?

fabiobaldoibs commented 1 year ago

As a temporary workaround, I set up another coordinator with Z2M (on a different channel) and paired those devices to it. So now I have two Zigbee networks in HA, but all devices are working. :)

MattWestb commented 1 year ago

Interesting! If I were you, I would try migrating the Z2M coordinator to ZHA and see whether the devices still work once they are in that coordinator's network. (Warning: if you migrate to different hardware, you may have to burn the old coordinator's IEEE address into it to get it working, and that can only be done once. Moving the same coordinator should not cause any problems.)

fabiobaldoibs commented 1 year ago

That will be funny, because I only migrated away from Z2M because I bought a SkyConnect coordinator. My Zigbee network was pretty good until then, but I like challenges, lol.

puddly commented 1 year ago

Download the latest firmware for the SkyConnect (https://github.com/NabuCasa/silabs-firmware/raw/main/EmberZNet/beta/NabuCasa_SkyConnect_EZSP_v7.3.1.0_ncp-uart-hw_115200.gbl) and flash it with https://skyconnect.home-assistant.io/firmware-update/. You'll have to temporarily plug your SkyConnect into another computer.

After that's done:

  1. Enable ZHA debug logging (https://www.home-assistant.io/integrations/zha/#debug-logging).
  2. Stop Z2M to release access to your old coordinator.
  3. Re-configure ZHA to use the same coordinator as Z2M. You can do this from the ZHA configuration page, via the "migrate radio" option. When selecting the Z2M coordinator, make sure you select "Use existing network settings", so ZHA keeps your old network.
  4. Reboot your mains-powered devices and re-insert the batteries into your battery-powered ones. They should then announce themselves and ZHA will pick them up, without them actually joining the network.

Once that's done, see if things work. If they still do, migrate to the SkyConnect again and see what happens then.

You can upload the whole debug log here.

MattWestb commented 1 year ago

Be sure you are running EZSP 7.3.1.0 (Puddly, have you implemented "soft" IEEE in the latest ZHA for the latest EZSP?). Otherwise the IEEE is permanently copied and burned in from the old coordinator and can't be changed in the chip.

fabiobaldoibs commented 1 year ago

home-assistant_zha_2023-08-23T23-10-33.549Z.log.zip

fabiobaldoibs commented 1 year ago

It looks like the problem is the SkyConnect, not ZHA.

puddly commented 1 year ago

The network settings for your most recent log don't match what was in the original debug info.

Can you provide the same log after migrating to the SkyConnect using the ZHA migration feature? What is the log you just posted showing?

fabiobaldoibs commented 1 year ago

look what I did:

stopped the debug, and posted it

puddly commented 1 year ago

Something isn't adding up. Your SkyConnect is using channel 25 according to the debug log (backup date August 23). The backup you restored to your Sonoff stick, however, is from August 12th, using channel 26. These are two separate networks and many of your devices won't move to the "new" network because they were never notified of the change.

I believe what is happening here is that you migrated in the past from the Sonoff stick to the SkyConnect but then restarted Z2M with the old Sonoff concurrently. This would mean that you have two identical networks running on channels 25 and 26 (or even on the same channel), with the same coordinator IEEE addresses, network keys, and so on. This isn't good and may explain why devices are getting confused.
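
To make the suspected conflict concrete, here is a minimal sketch that compares two sets of coordinator settings. The field names and every value except the two channel numbers are placeholders invented for the example; they are not read from the actual backups in this issue.

```python
# Placeholder settings: only the channels (25 and 26) come from this thread.
SKYCONNECT = {
    "coordinator_ieee": "00:11:22:33:44:55:66:77",  # placeholder
    "network_key": "placeholder-key",               # placeholder
    "pan_id": 0x1234,                               # placeholder
    "channel": 25,
}
# The old stick restored from an earlier backup: same identity, other channel.
SONOFF = {**SKYCONNECT, "channel": 26}

IDENTITY_FIELDS = ("coordinator_ieee", "network_key", "pan_id")


def shared_identity(a: dict, b: dict) -> list:
    """List the identity fields that two coordinators have in common."""
    return [field for field in IDENTITY_FIELDS if a[field] == b[field]]


def looks_like_duplicate_network(a: dict, b: dict) -> bool:
    """True when both coordinators present the same IEEE address and key."""
    shared = shared_identity(a, b)
    return "coordinator_ieee" in shared and "network_key" in shared


print(shared_identity(SKYCONNECT, SONOFF))
print(looks_like_duplicate_network(SKYCONNECT, SONOFF))
```

If the check reports a duplicate while both coordinators are powered, devices can see two networks that look identical, which is the situation described above.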


To actually test the SkyConnect in isolation properly, what you need to do is form a new network, with completely random settings. ZHA by default won't overwrite the IEEE address of the coordinator when creating a new network so I've attached a ZHA backup JSON here that will form a new network on channel 11 with random settings: ZHA backup 2023-08-24T02-01-40.233Z.json (rename it from .json.txt to .json after downloading).

To use it, in the ZHA integration's configuration click the "migrate radio" button, then click "re-configure current radio", and finally select "upload a manual backup". Upload the backup JSON from above and try joining your devices again. This may fix your problem.
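
For illustration only, "random settings" can be pictured as something like the sketch below. It is not how ZHA or the attached backup generates its values, and the dictionary keys are hypothetical. It only relies on general Zigbee facts: 2.4 GHz channels run from 11 to 26, the PAN ID is 16 bits, the extended PAN ID and IEEE address are 64 bits, and the network key is 128 bits.

```python
import secrets


def random_network_settings(channel: int = 11) -> dict:
    """Build a set of fresh, random network parameters (hypothetical layout)."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels are 11 through 26")
    return {
        "channel": channel,
        # 0x0000..0xFFFE; 0xFFFF is reserved for broadcast.
        "pan_id": secrets.randbelow(0xFFFF),
        "extended_pan_id": secrets.token_bytes(8).hex(":"),
        "network_key": secrets.token_bytes(16).hex(":"),
        # Stand-in for a coordinator address that differs from the old stick.
        "coordinator_ieee": secrets.token_bytes(8).hex(":"),
    }


settings = random_network_settings()
print(settings)
```

The point is that nothing in the new network matches the old coordinator, so devices cannot confuse the two.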

fabiobaldoibs commented 1 year ago

The channel is correct. While trying to solve this, I read some articles pointing out that channel 26 can be a problem with some devices, so I changed the channel to 25. The whole ZHA network has been working on channel 25 since then.


About the method you propose (new network with a backup): will I need to re-pair the devices? (Some of them are hard to reach.) Will I need to rename the entities?

fabiobaldoibs commented 1 year ago

Tomorrow, I'll do a test.

I'll start a new HA from scratch, put the SkyConnect coordinator on it, and try to pair those devices that are a problem now. I'll debug all and post it.

puddly commented 1 year ago

This method that you propose (new network with a backup). Will I need to re-pair the devices?

Yes, there is no way to migrate them.

Will I need to rename the entities?

As long as you never delete the ZHA integration and only ever use the "migrate radio" button, you won't have to.

I'll start a new HA from scratch, put the SkyConnect coordinator on it, and try to pair those devices that are a problem now. I'll debug all and post it.

If you do this, make sure to restore the above backup so that your SkyConnect settings in no way match your old coordinator's settings.

fabiobaldoibs commented 1 year ago

I was thinking of not using a backup. this installation will be only for testing, not in my real network.

puddly commented 1 year ago

Using the backup I provided above is important specifically because your SkyConnect currently shares some network settings with your Sonoff stick. Even if you form a new network, its IEEE address will be the same, which will cause conflicts. The backup I provided above has randomly-generated settings.

fabiobaldoibs commented 1 year ago

ok, I got it.

fabiobaldoibs commented 1 year ago

An update: I had to make an unexpected trip, so I didn't run the test.

I migrated my network to my old coordinator (all devices) and put Sky Connect on the shelf for a while. Everything is working fine.

I'll probably run the tests next weekend.

ljorg commented 1 year ago

Maybe not really helpful, but when I migrated from Conbee II to SkyConnect I had the exact same issue with some devices that were stuck in a pairing loop and never paired. They connected up, and I could control them for a while, and then they left. Seems like the device never accepted the pairing. These are three smart plugs with product code "TS011F" and manufacturer "_TZ3000_2putqrmw". And a sensor with product code "TS0201" and manufacturer "_TZ3000_bguser20". No matter what I did, I couldn't get them to stick the pairing.

Decided to go back to Conbee II, and they instantly paired up with no issues.

fabiobaldoibs commented 1 year ago

I did a new network from scratch, and the devices that did not work before, are now ok.

So, is the problem the migration? The IEEE address?

home-assistant_zha_2023-09-02T17-54-29.772Z.log

puddly commented 1 year ago

I did a new network from scratch

Is this with the backup I provided above?

fabiobaldoibs commented 1 year ago

no, i got an error when i tried to use the backup.

but, the new network is with channel 11 too.

fabiobaldoibs commented 1 year ago

Do you want me to do anything else?

fabiobaldoibs commented 1 year ago

@puddly

This weekend I created a new Zigbee network from scratch. First, I removed all devices from my old network, and then I used new random settings.

All devices joined the network with no problems. after that, I renamed the devices to their original names.

So far everything is working perfectly.

ps. I'm using skyconnect.

BooBoss commented 1 year ago

Download the latest firmware for the SkyConnect (https://github.com/NabuCasa/silabs-firmware/raw/main/EmberZNet/beta/NabuCasa_SkyConnect_EZSP_v7.3.1.0_ncp-uart-hw_115200.gbl) and flash it with https://skyconnect.home-assistant.io/firmware-update/. You'll have to temporarily plug your SkyConnect into another computer.

After that's done:

  1. Enable ZHA debug logging (https://www.home-assistant.io/integrations/zha/#debug-logging).
  2. Stop Z2M to release access to your old coordinator.
  3. Re-configure ZHA to use the same coordinator as Z2M. You can do this from the ZHA configuration page, via the "migrate radio" option. When selecting the Z2M coordinator, make sure you select "Use existing network settings", so ZHA keeps your old network.
  4. Reboot your mains-powered devices and re-insert the batteries into your battery-powered ones. They should then announce themselves and ZHA will pick them up, without them actually joining the network.

Once that's done, see if things work. If they still do, migrate to the SkyConnect again and see what happens then.

You can upload the whole debug log here.

I had exactly the same problem as the author, with two devices: an endless pairing loop where HA says pairing is done but the device is still blinking. One of those devices was a TS0201 thermometer, which was strange because I already had a second thermometer like this one that worked perfectly fine. So why wouldn't another, identical thermometer work?

ALL I DID was upgrade the SkyConnect firmware to 7.3.1.0. Before reading this thread, it had occurred to me to try upgrading the SkyConnect firmware, which I did. BUT... mine was 7.1.x.x and the firmware offered on the official SkyConnect website was 7.2.x.x at the time. So I updated to 7.2, but it didn't help. After reading this thread I came across the link to firmware 7.3.1.0, so I updated to that version and both devices paired without any issues.

fabiobaldoibs commented 1 year ago

@BooBoss There is probably an issue with the migration process. For me, the only thing that worked was starting a new network from scratch using random parameters.

JHurk commented 1 year ago

Just my two cents; I had the same issue and first just avoided using some devices. The issue I created (with extra information) is here: https://github.com/home-assistant/core/issues/95288

Yesterday I tried adding another new device (a 4-channel Zigbee+RF dry contact switch, powered by 7-32 V), and with this device the pairing loop started again. I needed this specific device with these exact specifications and could not get a similar one. Therefore I looked for an answer again, and found the following:

My Zigbee adapter is a Home Assistant SkyConnect. I checked the firmware on the stick, which was at version 7.1.0, and found that 7.3.1 is the latest. I upgraded the firmware (very easy with the add-on in Home Assistant), and after enabling ZHA again and adding the device, it did stick, whereas before it would go straight into a pairing loop every 20 seconds. It has been almost a day now and the device hasn't dropped since.

More info on upgrading firmware of Home Assistant SkyConnect: https://github.com/NabuCasa/silabs-firmware

So might be worth checking the firmware on your Zigbee stick (whether it be SkyConnect or another one).
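
A small sketch of the version check described in the last two reports. The 7.3.1.0 threshold is simply the version that worked for the people in this thread, not an official minimum, and the function names are made up for the example.

```python
def parse_version(text: str) -> tuple:
    """Parse '7.3.1' or '7.3.1.0' into a 4-part tuple, padding with zeros."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (4 - len(parts)))


# Version reported as working in this thread (not an official requirement).
THREAD_REPORTED_GOOD = parse_version("7.3.1.0")


def at_least_reported_good(version: str) -> bool:
    """True if the firmware is at or above the version that worked here."""
    return parse_version(version) >= THREAD_REPORTED_GOOD


# "7.2" stands in for the unspecified 7.2.x.x build mentioned above.
for version in ("7.1.0", "7.2", "7.3.1", "7.3.1.0"):
    print(version, at_least_reported_good(version))
```

Comparing tuples of integers avoids the string-comparison trap where "7.10" would sort before "7.3".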

issue-triage-workflows[bot] commented 9 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

jurhey commented 9 months ago

I have the same issue. After I installed the Thread/Matter firmware, my devices were not there anymore. After I used the firmware web updater with the latest SkyConnect firmware, it's not finding any devices. I'm really not sure how to solve this.

I have tried removing ZHA completely and adding it again, and I have also rebooted my RPi 3B+ multiple times.

any ideas really appreciated

dmulcahey commented 9 months ago

I have the same issue. After I installed the Thread/Matter firmware, my devices were not there anymore. After I used the firmware web updater with the latest SkyConnect firmware, it's not finding any devices. I'm really not sure how to solve this.

I have tried removing ZHA completely and adding it again, and I have also rebooted my RPi 3B+ multiple times.

any ideas really appreciated

This issue is very stale. Please open a new issue and completely fill out the requested information.