home-assistant / core

ZHA Devices Unavailable #38437

Closed flannelman173 closed 2 years ago

flannelman173 commented 4 years ago

The problem

Devices on my Zigbee network will routinely drop and become unavailable. In some instances the device info appears to be loading and "Last Seen" is up to date. All entities associated with the device have "unavailable" state. Usually re-pairing the device through ZHA will resolve the issue, but only for a time, maybe a few days. This is only happening on my Samsung devices. Aqara seems to be fine at the moment.

Here are a few screenshots.

[Screenshots: zha_issue1, zha_issue2]

Environment

Problem-relevant configuration.yaml

I only have the following modification in my config file:

#### ZHA MODIFICATION ####
zha:
  zigpy_config:
    ezsp_config:
      CONFIG_MAX_END_DEVICE_CHILDREN: 32

Traceback/Error logs

Log file attached. While logging, I also used the Configure > Add Device option on the ZHA integration page to capture more detail.

ZHA_Log.txt

Additional information

Added per request on #37442

My network consists of 18 devices at the moment. I am using the HUSBZB-1 stick and have 3 plugs for additional routers.

probot-home-assistant[bot] commented 4 years ago

Hey there @dmulcahey, @adminiuga, mind taking a look at this issue as it's been labeled with an integration (zha) you are listed as a codeowner for? Thanks! (message by CodeOwnersMention)

blackscreener commented 4 years ago

@dmulcahey, @Adminiuga Hi. I use a Sonoff Zigbee Bridge. When I disconnect it from power, I would expect the entities to become unavailable.

Adminiuga commented 4 years ago

The 0x1e8e device is off the network or it changed its address

bellows.zigbee.application] No such device 0x1583

Power cycle it by removing the battery. If that doesn't help, you need to re-pair it.

The other device is on the network but it was stored incompletely, so you have to remove it from ZHA first, make sure it was removed, and then re-pair it.

Other than that everything works as expected.

flannelman173 commented 4 years ago

This is what I typically have to do to get them to work again. My process is remove device from ZHA --> reboot HA --> remove battery from device --> ZHA pairing mode --> insert battery

This will get the device to show up again with available entities, but then a day or two later it will drop again, or another one will. Rinse and repeat.

Before anyone suggests it, I have sufficient repeaters spaced around the home, my 2.4 GHz channels are well spaced out and zigbee hub is spaced away from internet router. I had a solid mesh before these issues started.

Adminiuga commented 4 years ago

Since it has unk_manufacturer, it may not have been removed correctly/completely. Copy the device's IEEE address. Remove the device from the UI. Call the zha.remove service with ieee_address: device_ieee and make sure you get the warning "device not found for removal". Then rejoin the device by resetting it, not by pulling the battery.

flannelman173 commented 4 years ago

Stopping by to drop an update. I still have the same issues as originally posted. When a device becomes unavailable, the "Last Seen" value is current but the name and manufacturer are "UNK", as in the second image of the original post. Proceeding as suggested restores the device but does not prevent it from dropping again later.

Adminiuga commented 4 years ago

Complete these steps:

  1. Copy the device's IEEE address.
  2. Remove the device from the UI.
  3. Call the zha.remove service with ieee_address: device_ieee and make sure you get the warning "device not found for removal".
  4. Then rejoin the device by resetting it, not by pulling the battery.
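As a rough sketch of step 3, the service call could be entered in Developer Tools > Services (YAML mode) along these lines. The `ieee_address` field name is taken from the step above and may differ between Home Assistant releases, and the address shown is a made-up placeholder, not one from this thread.

```yaml
# Sketch only: replace the placeholder with the IEEE address copied in step 1.
service: zha.remove
data:
  ieee_address: "00:11:22:33:44:55:66:77"  # hypothetical placeholder address
```

The sign that the device is fully gone is the "device not found for removal" warning in the log after the call.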

flannelman173 commented 4 years ago

This is the method I referred to for restoring devices. It restores the device, but it does not prevent it from becoming unavailable again.

Adminiuga commented 4 years ago

Make sure the homeassistant user has write privileges to the zigbee.db file. Enable debug logging per the ZHA integration documentation. Post the debug log for step #3 and step #4.
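For reference, a minimal sketch of a debug logging block in configuration.yaml for a bellows/EZSP setup like the one in this thread. The choice of loggers here is an assumption; the ZHA integration documentation lists the full, current set.

```yaml
# Sketch only: a subset of loggers relevant to an EZSP (bellows) coordinator.
logger:
  default: info
  logs:
    homeassistant.components.zha: debug
    bellows.zigbee.application: debug  # source of the "No such device" lines
    bellows.ezsp: debug
    zigpy: debug
    zhaquirks: debug
```

Home Assistant needs a restart for the change to take effect, and the log grows quickly at debug level, so it is worth removing the block once the log for steps 3 and 4 has been captured.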

flannelman173 commented 4 years ago

Write privileges are good. I just noticed your comment on step # 3 - I do not get the "device not found for removal" warning when I run the service.

Adminiuga commented 4 years ago

Run it twice, but wait about 45 seconds between runs. For battery-operated devices there's a timeout of up to 45 seconds.
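If doing this by hand gets tedious, the two calls and the wait could be wrapped in a script, sketched below. The script name `zha_remove_twice` and the address are hypothetical, and the `ieee_address` field name is the one used earlier in this thread, so check the service description in your release before relying on it.

```yaml
# Sketch only: calls zha.remove, waits 45 seconds, then calls it again.
script:
  zha_remove_twice:  # hypothetical script name
    alias: "ZHA remove device twice"
    sequence:
      - service: zha.remove
        data:
          ieee_address: "00:11:22:33:44:55:66:77"  # hypothetical placeholder
      - delay: "00:00:45"
      - service: zha.remove
        data:
          ieee_address: "00:11:22:33:44:55:66:77"  # same placeholder address
```

The second call is the one expected to log "device not found for removal" if the first removal completed.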

Idskov commented 3 years ago

Same issue here (almost). Did the proposed steps fix the issue, @flannelman173, or is it still the same?

My setup: ZHA + Sonoff Zigbee Bridge (Tasmota-flashed) + 27 devices, mostly Mijia and Aqara, with 67 entities.

Everything fully up to date.

Mine do not seem to drop over time, or maybe I just haven't had it running long enough for that to happen. But whenever the bridge is powered off and back on, all devices need to be re-paired to get them out of the unavailable state. It seems to make no difference whether I remove and re-add them or just go directly to adding the devices.

mrneutron42 commented 3 years ago

I'm seeing the same "unavailable" behavior from Sonoff SNZB-02 temperature sensors. See: https://community.home-assistant.io/t/sonoff-snzb-02-temp-humidity-sensor-disconnects/231618

The temp sensors go "unavailable" and quit sending reports. I found that re-pairing them (without removing them from the Devices list in HA) will allow them to work with flakey reliability, often showing low battery readings of 40-75% (even with new batteries in the sensors). In this state, they go "unavailable" again after 5-30 minutes.

I manually added zha_map to my HA to try and diagnose the recurring "unavailable" issue.

I found that if I remove the problem temperature sensors from the HA Device list, then re-add them, they work perfectly (even showing 100% battery capacity) for some number of days. There’s something significant that happens when you remove the device from the HA database, and re-add it!

I saved a 2-day section of my HA log which includes the time period I was fighting with the temp sensor I have labeled Temp2 (Zigbee address 0x7e2e). HA log 12-20-2020 to 12-21-2020.txt

cydia2020 commented 3 years ago

Seeing the same behaviour with my ZHA setup on core-2021.1.4.

The coordinator is a Sonoff Zigbee Bridge flashed with Tasmota 9.2.0.2. It is paired with 39 devices and has 74 entities; most of these are Sonoff and IKEA motion sensors.

Edit (20210308231114): I've taken the EZSP module out of the zbbridge, hooked it up to a USB-TTL adapter, and connected it directly to my Home Assistant VM (bypassing the instability of Wi-Fi). I am still observing the same issue, but less frequently.

The Tradfri bulbs and Tradfri motion sensors become unpaired the most: the bulbs become unavailable roughly every 48 hours, and the motion sensors every week. As mentioned above, a power cycle is required for them to become functional again.

This doesn't seem to happen with Zigbee2tasmota.

Edit (20210326212515): this issue hasn't resurfaced for quite a while for me (it hasn't recurred since the last update I posted on 20210308), so unstable connections (serial over Wi-Fi) might be a factor here.

Edit (20210624): the issue resurfaced after something changed. I haven't been able to pinpoint the exact cause yet, but the setup was stable for about 3 months.

techcubs commented 3 years ago

I am having the same issue as well, using a Sonoff Zigbee Bridge flashed with Tasmota 9.2. I don't have many devices, and they are only one floor apart. The devices next to the Sonoff bridge are available. All others show as unavailable.

MattWestb commented 3 years ago

For the original poster and for users of the Sonoff Zigbee Bridge (both use a Silicon Labs coordinator), the problem is very likely that the coordinator does not set up poll control correctly on sleeping end devices. One bug has been fixed and new firmware has been built, but installing it is not recommended yet because it is only 24 hours old. Please wait a little for the devs to verify that it works OK.

For Samsung and IKEA controller devices (and others with Silicon Labs chips, which now includes the Sonoff ones), there are also some bugs that cannot be fixed on the host side, but both deCONZ and Z2M have implemented workarounds that reconfigure the poll control of end devices.

In both cases you can apply a manual workaround in ZHA: open the device card and click "reconfigure this device". If it is a sleeping device, wake it up before clicking.

Some of my IKEA E1743 remotes had not reported for over 24 hours, but after I sent "reconfigure this device" they all report to the coordinator and no longer go offline.

Shellfishgene commented 3 years ago

I have the same issue with the Sonoff temperature sensors and the Sonoff bridge. They disappear after a few hours or days. After re-pairing they send at least one value again, but usually disappear again right after that or within a few hours. This is with the updated Tasmota firmware. It's not a battery issue. The sensors work as expected if they are connected via a router rather than directly to the Sonoff zbbridge.

MattWestb commented 3 years ago

With IKEA remotes, ZHA does not set up battery (and other attribute) reporting correctly when pairing them (with luck, you get one report every 24 hours).

If you send a "reconfigure device" from the device card, and wake the device up first so it is not sleeping and can receive the commands, they start reporting every hour.

I think the same should work for your temperature sensor. At least it is worth trying.

And if you can use routers, sleeping end devices work better, as you are seeing 👍.

If you would like to try one more fix, Sonoff has released newer firmware for the ZBB that fixes some incorrect settings in the firmware: https://github.com/arendst/Tasmota/issues/10413#issuecomment-790274262

Final-Hawk commented 3 years ago

Experiencing the same issue. I have tried re-pairing several times and am currently running 6.7.9 on my Sonoff bridge. Strangely enough, this does not happen to my other sensor of the same model. It goes offline after ~6 hours.

MattWestb commented 3 years ago

@Final-Hawk Is your problem sensor connected directly to the coordinator or to a router?

And after pairing the sensor, do a reconfigure so that the binding and reporting get set up correctly.

I don't have any Sonoff sensors for testing, but IKEA controllers need this to get reporting working OK, so very likely ZHA behaves the same way with them.

MattWestb commented 3 years ago

Don't forget to wake up the sensor before doing the reconfigure, so it is not asleep and missing your commands.

Final-Hawk commented 3 years ago

@MattWestb I'm not sure what it's connected to at the moment, but I have an IKEA router nearby. I will try another reconfigure today to see what happens.

MattWestb commented 3 years ago

You can look at the network map in HA: Integrations > ZHA > Configure, then the "Visualization" tab.

The map does not update very often, but you can see how the devices are connected after doing a "refresh topology".

Final-Hawk commented 3 years ago

Alright, it refreshed after a bit and now reports being connected to my IKEA extender (router). It's annoying because the issue happens over time and is not instant, which makes it quite a process to fix.

MattWestb commented 3 years ago

It may be that the sensor is "jumping" from one parent to another and does not like its parent. If you have an OSRAM plug, you will have major problems with sensors not reporting, because the plug does not forward the data from its children.

Take a look at the network map now and then to see where your sensor is and what is good and bad.

github-actions[bot] commented 3 years ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

irqnet commented 3 years ago

I can confirm that the problem still exists with core-2021.6.6 / supervisor-2021.06.6 / Home Assistant OS 6.1.

I'm using a Conbee II USB stick, and from time to time my Osram Smart Plugs become unavailable. Re-pairing the devices solves the issue temporarily.

MattWestb commented 3 years ago

The old OSRAM smart plug is very tricky because it corrupts packets. That makes end devices leave the network when they do not get all the packets they need from the coordinator (the corrupted packets are rejected by the other devices) :-((

irqnet commented 3 years ago

Would it be solved if I switched to the Ledvance model of the plug? I have some spares I could swap in.

MattWestb commented 3 years ago

I don't have experience with the new "OSRAM", a.k.a. Ledvance, but you can always try whether it works better or not. The most important thing is to take the old plug out of the system so it can't cause problems while you are testing the new one.

Give swapping them a try; it's worth it if it works OK!

irqnet commented 3 years ago

I've replaced two OSRAM plugs that had issues over the last few days with two brand-new Ledvance plugs, and so far they remain available even after I reboot HA. So, let's see how it goes.

MattWestb commented 3 years ago

I hope the new ones don't have the problem, so that your network stays stable. I'm holding my 👍 that it works well for you!

cydia2020 commented 3 years ago

Can confirm this is still a problem. My USB-to-EZSP serial setup ran stable for about 3 months, and then most of my 30W Tradfri drivers went offline.

MattWestb commented 3 years ago

@cydia2020 Have you checked whether your Zigbee channel is being blocked by WiFi interference? Use a WiFi app to scan your surroundings, and see the wiki page for how the channels are allocated in Zigbee and WiFi: https://github.com/zigpy/zigpy/wiki/Zigbee---Changing-channel

All IKEA devices can crash because of a well-known bug in the Zigbee stack, but it should not happen as often as every 3 weeks unless you re-power your network very often. (It is triggered when devices come back after being re-powered and send a parent announcement, and it hits most vendors, not only IKEA.) ;-)

irqnet commented 3 years ago

Update: since I removed those two old Osram smart plugs that kept becoming unresponsive on a regular basis, the network has been stable for 24 hours.

Nothing else has been changed, I just removed those two crappy plugs :D It's kind of weird: you search for the issue in the central components, and then it may be a single Zigbee device that messes up the whole network.

MattWestb commented 3 years ago

Normally they work, but they corrupt packets and the network discards the bad ones. For most packets this goes OK, but some are simply "lost in space" and block the traffic, though not for all types ;-((

It would have been better if it were simply working / not working, but life is not black and white.

I hope your system stays stable and doesn't run into more problems!

I have been looking at the outdoor plug for a very long time because it looks good, but I have not bought it, and I'm glad I didn't. I still need some for the Christmas lights on the balcony. LIDL (Tuya white label) is coming out with one, but it was delayed in the store, so I have not got them yet. I still have some time to sort it out before Rudolf comes ;-)

irqnet commented 3 years ago

If I understand correctly, ZHA can update the firmware of the plugs if I enable the settings in configuration.yaml, right? I'm not sure whether it is possible to check the firmware version of the devices.

MattWestb commented 3 years ago

From the device card, go to Manage Clusters > Basic cluster > Cluster attributes > SW_Build_ID, then click "Get Zigbee attribute" and you get the string with the current version.

For OTA updates add this in your HA config:

zha:
  zigpy_config:
    ota:
      otau_directory: /config/zigpy_ota
      ikea_provider: true
      ledvance_provider: true

You can skip IKEA if you don't have devices from them. You can then put local OTA files, if you have any, in the folder, or let the system try to find the right ones.

See the wiki: https://github.com/zigpy/zigpy/wiki/OTA-Device-Firmware-Updates

irqnet commented 3 years ago

Just checked: the "old" OSRAM plugs are on V1.04.90 and the Ledvance plugs I installed yesterday are on V1.05.09. Maybe I can check whether updating the old plugs improves stability.

Found that link:

https://www.admin-enclave.com/en/articles/linux/483-use-deconz-to-perform-an-ota-update-of-osram-devices.html

MattWestb commented 3 years ago

I think it's better to wait and see whether your network stays stable, and to update the old plugs only once you know it is OK, so you don't run into new problems.

But you can download the firmware from the website now, so that you have it before they shut down the servers in August 2021!

SawadeeKC commented 2 years ago

Hi, I don't understand this. My Sonoff dongle had been working well for several months with ZHA and 13 Zigbee devices. But after the last 2 HA Core updates, all Zigbee devices became unavailable, as did the Sonoff coordinator. I had to unplug the Sonoff dongle from my NUC, restart HA, and plug the dongle back in for it to be detected, then re-pair all devices one by one. I have a second HA at another location, also with a Sonoff dongle and ZHA but with 40 Zigbee devices, and I haven't updated HA Core there because I'm afraid the same problem will happen. What could be wrong?