home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Powerview Polling causing timeout errors #73900

Closed kingy444 closed 6 months ago

kingy444 commented 2 years ago

I put some logging in the new polling implemented under https://github.com/home-assistant/core/pull/73659. That PR resolved https://github.com/home-assistant/core/issues/70043 but has unfortunately created an issue where shades will time out from time to time.

While the hub can process the requests the shades cannot always return a result in a timely manner. I have 5 hardwired shades but only 2 enabled for polling and still see these around every 6 hours or so.

These shades are all TDBU, which the current code would poll twice. However, that is unrelated: testing shows the same timeouts under the PR that resolves the double polling: https://github.com/home-assistant/core/pull/73899

Originally posted by @kingy444 in https://github.com/home-assistant/core/pull/73659#issuecomment-1162798204

bdraco commented 2 years ago

If it's still chatty with your fix, I think it would be fine to suppress any errors from the single polls, since we should get errors from the main coordinator poll.

probot-home-assistant[bot] commented 2 years ago

hunterdouglas_powerview documentation hunterdouglas_powerview source (message by IssueLinks)

probot-home-assistant[bot] commented 2 years ago

Hey there @trullock, mind taking a look at this issue as it has been labeled with an integration (hunterdouglas_powerview) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

chrisjenx commented 2 years ago

Same issue here. The polling overloads the unit and then crashes it. The only fix is a hard reboot of the hub.

Man I hate HD

chrisjenx commented 2 years ago

I have 30 shades FYI, so the unit becomes inoperable, is there a way to disable polling or change frequency?

bdraco commented 2 years ago

sigh. I guess we need an option to disable polling as well.

I'm traveling this week so I won't be able to add it too soon.

I suggest manually patching should_poll to False in the meantime.

chrisjenx commented 2 years ago

Yeah no rush, seems to take about a week to finally build up

bdraco commented 2 years ago

Actually you can already disable it so no code change needed

bdraco commented 2 years ago

Under system options for the config entry, turn off

Enable polling for updates.

If Home Assistant should automatically poll Hunter Douglas PowerView entities for updates.

chrisjenx commented 2 years ago

I thought that disables polling the powerview completely?

glyph-se commented 2 years ago

> I have 30 shades FYI, so the unit becomes inoperable, is there a way to disable polling or change frequency?

I have only 7 shades. Since upgrading HA a few versions ago I need to restart my Powerview every few days.

Thought about opening an issue then found this one. Still trying to debug what causes it and how to reliably reproduce it.

chrisjenx commented 2 years ago

> > I have 30 shades FYI, so the unit becomes inoperable, is there a way to disable polling or change frequency?
>
> I have only 7 shades. Since upgrading HA a few versions ago I need to restart my Powerview every few days.
>
> Thought about opening an issue then found this one. Still trying to debug what causes it and how to reliably reproduce it.

It's odd, since I posted that I haven't had to reboot the controller.. I control most of my shades via scenes which are way more reliable than single control.

glyph-se commented 2 years ago

> > > I have 30 shades FYI, so the unit becomes inoperable, is there a way to disable polling or change frequency?
> >
> > I have only 7 shades. Since upgrading HA a few versions ago I need to restart my Powerview every few days. Thought about opening an issue then found this one. Still trying to debug what causes it and how to reliably reproduce it.
>
> It's odd, since I posted that I haven't had to reboot the controller.. I control most of my shades via scenes which are way more reliable than single control.

I just updated to 2022.7.6 and my PowerView hub hung right after.

I have tried manually restarting Home Assistant a few times, but the hub doesn't seem to be affected by this. Thinking back, the other times it hung could be after other updates, I haven't seen it in a while, and 2022.7.5 is quite old.

Could it be the same for you @chrisjenx ?

Shall I open a new issue for this?

Edit: Now it hung after restart, so either only sometimes, or there is a delay.

glyph-se commented 2 years ago

My powerview hub was working fine for 19 days, I just updated to 2022.8.2 (from 2022.7.6) and it hung itself again.

chrisjenx commented 2 years ago

Yeah, it seems to happen when it sends a bunch of commands to the PowerView all at once (which I'm guessing happens after a reboot/restart when the component is reloaded). I have mine powered by a PoE splitter, and I'm debating creating an automation that just reboots the PV at like 4am.

kingy444 commented 1 year ago

I have a PR for the issue this ticket was originally raised for but this ticket spawned a set of users with a different issue where the PowerView Hub would lock up after an update

Can any of you advise on the below:
  1. Does this issue still occur?
  2. Does the issue occur only on update? I can’t see why it wouldn’t also occur on reboot.
  3. How many hardwired PowerView shades do you have?
  4. Have you come across any other weirdness?

I have 5 hardwired shades, all top down bottom up. The continued polling seems to put their calibration out, and I have to recalibrate every couple of weeks, so I have actually set them to battery powered in the app to avoid polling altogether (and the issue went away).

chrisjenx commented 1 year ago

Yes, it still happens for me. I have 34 wired shades. Update is the worst, as it refreshes the integration, which seems to overload the PV.

The best solution I found is to create an automation that reboots the PV hub every few days. That seems to clear out whatever memory leak the device has.

glyph-se commented 1 year ago
  1. I haven't seen it in a while, though I don't know what could have changed.
  2. I have only seen it on Home Assistant update, though I rarely do reboot otherwise. But I did try rebooting a few times without experiencing the issue
  3. None wired, 7 battery powered
  4. I have some connection issues where one out of 2-3 shades sometimes lose connection with the hub. I do believe this is due to bad wireless coverage of the powerview network. I currently have one extender, but I don't think this is enough. The 3 closest shades always work. I also have one shade stuck on FW 1.8, where all my others have FW 1.10

wez commented 1 year ago

FWIW, I also see this from time to time, unless I turn off polling in system options. I have 33 wired shades. Regarding connectivity/coverage: I have a secondary hub and several repeaters. I haven't needed to reboot the secondary hub to resolve hass issues.

@chrisjenx does your reboot automation do anything to test whether the problem is currently manifesting, or does it just reboot every couple of days?

chrisjenx commented 1 year ago

To be honest I forgot to set up the power cycling. It's been a while since I've had to reboot the unit tbh, but with winter the automation keeps the shades open to increase solar gain, so it won't move them as much.

trullock commented 1 year ago

I have this happening regularly, I currently have to reboot my hub at least weekly, which is really annoying.

This started happening in an update made in the last few months, definitely since this https://github.com/home-assistant/core/issues/70763

I'll help debug if you let me know what to provide

Also happy to work on a fix if someone can get me up to speed on what's happening.

Update

I have 3 wireless Silhouette type shades.

HD app reports Version 3.1.6 build 66895

Repeaters' firmware is at 2.0.2928

I haven't been able to determine what causes this. It could be due to rebooting HA, but I don't think this is always the case. Sometimes I just come to manually move the blinds, or let an automation open/close them, and they don't respond. The official HD app also doesn't affect the blinds at that point. I have to reboot the hub to make them work again.

kingy444 commented 1 year ago

@bdraco there are 2 separate issues here

the one the ticket was raised for relates to the hub timing out (as in the hub not responding fast enough, rather than the hub returning timedout=true)

the second is the one that probably relates to “problem with device”

I believe I have a fix for the former, and have a PR mostly finished but it takes a bit to test as the timeout cannot be forced

the latter I have also experienced, but less consistently, again making it hard to troubleshoot. @trullock I don’t believe logs will help, as it isn’t HA freezing, it is the HD Hub. My best guess is the hub struggles to process multiple shades returning a forced update at once. I was going to look at whether there is any way to stagger the calls but don’t know if that is possible. @bdraco is it possible to do that?

Otherwise we probably need to disable polling by default @bdraco as it is highly unlikely HD do anything with the Hub with Gen 3 now out

edit: just adding it’s definitely not taking a massive amount of shades to cause this issue - I too had mine freeze with just 3 shades. I actually have all mine (including the hardwired) set to battery atm because I haven’t had much time and the hub freezing is a pain

bdraco commented 1 year ago

> I was going to look if there is anyway to stagger the calls but don’t know if that is possible. @bdraco is it possible to do that?

__init__.py:PARALLEL_UPDATES = 1
cover.py:PARALLEL_UPDATES = 1

We already have PARALLEL_UPDATES set to 1 so it should only poll one at a time.

> Otherwise we probably need to disable polling by default @bdraco as it is highly unlikely HD do anything with the Hub with Gen 3 now out

It would be great if we could come up with a way to trigger the issue so we can work around it (if possible). I haven't been able to get mine to freeze.

bdraco commented 1 year ago

It might make sense to have a central polling coordinator that polls, waits 10, polls the next one, etc so it gives the hub some time between polls
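
The staggered polling idea above can be sketched roughly as follows. This is only an illustration, not code from the integration: `poll_staggered` and `refresh_shade` are hypothetical names, and the "waits 10" is read as 10 seconds, which is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def poll_staggered(
    shade_ids: Iterable[int],
    refresh_shade: Callable[[int], Awaitable[None]],
    delay_seconds: float = 10.0,
) -> list[int]:
    """Refresh shades one at a time, pausing between requests.

    Returns the IDs of shades whose refresh timed out, so the caller
    can decide whether to log or suppress them.
    """
    timed_out: list[int] = []
    ids = list(shade_ids)
    for index, shade_id in enumerate(ids):
        try:
            # refresh_shade is a hypothetical stand-in for the per-shade poll.
            await refresh_shade(shade_id)
        except asyncio.TimeoutError:
            # A single slow shade should not abort the whole cycle.
            timed_out.append(shade_id)
        # Give the hub time between requests, but skip the pause
        # after the final shade.
        if index < len(ids) - 1:
            await asyncio.sleep(delay_seconds)
    return timed_out
```

With 30 shades and a 10 second gap, one full cycle would take roughly five minutes, so the delay would need to be weighed against how stale the reported positions are allowed to get.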

glyph-se commented 1 year ago

I'm not sure if it is related to any of the above issues or not, but I believe that HA sends commands too fast to the hub. I don't use scenes, but rather automations to move my blinds 2 or 3 at a time.

About 5-10% of the time only some of them move. Then, when I run the automation again, the other blinds move. Thus I don't think it is related to the hub hanging. It could possibly be a bad wireless connection to the hub, but I think I would have seen other issues (e.g. with the app) if that were the case.

trullock commented 1 year ago

@kingy444 the timeout can't be forced, but it does happen to me fairly regularly. If you publish the fix behind a flag I can run it for a few weeks and see if there's an improvement.

downercc commented 1 year ago

I'm having the same issues. About 10% of the time, 2-3 of my shades do not move. And every week or so, all shades stop working (via PowerView app or Home Assistant) and I have to reboot the hub to get it working again. I have 10 battery powered shades with 1 hub and 3 repeaters. Although a minor issue in the scheme of things, with a pregnant wife who expects the shades to be down when she needs to nap, I'm trying to find a fix to prevent a tongue lashing about my "not-so-smart home."

To anyone that created an automation, script, node red flow, or otherwise to reboot the HD hub, would you mind pasting the code? I'm relatively new with HA and do not have a coding background so I'm trying to teach myself. If not, no worries--just thought I'd ask.

wez commented 1 year ago

@downercc our sanity has been saved by doing this:

in /config/integrations, on the PowerView integration, click on the three dots and select "System Options", then turn off "Enable polling for Updates".

The consequence is that hass doesn't always know the precise state of the covers, but that's fine for our particular usage.

trullock commented 1 year ago

@wez re consequences, if I only control the blinds via HA, there should never be a state sync issue, right? (assuming the blinds' batteries aren't flat, or similar, so they have always moved as instructed)

wez commented 1 year ago

@trullock I don't think it hurts anything; it's just the reported state that may be stale/diverged. TBH, I don't automate via HA: my scheduled automations are configured on the powerview hub (way more reliable manipulating all the shades at sunrise/sunset that way) and my family tend to use the built-in homekit integration for deliberate control.

downercc commented 1 year ago

Thanks so much @wez! I'll give it a shot.

issue-triage-workflows[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

trullock commented 1 year ago

This is still a real issue Mr Bot

chrisjenx commented 1 year ago

It's still an issue but I see it far less thankfully

trullock commented 1 year ago

Happened to me twice last week with polling off, but before that it hadn't happened for a long time.

wez commented 1 year ago

still an issue for me

kwridan commented 1 year ago

Firstly thanks to the maintainers and contributors for this integration 👍

I've encountered the same issue too recently, and disabling polling did help. I was still curious what exactly overwhelms the hub. From a few experiments I've observed the following:

The PowerView hub

Are there cases which cause this to be incorrect?

Home assistant integration - polling

The powerview integration is polling /api/shades every minute rather than every single shade individually.

This looks good 👍, I wonder if the hubs have some other issue internally that causes it to die after a large number of requests (over the course of a day or week) 🤔

Home assistant integration - updating position

Performing a shade position update in home assistant results in the following calls

Compare this to what the native PowerView app does

Home Assistant makes two extra refresh calls, which do result in RF commands being sent. Are the extra refresh calls necessary? It appears the hub reports the correct positions when performing GET /api/shades without the refresh.

I suspect the refresh calls taking place alongside the shade update position call is causing the hub to go out of sync as it queries the shades mid operation and confuses its internal state. This can add up in an automation where more than one shade is controlled at the same time 🤐

Home assistant integration - initial setup / restart

When Home Assistant first sets up the PowerView integration, or when it restarts, it performs the following calls

Understandably on first setup home assistant is trying to ensure it has the most accurate state of the shades, this could be the culprit of what overwhelms the hub (e.g. if users have been restarting their hub due to changing a configuration / adding new integrations, etc...).

Suggestions

trullock commented 1 year ago

Nice work @kwridan

@bdraco can perhaps say why the extra calls are there, my guess would either be because of some previously observed staleness needing correcting or just habitual coding...

bdraco commented 1 year ago

Sometimes the position the hub returns is stale and it takes a few calls to get the right position. If we dropped the extra calls we end up reporting the wrong position for an extended period of time.

kwridan commented 1 year ago

Ah I see, thanks for clarifying.

Interestingly, from my testing I did notice that stale positions occur as a result of the consecutive shade update and refresh calls, and in that case another refresh call at the very end is indeed required to get it back in sync. I am possibly missing other scenarios where stale positions are reported.

Would it be possible to queue the refresh call until a later time rather than making it immediately after the update call? The tricky bit is working out a good delay for this so that it does not incur any additional overhead.

It's unclear if this in particular is the source of the hub timeouts - the hub doesn't seem very reliable nor robust 😞
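
As a rough sketch of the deferred refresh suggested above: the helper below waits for a quiet period after the last move command and then refreshes once. `DelayedRefresh` and the `refresh` callable are hypothetical names for illustration only, the right delay is unknown, and nothing here is taken from the integration's code.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class DelayedRefresh:
    """Run one refresh some time after the most recent move command."""

    def __init__(
        self, refresh: Callable[[], Awaitable[None]], delay_seconds: float
    ) -> None:
        self._refresh = refresh  # hypothetical stand-in for the hub refresh call
        self._delay = delay_seconds
        self._task: Optional[asyncio.Task] = None

    def schedule(self) -> None:
        """Restart the countdown; must be called from the event loop.

        A pending refresh is cancelled, so a burst of move commands
        produces a single refresh once things go quiet.
        """
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.create_task(self._run())

    async def _run(self) -> None:
        await asyncio.sleep(self._delay)
        await self._refresh()

    async def wait(self) -> None:
        """Wait for the pending refresh, if any (mainly useful in tests)."""
        if self._task is not None:
            try:
                await self._task
            except asyncio.CancelledError:
                pass
```

Note that calling `schedule()` while a refresh is already in flight cancels that refresh too. Whether that is acceptable depends on how the hub reacts to an abandoned request, which this sketch does not try to answer.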

bdraco commented 1 year ago

If we have a way to reliably reproduce the problem we can probably come up with a solution to avoid it, but without knowing for sure what triggers the hub to flake out, any solution we build is just a guess and could make the behavior worse.

bdraco commented 1 year ago

If it turns out to be a problem with concurrent requests, we could set parallel updates to 0 to turn off Home Assistant's internal semaphore and then wrap every update and API call with a semaphore to prevent parallel requests or polling. But if that isn’t the issue, we would have made the whole thing much slower for everyone and still not solved the problem.
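
A minimal sketch of that semaphore wrapping, under the assumption that every hub request can be funnelled through one helper. `SerializedHub` and its `call` method are made-up names for illustration and do not exist in the integration.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class SerializedHub:
    """Run hub requests strictly one at a time."""

    def __init__(self) -> None:
        # One permit means polls and commands share a single queue.
        self._semaphore = asyncio.Semaphore(1)

    async def call(self, request: Callable[[], Awaitable[T]]) -> T:
        """Wait for the permit, then run the request and return its result."""
        async with self._semaphore:
            return await request()
```

Every update and API call would then go through `hub.call(...)`. As noted above, this trades throughput for safety, so it only pays off if concurrency is actually what upsets the hub.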

kingy444 commented 1 year ago

I would expect this GET /api/shades/<id>?refresh=true for all shades sequentially (calls ~1s apart) to be the main culprit of the issues as most users report on the reboot of HA - from my experience the subsequent calls on a move aren't extensive on the hub as the shade is already active. I have only experienced this issue once (12 shades) and have not been able to force it to debug.

Smashing all shades with a refresh may seem excessive, but if we don't HA cannot know the shade position is up to date or not. Unfortunately many people still use the pebble remotes and not HA solely to move shades, and then those users complain that position is not represented correctly in HA.

Interestingly though is that this issue only got reported after we added polling for hardwired shades and the logic for refresh=true has been there forever.

Re the multiple refreshes on a shade move - decision just needs to be made whether we trust the shade position is in the position the app sent - or actually check the shade has moved. Personally I have definitely seen the command not move the shade position and need to send it again. Without the extra refresh HA wouldn't know the position was incorrect

trullock commented 1 year ago

Is the only way to get it out of sync by using the remotes? If so we could add a switch to enable the extra polling?

kingy444 commented 1 year ago

I mean technically you could add a configuration entity on each shade to determine if it needed additional polling - @bdraco from an architecture view that sounds like something HA wouldn’t want to implement. Happy to be wrong 😊

I’m thinking a Boolean config entry on each shade - would need the decision if the extra polling is on or off by default too (I would assume on, as to maintain current state, but could see benefit in setting to off by default)
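
The trade-off behind such a per-shade Boolean could look something like the sketch below: with `verify` on, the position is read back after a move and the command is resent on a mismatch; with it off, the command is trusted and no extra refresh is made. All names (`move_shade`, `send_position`, `read_position`, `verify`) are hypothetical and only illustrate the decision, not the integration's actual code.

```python
import asyncio
from typing import Awaitable, Callable


async def move_shade(
    send_position: Callable[[int], Awaitable[None]],
    read_position: Callable[[], Awaitable[int]],
    target: int,
    verify: bool = True,
    max_attempts: int = 2,
    settle_seconds: float = 0.0,
) -> bool:
    """Send a position; optionally confirm it and resend on mismatch.

    Returns True if the move was confirmed, or if verification is off.
    """
    for _ in range(max_attempts):
        await send_position(target)
        if not verify:
            # Trust the command: no extra refresh hits the hub.
            return True
        # Let the shade finish moving before asking where it is.
        await asyncio.sleep(settle_seconds)
        if await read_position() == target:
            return True
    return False
```

Users who only ever move shades through Home Assistant could turn `verify` off, while those who also use the pebble remotes would likely want to keep it on.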

glyph-se commented 1 year ago

For me this issue was less common during the winter than the summer of the year. During the winter I only control 3 shades, but now I control all 8 using Home Assistant.

I think it would be great to have a configuration option to poll less. This gave me the idea to add delays around all calls in my automations. It might have improved things, but I will need some more time until I can say for sure.

issue-triage-workflows[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

trullock commented 1 year ago

This annoys me weekly

chrisjenx commented 1 year ago

Yeah, still broken. I have an automation that reboots the hub nightly.