home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
73.55k stars 30.73k forks source link

Philips Hue - Lights stop reacting #63044

Closed Vorta closed 2 years ago

Vorta commented 2 years ago

The problem

Hello!

I'm using the Philips Hue integration and soon after the 2021.12.7 update I am not able to control lights from Home Assistant. If I reboot the core they start to work for several minutes and then stop working again.

I've added the errors from the log. The light.living_room_lights is a Zone created in the Philips Hue app. I'm using NodeRED to only send turn_on and turn_off commands with parameters for transition, brightness and color temperature.

I have also noticed that when lights receive turn_off command, they turn off all at once, but when turn_on command is issued they turn on one by one. Don't know if it's related.

I'm using the Philips hue V2 bridge (BSB002), Firmware: 1.48.1948086000.

What version of Home Assistant Core has the issue?

core-2021.12.7

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Philips Hue

Link to integration documentation on our website

https://www.home-assistant.io/integrations/hue/

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2021-12-29 22:18:08 ERROR (MainThread) [homeassistant.core] Error executing service: <ServiceCall light.turn_on (c:1ea797dfbf8c460d1043050ea34d6807): entity_id=['light.living_room_lights'], params=transition=0.5, brightness=255, color_temp=400>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 1511, in catch_exceptions
    await coro_or_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1530, in _execute_service
    await handler.job.target(service_call)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 209, in handle_service
    await self.hass.helpers.service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 663, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 896, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 700, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 494, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/hue/v2/group.py", line 186, in async_turn_on
    await asyncio.gather(
  File "/usr/src/homeassistant/homeassistant/components/hue/bridge.py", line 127, in async_request_call
    raise err
  File "/usr/src/homeassistant/homeassistant/components/hue/bridge.py", line 117, in async_request_call
    return await task(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/aiohue/v2/controllers/lights.py", line 90, in set_state
    await self.update(id, update_obj)
  File "/usr/local/lib/python3.9/site-packages/aiohue/v2/controllers/base.py", line 146, in update
    await self._bridge.request("put", endpoint, json=data)
  File "/usr/local/lib/python3.9/site-packages/aiohue/v2/__init__.py", line 191, in request
    raise BridgeBusy(
aiohue.errors.BridgeBusy: 25 requests to the bridge failed, its probably overloaded. Giving up.
2021-12-29 22:19:38 ERROR (MainThread) [homeassistant.core] Error executing service: <ServiceCall light.turn_on (c:f99dfafb970b3c9e4a3584fbeb683b43): entity_id=['light.living_room_lightstrips'], params=transition=0.5, brightness=127>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 1511, in catch_exceptions
    await coro_or_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1530, in _execute_service
    await handler.job.target(service_call)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 209, in handle_service
    await self.hass.helpers.service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 663, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 896, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 700, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 494, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/hue/v2/group.py", line 186, in async_turn_on
    await asyncio.gather(
  File "/usr/src/homeassistant/homeassistant/components/hue/bridge.py", line 127, in async_request_call
    raise err
  File "/usr/src/homeassistant/homeassistant/components/hue/bridge.py", line 117, in async_request_call
    return await task(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/aiohue/v2/controllers/lights.py", line 90, in set_state
    await self.update(id, update_obj)
  File "/usr/local/lib/python3.9/site-packages/aiohue/v2/controllers/base.py", line 146, in update
    await self._bridge.request("put", endpoint, json=data)
  File "/usr/local/lib/python3.9/site-packages/aiohue/v2/__init__.py", line 191, in request
    raise BridgeBusy(
aiohue.errors.BridgeBusy: 25 requests to the bridge failed, its probably overloaded. Giving up.

Additional information

No response

probot-home-assistant[bot] commented 2 years ago

hue documentation hue source (message by IssueLinks)

probot-home-assistant[bot] commented 2 years ago

Hey there @balloob, @marcelveldt, mind taking a look at this issue as it has been labeled with an integration (hue) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

marcelveldt commented 2 years ago

Well, the error message says it all:

aiohue.errors.BridgeBusy: 25 requests to the bridge failed, its probably overloaded. Giving up.

Meaning your bridge is rate limiting requests. How many lights are in that grouped light you're controlling ? Please be aware that only a plain turn on/off is issued to the grouped light directly, all other parameters are passed to each individual light. I can really recommend to use Hue scenes to target multiple lights at once.

Do you have anything else connected to the Hue bridge ? Node Red or other addons/services directly talking to the bridge ?

marcelveldt commented 2 years ago

Follow up question... How often are you sending commands to the lights ? The Hue bridge is a bit fragile when it comes to sending requests and is rate limited to 2 requests per 0.25s If you do not send too many requests in parallel you should never hit this rate limit.

Vorta commented 2 years ago

I have a flow that updates color temperatures for all lights that are ON in intervals of 1min. I would otherwise use Hue scenes, but color temperature changes constantly during the day.

The group in question has 14 lights. There is no other addon that issues any commands to the bridge. I only have a regular Home Assistant dashboard that shows states of all lights.

The dashboard does seem to update in intervals of 1s instead of immediately, but the system should not be using local polling anymore? I can see from hue_events that they are polled at 1s intervals, all events in between are dropped:

https://user-images.githubusercontent.com/179685/147778084-e05a61aa-51ca-4a15-aabc-09b08517ed56.mp4

All in all, I see that I will have to work on throttling the number of commands my flows send to the bridge, however, wasn't local polling removed altogether for V2 bridges with the new version of the integration?

balloob commented 2 years ago

Do you have any custom component that also interacts with Hue?

Vorta commented 2 years ago

Nope. I only use NodeRED to talk with HUE.

The only custom component I have is "Midea Smart Aircon" for my air conditoner.

Here's also the complete list of Addons I have installed (just in case):

deCONZ
ESPHome
Glances
Grafana
InfluxDB
MariaDB
Mosquitto broker
Nginx Proxy Manager
Node-RED
phpMyAdmin
Studio Code Server
Terminal & SSH
UniFi Network Application
Zigbee2mqtt
marcelveldt commented 2 years ago

Local polling is indeed completely removed but if you send commands to 14 lights this will mean 14 separate commands to the bridge every few seconds... That is way more that the bridge can handle. It can handle about 2 requests per 0.25 seconds MAX if HA is the only one talking to the bridge (and accidental use of the official app perhaps).

So your light calls every x seconds to 14 lights will really go wrong. It's still applying the previous batch when you send a new command, in the end resulting in those overloaded exceptions you experienced. In that case it already retried (with seconds delay in between) a command 25 times already.

BTW: I use the same logic as you for adaptive lightning but I do this from HA automations (blueprints!) and a much more relaxed interval and I control about 60 Hue lights without any issues.

marcelveldt commented 2 years ago

Nope. I only use NodeRED to talk with HUE.

Make sure that Node Red doesn't communicate with the Hue bridge directly but instead use the HomeAssistant plugin to talk to your Hue lights. The Node Red plugin is not updated for the new API and is probably still polling the bridge at an insane interval.

Vorta commented 2 years ago

It seems that HA is still using local polling with my bridge for some reason. I can see that with dimmer events and with how dashboards get updated when I alter the state of lights from Hue app for example. This is definitely adding some stress to the hub. If you would go to Developer Tools of your HA instance, subscribe to hue_event and start spamming a button on a dimmer or switch that you have connected to the Hue bridge, would every event be registered there or would it be only 1 each second as in my case?

NodeRED is using the Home Assistant's "call service" node to control the lights and "current state" node to check the state when updating the temperature. I'm not using extra plugins as I prefer to keep as many things as I can "vanilla" in my HA instance.

I have adjusted some flows to do less actions with lights so I am not getting the BridgeBusy error anymore. After some trial and error I noticed that the overload towards the bridge would happen when I do a Deploy in NodeRED, which wanted to update circadian for all of the lights at once. So I mitigated that and no more on-deploy actions occur.

marcelveldt commented 2 years ago

Glad you sorted it and it works with no more bridge errors now!

The fact that events come in instantly is because the bridge pushes these over a so called eventstream connection. So we do no longer have to poll the bridge to find out if something changed, the bridge simply "tells" us.

So every button/light/device event will be registered instantly (as soon as the bridge sends it) without any overload/stress to the bridge.

Vorta commented 2 years ago

But this does not seem to be the case. In my instance it is evident that Home Assistant is still using local polling. I guess this is worth opening a separate issue.

marcelveldt commented 2 years ago

Why ? It is simply not possible to poll with the V2 integration. Where do you base that on ?

Note that button events are limited to 1 per second by Hue default.

marcelveldt commented 2 years ago

You could enable debug logging for both "aiohue" and "homeassistant.components.hue" That way you see everything that happens logged.

Scope666 commented 2 years ago

You could enable debug logging for both "aiohue" and "homeassistant.components.hue" That way you see everything that happens logged.

@marcelveldt Hey, I just wanted to thank you for adding back in the option to ignore availability. I have a few bulbs that go off the grid once in a blue moon and my OCD hates the unavailable icons in HA. :)

I would have posted this in the PR but comments are locked.

Vorta commented 2 years ago

Thank you for your time and patience @marcelveldt! 😄 I picked apart the new bridge API in the last few days and I see all the limitations that it has. I wish these things were better documented in the integration's documentation.

However, any integration that can drop events (in this case, the bridge drops Dimmer's events if there are more than 1 per second) should not have Platinum quality scale as it is not fair towards others like deCONZ and Zigbee2Mqtt. Connecting the same Dimmer to one of those 2 will give you a much better experience than Hue's as no events will be dropped and they will be reported instantly, not in 1s intervals.

P.S. Sorry for dragging this issue for so long, hopefully others with the same questions as I may find it useful.

marcelveldt commented 2 years ago

Platinum quality scale is about the code quality/completeness, not about features of a manufacturer. deConz and ZHA have their own pros and cons. I don't think a 1 second rate limit on events for one device is considered an issue. To me its more a feature ;-) Why would you even want to have events at a higher rate for one button device ? Hue events are reported instantly so I really do not get what you're talking about, sorry. Is there something special that is suspected within that 1 second?

If you don't like the Hue ecosystem, switch to one of the other zigbee implementations, that's the greatness of freedom ;-)

For me personally the Hue ecocystem (its limitations considered) is the most stable smart platform in our house.

Vorta commented 2 years ago

Ah, I didn't know that. 🙂 I currently only have the lights connected to the bridge. Hue Sensors and dimmers are connected to deCONZ.

The best example of what you lose with Hue's event reporting implementation is when you're holding any of the buttons on the Dimmer. The Hue Dimmer sends press event immediately when you press the button and then continues to send long_press events every 0.8 seconds until you release. On release you instantly get a long_release event. If I were listening to hue events in Home Assistant, it wouldn't be as snappy as with deCONZ and some events would be dropped. The last long_press before long_release is a guaranteed drop, but not a big deal. But anyway - Hue's own implementation is not able to work correctly with their own device because of the reporting frequency difference.

With deCONZ all of the button events are instant. Really instant. For example, this allows me to properly interpret when dimming button is being pressed repeatedly.

Curiously, if I wasn't controlling lights through Home Assistant, but instead setup everything in the Philips Hue app, it is quite clear that their implementation on the bridge is able to listen and react to every dimmer button event, no matter how quick it is.

I also use Senic Friends of Hue switches in many rooms. They don't work like dimmers to keep sending long_press events. With them when I receive a long_press event I have to keep repeating the action until long_release is received. It would occasionally happen that any of those events gets dropped resulting in either action not started or action never ending. Especially if you try to do a "shorter" long press.

I'm now using deCONZ to get switch events instantly and without them dropping. But... I was hoping that with Hue's V2 API implementation I would be able to have all of them connected to Hue Bridge. The big benefit there is that lamps act as routers for switches in that case.

Maybe that is something that they will provide us with later, where events coming from the bridge will not be rate limited to 1 per second (per device). 😄🤞🏻

marcelveldt commented 2 years ago

OK, now I understand. Personally I've just linked the lights to the dimmers and switches with the Hue app as that is more convenient and responsive. I'm not sure why they did that 1 second rate limit on the button events. Hopefully that will change one day. Maybe you could post a message on their forum ?