home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
69.87k stars 28.97k forks source link

HUE spamming error LOG: Error fetching light data: #32380

Closed Kugelfang666 closed 4 years ago

Kugelfang666 commented 4 years ago

The problem

Since updating from 0.105.X to 0.106.1 my log becomes spammed with errors. All hue related items become briefly unavailable in Lovelace and then come back. This is true for all devices on both my hue bridges and always happened simultaneously for both bridges.

Environment

Problem-relevant configuration.yaml

Traceback/Error logs

2020-02-29 09:21:33 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching group data: 
2020-02-29 09:21:33 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-02-29 09:14:20 ERROR (MainThread) [homeassistant.components.hue.sensor_base] Error fetching sensor data: 

Additional information

I also have the custom integration "Hue sensor advanced" running on my setup.

probot-home-assistant[bot] commented 4 years ago

Hey there @balloob, mind taking a look at this issue as its been labeled with a integration (hue) you are listed as a codeowner for? Thanks!

badrobit commented 4 years ago

This is also an issue on 105.5 (just reverted). IP Addresses removed for obvious reasons.

2020-03-01 17:51:00 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 576, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/components/hue/light.py", line 375, in async_turn_on
    await self.bridge.async_request_call(self.light.set_action(**command))
  File "/usr/src/homeassistant/homeassistant/components/hue/bridge.py", line 108, in async_request_call
    return await coro
  File "/usr/local/lib/python3.7/site-packages/aiohue/groups.py", line 80, in set_action
    json=data)
  File "/usr/local/lib/python3.7/site-packages/aiohue/bridge.py", line 78, in request
    ) from None
aiohue.errors.RequestError: Error requesting data from ###.###.###.###: [Errno 104] Connection reset by peer
2020-03-01 17:51:00 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-03-01 17:52:00 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 576, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/components/hue/light.py", line 375, in async_turn_on
    await self.bridge.async_request_call(self.light.set_action(**command))
  File "/usr/src/homeassistant/homeassistant/components/hue/bridge.py", line 108, in async_request_call
    return await coro
  File "/usr/local/lib/python3.7/site-packages/aiohue/groups.py", line 80, in set_action
    json=data)
  File "/usr/local/lib/python3.7/site-packages/aiohue/bridge.py", line 78, in request
    ) from None
aiohue.errors.RequestError: Error requesting data from ###.###.###.###: [Errno 104] Connection reset by peer
2020-03-01 17:52:00 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-03-01 17:53:52 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-03-01 17:54:00 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-03-01 17:55:00 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-03-01 17:55:00 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 576, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/components/hue/light.py", line 377, in async_turn_on
    await self.bridge.async_request_call(self.light.set_state(**command))
  File "/usr/src/homeassistant/homeassistant/components/hue/bridge.py", line 108, in async_request_call
    return await coro
  File "/usr/local/lib/python3.7/site-packages/aiohue/lights.py", line 117, in set_state
    json=data)
  File "/usr/local/lib/python3.7/site-packages/aiohue/bridge.py", line 78, in request
    ) from None
aiohue.errors.RequestError: Error requesting data from ###.###.###.###: [Errno 104] Connection reset by peer
2020-03-01 17:55:00 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
balloob commented 4 years ago

@badrobit that was fixed in 106.

@Kugelfang666 please try without any Hue related custom component.

Mariusthvdb commented 4 years ago

HI. just chiming in to let you know on 106.5 this is still observed very frequently, and Hue light go unavailable beyond a workable situation..

2020-03-05 16:47:58 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching group data: 
2020-03-05 16:48:18 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching group data: 
2020-03-05 16:48:23 ERROR (MainThread) [homeassistant.components.hue.sensor_base] Error fetching sensor data: 
2020-03-05 16:48:27 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-03-05 16:48:42 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching group data: 
Kugelfang666 commented 4 years ago

@Mariusthvdb:

Are you using the custom integration "Hue sensor advanced"?

Mariusthvdb commented 4 years ago

if you are referring to @robmarkcole 's Custom integration I have used that for ages, and every time I test the Hue integration in Hass.io, I take it out, because dev team wont/can't accept issues with this alive in the setup. It does seem to intensify this issue, so be sure to take it out for testing purposes.

Schermafbeelding 2020-03-05 om 17 15 59

happening as we speak

Kugelfang666 commented 4 years ago

ao i just removed @robmarkcole custom integration from my setup to try it without. Unfortunately my setup heavily relies on this component since im using the hue dimmer switches all over to control non hue stuff via HA.....

balloob commented 4 years ago

@Mariusthvdb you said in another issue that you use the Rest sensor to hit the Hue API too. Disable those too. Hue only supports a limited number of requests per second. The error shows up when the Hue hub is overwhelmed.

Kugelfang666 commented 4 years ago

I can confirm the issue also occurs without the custom integration. This makes sense since I was using the combination since months without any issues and they popped up just after the update to 0.106 without any change to my hue setup

Mariusthvdb commented 4 years ago

@Mariusthvdb you said in another issue that you use the Rest sensor to hit the Hue API too. Disable those too. Hue only supports a limited number of requests per second. The error shows up when the Hue hub is overwhelmed

Correct, I have 1 rest sensor now that checks the lights for ‘reachable’ which is needed to distinguish lights being truly unavailable because of no power, or unavailable because of this issue. setting allow_unreachable: true shows these lights as on/off so we need a way to get the ‘reachable’ attribute in the HA state machine.

Screenshot ![94FCC638-207D-42E8-893C-857952BE89F7](https://user-images.githubusercontent.com/33354141/76008776-81cc8080-5f10-11ea-8cf3-34654497c3be.png)

Will take it out for testing also. Thanks

balloob commented 4 years ago

@Kugelfang666 how many hue devices do you have ?

Kugelfang666 commented 4 years ago

Hue Bridge 1 5x motion detectors, 11x Dimmer switches 23x Lamps

Hue Bridge 2 2x motion detectors, 3x Dimmer switches 1x Hue Tap 10x Lamps

balloob commented 4 years ago

For the record, the Hue custom integration is absolute trashing your Hue hubs. It doesn't leverage existing caches, it doesn't cache itself either across the different platforms and is just hammering your hubs non-stop. It doesn't just break Home Assistant, I bet a lot of other stuff communicating with Hue (like Harmony) is also hurting by it.

Kugelfang666 commented 4 years ago

ok, fair point, the only thing i don't understand is why the error popped out just after the update to 0.106. I'm always keeping a close eye on the error log and this is the first time I saw it.

Also as I wrote earlier, I just uninstalled the custom component and removed the entries from config.yaml and after just 30 min the same error was in the log

is there an easy way to diagnose which devices (on my network) or integrations from HA are polling my bridges?

Kugelfang666 commented 4 years ago

I just updated to the new version which has the lower polling rate (0.5) , I'll give it a spin and report back,

gernerally speaking, having the hue dimmer switches available in HA as sensors for me is a major asset since I'm heavily using them to trigger automation, control devices... Hence, having a "proper" solution would be fantastic.

balloob commented 4 years ago

I would get yourself a Conbee 2 stick. Instant sensors/remote updates instead of polling for changes. The Hue Hub API just isn't meant for this.

Kugelfang666 commented 4 years ago

sorry off topic:

which I actually already have, unfortunately deCONZ had two limitations for me:

  1. it does not allow to configure the scenes which are loaded in case of motion being detected as a function of time. Let's say 7-23 Scene A, 23-7 Scene B

  2. I could not manage to make two dimmer switches within one group cycle through different sets of scenes. Switch 1 Scene 1,2,3 , Switch 2 Scene 3,1,2,6,7

  3. I found the scene transitions to be a bit rougher than with the hue brigade (but im not entirely sure about this one)

I was about to move my whole setup over but 1. and 2. were a deal breaker for me. Currently im using the iOS App iconnecthue to manage my hue bridge which actually allows for all these functions.

Kugelfang666 commented 4 years ago

Also with the reduced poll rate I still get the errors in my brige

Kugelfang666 commented 4 years ago

i have quite some command line switches in my setup to deactivate my hue motions sensors:

switch:
  - platform: command_line
    switches:
      hue_schedule_we_switch:
        friendly_name: "Fade Light In"
        command_on: curl -X PUT -d '{"on":true}' "192.168.178.36/api/XXXXXXX/sensors/49/config"
        command_off: curl -X PUT -d '{"off":true}' "192.168.178.36/api/XXXXXXX/sensors/49/config"
        value_template: "{{ is_state('sensor.arbeitszimmer_motion_sensor', 'on') }}"

could it be that this is also hammering the bridge to check ists state? I'm just trying to diagnose if something else in my setup might also be contributing to this issue

Mariusthvdb commented 4 years ago

I would get yourself a Conbee 2 stick. Instant sensors/remote updates instead of polling for changes. The Hue Hub API just isn't meant for this.

to try and take out the need for the custom integration (which holds a great deal of extra info than the current Core does), we can add some of these extra attributes to the core HA integration, hope this will be allowed. Next to these attributes, we really need to integrate the Hue remotes into HA. Is anyone working on that, or is an architecture discussion needed for that before doing so.

balloob commented 4 years ago

@Kugelfang666 that won't hit the server unless you're turning them on/off. However that value template is sourced from the hub…

@Mariusthvdb you don't need an architecture issue, you need a developer to fix it. Remotes however will never work great, as it's a polling API so things will never be instant unless you hammer the hub, which we won't allow, as that's causing this issue to begin with.

Kugelfang666 commented 4 years ago

Thanks for the feedback.

Is the value template regularly polled or just checked when the switch is operated?

balloob commented 4 years ago

Value template is only evaluated when the sensor.arbeitszimmer_motion_sensor state changes. It does not interact directly with the hub to get the state.

Kugelfang666 commented 4 years ago

ok thanks for pointing this out, the only thing I still don't understand is why this error just popped out with the latest HA update, I'm 100% positive at there was no such entry in the log. could it be that earlier it was not reported?

Mariusthvdb commented 4 years ago

@Mariusthvdb you don't need an architecture issue, you need a developer to fix it. Remotes however will never work great, as it's a polling API so things will never be instant unless you hammer the hub, which we won't allow, as that's causing this issue to begin with.

Yes I understand that, and wouldn't want to do so. Thats why I had the Custom integration poll per second, not 0.1 second and that worked fine. Would you consider that too much still? If not, I think this would be just fine to have in core HA.

Mariusthvdb commented 4 years ago

fwiw, Ive deleted all Custom integrations and other rest sensors to the hue hub, except for 2 that only run once a day, and moved over to all core Hue sensors.

Which is a complete let down so far: where I saw only the hue lights go unavailable before (and all Custom integration sensors where rock solid, as they have always been), I now witness all these sensors go unavailable also! So this truly has nothing to do with the custom integration (though relieving the Hub couldn't hurt) and cause should be sought elsewhere.

really am tempted to revert the change but willing to persevere if another way-out can be found. Would love to stick to core integrations... but they should be functional

Now how to proceed.

Schermafbeelding 2020-03-06 om 23 19 29 Schermafbeelding 2020-03-06 om 23 19 04

and a few template sensors built on the core sensor binary...

Schermafbeelding 2020-03-06 om 17 03 53

fwiw, this is the production system, no manual editing in the source files on local dev. plain simple 106.5 Hassio

Kugelfang666 commented 4 years ago

that is perfectly in line whit what I saw. For me removing the custom integration also did not resolve the problem.

Mariusthvdb commented 4 years ago

R you using Custom Header? Taking that out eases up a lot in my setup..

Kugelfang666 commented 4 years ago

?? I can't follow, what are you referring to?

Mariusthvdb commented 4 years ago

https://github.com/maykar/custom-header

But it was only my hope speaking I fear... still everything Hue (and Hue only) going unavailable..

fear this all is more fundamental and has also to do with the connection between frontend and backend: as I have reported for many times, on each Lovelace refresh, Hue shows unavailable. No matter what. Can't imagine the Hub has issues with a frontend refresh, so wouldn't this suggest some logic bug in the Hue code for showing unavailable?

Kugelfang666 commented 4 years ago

no I'm not using the custom header

marriotb commented 4 years ago

I thought I should add my bit of information to the conversation: I've been running Hue with Home Assistant for a couple of months now. I have a small installation: a single Hue bridge listening to three movement sensors and two dimmer controls and directly controlling 5 lights and a light strip. It also sends data to Home Assistant.

I'm running with Home Assistant on HassOS on a Raspberry Pi.
The bridge (BSB002) is on firmware 1937045000. Home Assistant is v 209; HassOS is 3.12.

I can't tell you the version number at earlier times, but the Hue bridge, Hassio and Home Assistant are kept up to the latest general release software version at any time, with rarely more that a day or so delay from release date.

I'm running no custom components that are likely to be directly involved with the the Hue. I've even turned off Node Red, which was being used to monitor the state of some Hue devices in order to control other non-Hue things.

The log messages show up in pairs:

2020-03-09 16:31:09 ERROR (MainThread) [homeassistant.components.hue.sensor_base] Error fetching sensor data: 2020-03-09 16:31:09 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data:

(The order may vary)

The messages occur whether or not the Lovelace inteface is being actively used. The sample above was definitely when the UI was idle.

They also occur independently of activity on the sensors. The Home Assistant logbook shows that at the time of the above records, all devices on the bridge became marked as 'changed to unavailable'. A few seconds later (16:31:15) they all returned to the state they were in previously. (Quote Bob Dylan: "everyone commenced to do what they were doin’ before he turned their heads")

Suggestions as to things I could do to help diagnose the issue would be gratefully received:-)

azogue commented 4 years ago

Hi @marriotb, nice report :)

A question though, you say that you have 'three movement sensors and two dimmer controls', are the dimmers configured in HA? (I think they are not implemented in HA Core, and you're saying you have no related custom_components)

What's the frequency of those error pairs? I also have them, but only once in HA startup.

The messages occur whether or not the Lovelace interface is being actively used

This could be interesting. Any custom lovelace card related to hue devs?

balloob commented 4 years ago

@Mariusthvdb wrote:

on each Lovelace refresh, Hue shows unavailable

Do you have many graph cards in your UI that are rendering historic data? Can you try adding an empty dashboard and refreshing there. I wonder if there is something in your UI that just swamps Home Assistant, causing it to stall for enough time that the request times out.

Note that in 106 we show better errors why a request failed (timeout, something else).

I'll add a tweak so that sensors and ligths don't check for updates at the same time. That might help a bit.

Mariusthvdb commented 4 years ago

well, tbh, not really no. Have a few graphs of course, but not on my lights pages. It really doesn't matter on which view I am though. Refreshing Lovelace caused the Hue entities to go unavailable.

Have been thinking about the unavailable and reachable thing some more. couldn't it be you are showing unavailable now for way too many situations? If we think of unavailable as an indicator of Hue communication issues, and only that, we shouldn't have to show unavailable at all for entities that are off power. They are simply unreachable, which has its own representation on the Hue hub.

fyi: I have for the sake of it reinstalled the now improved Custom integration, (mainly because all core Hue sensors showed unavailable all the time, just as frequently as the lights did, never noticed that before..) and with the exception of the startup sequence, I havent seen a single data fetching error yet.

If you are tweaking the Hue integration: would it be feasible to have the creation of the (binary_)sensors optional? That way users can actively select the sensors if they want/need them, and if not, have many entities less to worry about.

About the errors: they are really not very informative:

Schermafbeelding 2020-03-09 om 17 26 41 Schermafbeelding 2020-03-09 om 17 26 52

as said, it has quieted down significantly now, hope this stays for a while.

Thanks for the tweak, anything that can alleviate this is most welcome! very much appreciated, and let me know how to help any further.

balloob commented 4 years ago

You can disable any entity via the entities page in config (or more info -> settings). However, all data for all sensors is fetched at once.

Mariusthvdb commented 4 years ago

of course! thanks for reminding me... sorry.

just so you know: unavailable also shows when checking the config, or restarting it for that matter. It comes during the check and before restart, but thought it was worth mentioning.

secondly: the home-assistant loader fails to report the custom integration for Hue_custom..... this message was introduced specifically for Hue custom at the time, so it is quite remarkable it misses out on it.

barramundi commented 4 years ago

Hello. It is happening here also. In 0.106.5 every few minutes i get all hue entities unavailable with these logs

[homeassistant.components.hue.light] Error fetching group data: [homeassistant.components.hue.light] Error fetching light data: [homeassistant.components.hue.sensor_base] Error fetching sensor data:

Reverted to old snapshots (versions 0.106.3 and 0.105.2), still the same problem. May be related with last hue bridge update (02/24)? Really frustrating..

balloob commented 4 years ago

@Mariusthvdb what platform do you run HA on?

barramundi commented 4 years ago

Info about mine: Raspbian GNU/Linux 10 (buster) in a raspberry pi 4. Worked perfectly for more than eight months and no big changes in my configuration for more than three.

marriotb commented 4 years ago

@azogue wrote:

are the dimmers configured in HA?

No. The dimmers are only configured using the Hue iPhone app Accessory setup component to control one or more of the lights. The state of each light is available to HA via the inbuilt Hue driver. Any automations in use depend only on those HA state values.

I suspect that this approach induces a slight delay in response (this was certainly the case using Node-RED), but I'm happy to wear that. The native Hue automation is sufficient to, for example, turn on a light as I enter a room; any secondary activity (scene lights, HiFi amp, monitors, etc) has no great impact if it is delayed a second or so.

What's the frequency of those error pairs? I also have them, but only once in HA startup.

It's hard to see a pattern. Approximately twice an hour, but very random around that. I attach the log entries for the past 22 hours if you'd like to see the distribution. I've included both the homeassistant.components.hue.sensor_base and homeassistant.components.hue.light errors, because they don't always both occur.

I note, now, that there hasn't been an event now for about 7 hours! I have no explanation for that - no updates have been made to HA or Hue.

The messages occur whether or not the Lovelace interface is being actively used

This could be interesting. Any custom lovelace card related to hue devs?

I am using the HACS plug-in slider-entity-row on some of the Hue lights, but I don't think that's specific to Hue devices.

marriotb commented 4 years ago

Oops. I forgot to attach the log extract: Log extract 2020-03-10.log

Mariusthvdb commented 4 years ago

@Mariusthvdb what platform do you run HA on?

Hassi.io 106.5 Rpi4 4 gb

Reported above things had quieted down, but unfortunately they have not disappeared:

Schermafbeelding 2020-03-10 om 07 03 44

this happens with and without the Custom integration Hue sensors. Somehow this seems to happen alongside script execution too. Cant pin it down just yet to a specific script.

marriotb commented 4 years ago

Yes, mine went quiet for 10 hrs:

2020-03-10 04:17:39 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 2020-03-10 04:17:39 ERROR (MainThread) [homeassistant.components.hue.sensor_base] Error fetching sensor data: 2020-03-10 14:43:54 ERROR (MainThread) [homeassistant.components.hue.sensor_base] Error fetching sensor data: 2020-03-10 14:43:54 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data:

I'm on same HW & core SW as @Mariusthvdb , except my RPi is only 2GB

Mariusthvdb commented 4 years ago

just to report that todays refactoring of the custom Hue integration to 2.13, streamlining it to the new hue datamanager, and with a eased down scan_interval of +1 seconds, still doesn't prevent the above errors to appear/happen. As said, does happen without it also.

balloob commented 4 years ago

@Mariusthvdb try to disable all custom integrations and hassio add-ons and see if it happens. If you get timeouts that often, including for other stuff, it means that something is breaking your system, probably doing I/O in the event loop, or that your Pi 4 is unable to handle all the load that you put on it.

Kugelfang666 commented 4 years ago

mhm thats really od. especially since im running HA on a i3 NUC. There should be plenty of power to run even demanding tasks...

THATDONFC commented 4 years ago

I'd like to add that I am seeing these errors as well. I hadn't seen them before upgrading to 0.106.5 I'm running Home Assistant in Docker (generic Linux install) on a rpi4 4gb. I do not run any custom hue components. My hue devices have always worked fine and I have never seen any errors relating to hue.

Here are the latest log entries I found related to this issue:

2020-03-09 14:18:13 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching group data: 
2020-03-09 14:18:35 INFO (MainThread) [homeassistant.components.hue.light] Fetching group data recovered
2020-03-10 13:30:20 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching light data: 
2020-03-10 13:30:20 ERROR (MainThread) [homeassistant.components.hue.light] Error fetching group data: 
2020-03-10 13:30:20 ERROR (MainThread) [homeassistant.components.hue.sensor_base] Error fetching sensor data: 
2020-03-10 13:30:26 INFO (MainThread) [homeassistant.components.hue.light] Fetching light data recovered
2020-03-10 13:30:26 INFO (MainThread) [homeassistant.components.hue.light] Fetching group data recovered
2020-03-10 13:30:26 INFO (MainThread) [homeassistant.components.hue.sensor_base] Fetching sensor data recovered
Mariusthvdb commented 4 years ago

@Mariusthvdb try to disable all custom integrations and hassio add-ons and see if it happens. If you get timeouts that often, including for other stuff, it means that something is breaking your system, probably doing I/O in the event loop, or that your Pi 4 is unable to handle all the load that you put on it.

sure will do. please allow me 3 questions:

the 'cant reach Hue at 192.168.1.212' error messages seem to have simply been replaced by these new ones...

balloob commented 4 years ago

@Mariusthvdb in your screenshot there is also an error from pyhaversion being unable to fetch data.

Yes, this is the same cause as the "cant reach Hue" messages from before. And it's very weird that opening the frontend is causing this. That almost feels like your system is coming to a halt. How many lines is your lovelace config? I'm interested to see if you can limit it to see if it impacts it. There are significant changes related to Lovelace configs in 107 too.