RobertD502 / home-assistant-flair

Custom component for Home Assistant Core for Flair pucks, vents, rooms, structures, and minisplits
MIT License

Vents stop responding about once/day, reloading integration fixes it #26

Closed bfenty closed 2 years ago

bfenty commented 2 years ago

I'm finding the integration works really well, but approximately once every 24 hours it seems to stop responding. Vents don't open/close but they still show available in HA.

Reloading the integration makes them immediately become available again and responsive.

I've noticed other people saying they're getting API requests going out every few seconds. Looking at my DNS records, I'm seeing something similar. Maybe this is hitting some sort of rate limiter?

Thanks for your great work on this integration.

RobertD502 commented 2 years ago

Do you have any logs from around the time this occurs? Also, do you have pihole set up within your network?

bfenty commented 2 years ago

I do have pihole. I don't know exactly when they stop working, so I'm not sure which part of the logs to check. Is there something I should look for that would give an indication? I will happily provide them.

RobertD502 commented 2 years ago

Whenever you notice that they stop working, check your logs and post the entries relevant to the Flair integration here (include the timestamps as well; each log entry lists the first occurrence, number of occurrences, and last occurrence). I did some digging, and the common denominator seems to be pihole. So, you're right in assuming that you could possibly be rate limited as a result of flooding their servers. It may be worthwhile changing the rules and rerouting your HA instance to not use the pihole and see if the issue persists. I use the Adguard Home addon and haven't had any issues - I was a former pihole user so I understand not everyone wants to make the change.

bfenty commented 2 years ago

Thanks. The HA server is also the pihole server lol. I'm running in docker.

It's odd that pihole would be limiting it but I'll try to cut that variable out and get back to you with logs as well. Do you mean pihole logs, HA logs, or both?

bfenty commented 2 years ago

Also, do you have a list of URLs that have to be accessed? Maybe one is getting blocked? I searched for "flair" and all that traffic is going through.

RobertD502 commented 2 years ago

> Thanks. The HA server is also the pihole server lol. I'm running in docker.

If you're running PiHole through HA....I'm assuming you're referring to running PiHole as an addon container within Home Assistant OS. Even if that is the case, you can still set a rule for any local traffic (local as in initiated by your HA instance) to not be routed through PiHole, but keep everything else on the network running through PiHole.

> Do you mean pihole logs, HA logs, or both?

Just the HA Flair logs. If you want you can also include a screenshot of the dns queries to "api.flair.co" in pihole showing the timestamp (no need to paste the entire log of dns queries to "api.flair.co", just the first page is sufficient)

> Also, do you have a list of URLs that have to be accessed? Maybe one is getting blocked? I searched for "flair" and all that traffic is going through.

All the URLs for Flair's API start with "api.flair.co". What comes after it depends on the device and then also its unique ID. It's extremely unlikely that you'd be getting blocked on specific device endpoints - what will help the most are the HA logs if they show any unusual responses from Flair's servers.

Edit:

Just as an example....I had a call to Flair's servers fail at 3:32 PM, which I know by looking at my HA log:

image

However, looking at Adguard Home's query log I'd think that everything is fine:

image

Adguard Home, just as would be the case if I were using PiHole, had no problem resolving the DNS query to Flair, but I won't see the server response in my Adguard Home query log. For that I need to rely on the logs created within Home Assistant for the Flair integration.

bfenty commented 2 years ago

image

bfenty commented 2 years ago

I'm not sure if this is normal but I'm getting a lot of these time out messages.

RobertD502 commented 2 years ago

I get those as well, but they aren't what is causing your problems.

Edit: How many vents do you have in your setup?

bfenty commented 2 years ago

Ok, got an error that might be what you're looking for.

To clarify my setup, I have 2 pucks, 20 vents. I am running HA as a docker image under Ubuntu. I am also running pihole as a docker image under ubuntu on the same machine. I installed the integration manually by copying the folders to the correct directories.

Now that I'm paying attention, it's roughly 8-10 hours before it fails. Automations that include the vents run, but the vents do not update. I am only trying 0% or 100% open/close, nothing in between (it appears the vents only support 0%, 50%, and 100%, with nothing in between those values updating).

As suspected it appears that the URL is not connecting for some reason.

Logger: homeassistant.helpers.entity
Source: custom_components/flair/select.py:390
Integration: Flair ([documentation](https://github.com/RobertD502/home-assistant-flair/blob/main/README.md), [issues](https://github.com/RobertD502/home-assistant-flair/issues))
First occurred: 12:00:08 AM (14 occurrences)
Last logged: 6:14:12 AM

    Update for select.closet_activity_status fails
    Update for select.laundry_activity_status fails
    Update for select.workout_activity_status fails
    Update for select.bedroom_activity_status fails
    Update for select.study_activity_status fails

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Try again

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f26963dd190>: Failed to establish a new connection: [Errno -3] Try again

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 440, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 785, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.flair.co', port=443): Max retries exceeded with url: /api/rooms/73845 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f26963dd190>: Failed to establish a new connection: [Errno -3] Try again'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 515, in async_update_ha_state
    await self.async_device_update()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 743, in async_device_update
    raise exc
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/config/custom_components/flair/select.py", line 390, in update
    self._room.refresh()
  File "/usr/local/lib/python3.9/site-packages/flair/rooms/room.py", line 13, in refresh
    room_state = self.api.refresh_attributes(ROOMS, self.room_id)
  File "/usr/local/lib/python3.9/site-packages/flair/flair_helper.py", line 159, in refresh_attributes
    return self.client.get(resource_type, id)
  File "/usr/local/lib/python3.9/site-packages/flair_api/client.py", line 221, in get
    requests.get(
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.flair.co', port=443): Max retries exceeded with url: /api/rooms/73845 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f26963dd190>: Failed to establish a new connection: [Errno -3] Try again'

bfenty commented 2 years ago

I have also just now set a custom DNS on the HA docker container, which should completely bypass Pihole. I'll see if the issue happens again.
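
Editor's note: the traceback above bottoms out in `socket.getaddrinfo` raising `socket.gaierror: [Errno -3] Try again`, which suggests the failure is in name resolution rather than in a response from Flair's servers. On Linux, -3 is the value of `socket.EAI_AGAIN` (a temporary lookup failure). The following is a minimal sketch, not part of the integration, for checking from inside the HA container whether a hostname currently resolves. The function name `resolves` and its `resolver` parameter are assumptions introduced for illustration.

```python
import socket
import time


def resolves(host, port=443, attempts=3, delay=1.0, resolver=socket.getaddrinfo):
    """Return True if `host` resolves within `attempts` tries.

    `resolver` is injectable (hypothetical parameter) so the function can be
    exercised without touching the network.
    """
    for attempt in range(1, attempts + 1):
        try:
            resolver(host, port, 0, socket.SOCK_STREAM)
            return True
        except socket.gaierror as err:
            # EAI_AGAIN is the "temporary failure" code; on Linux it is -3,
            # matching "[Errno -3] Try again" in the traceback.
            transient = err.errno == socket.EAI_AGAIN
            print(f"attempt {attempt}: {err} (transient={transient})")
            if not transient:
                return False
            if attempt < attempts:
                time.sleep(delay)
    return False
```

Calling `resolves("api.flair.co")` from a Python shell in the container while the vents are unresponsive would show whether lookups are failing at that moment.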

RobertD502 commented 2 years ago

My guess is that it should work now that you don't have pihole involved. I will be releasing an update shortly that uses a forked version of Flair's API wrapper which uses a requests Session object and reuses the TCP connection to Flair's servers. Their current wrapper creates a new connection for every request made. I would hold off on this update until you can verify if your problem is resolved/not resolved now that you set a custom DNS.

If the problem is fixed by using a custom DNS: update the flair integration to the new version (0.0.5.8) and restore your DNS settings to use PiHole again - I'm curious to see if using a Session object helps with the PiHole problems or if the problem returns.

If the problem persists: Update the flair integration to the new version (0.0.5.8) and restore your DNS settings to use PiHole again - Again, I'm curious to see if using a Session object fixes the problem when a custom DNS didn't.
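
Editor's note: as a rough illustration of the change described above, here is a minimal sketch of a client that holds one `requests.Session` and sends every request through it, instead of calling `requests.get` each time. The class `SessionClient` and its methods are hypothetical and are not the actual `flair_api` client or the forked wrapper. Authentication is omitted, and the URL layout is only inferred from the paths in the tracebacks (for example `/api/rooms/73845`).

```python
import requests

API_ROOT = "https://api.flair.co"  # host taken from the tracebacks in this thread


class SessionClient:
    """Hypothetical wrapper that reuses one Session for all requests."""

    def __init__(self, timeout=10.0):
        # A Session keeps a connection pool, so repeated requests to the
        # same host can reuse an already open connection.
        self.session = requests.Session()
        self.timeout = timeout

    def url_for(self, resource_type, resource_id):
        # Assumed URL shape, e.g. /api/rooms/73845.
        return f"{API_ROOT}/api/{resource_type}/{resource_id}"

    def get(self, resource_type, resource_id):
        # Auth headers are left out of this sketch.
        resp = self.session.get(
            self.url_for(resource_type, resource_id), timeout=self.timeout
        )
        resp.raise_for_status()
        return resp.json()

    def close(self):
        self.session.close()
```

The only design point is that `self.session` outlives individual calls; a module-level `requests.get` creates and discards a session on every call.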

bfenty commented 2 years ago

Well, in my experience it should've failed by now, but it hasn't. I'm not ready to call it for another day or two at least, but so far simply disabling pihole seems to be helping. I still don't understand WHY, which is going to bother me, as pihole handles all my network traffic just fine and this would be the first thing it has caused an issue with. I've had it for years. That said, if enabling a different DNS fixes it...well then great. I'm happy to experiment a bit as my vent control isn't mission-critical for anything yet. I'm hoping this issue can help someone else solve their problem.

What I did, exactly: In my docker run script, I added --dns="1.0.0.1" --dns="1.1.1.1"

What this does is add Cloudflare's DNS servers for the docker container and overrides the host's DNS settings. You don't have to use Cloudflare, it was just one I knew off the top of my head.

I'll monitor and report back if I continue to have issues.

RobertD502 commented 2 years ago

After a day or two of testing, would you mind updating to version 0.0.5.9 of this integration and reverting to using the pihole? Just curious to see if reusing the TCP connection keeps pihole happy.

bfenty commented 2 years ago

Certainly. I'll leave the issue open until then? It also still seems to be responding more than 24 hours later, which is a good sign.

bfenty commented 2 years ago

I'm noticing that certain vents are going offline and then coming back on their own. It's certainly an improvement since disabling pihole. I have not reset it for going on 48 hours now.

Here's some potentially related logs:

This error originated from a custom integration.

Logger: homeassistant.helpers.entity
Source: custom_components/flair/sensor.py:591
Integration: Flair (documentation, issues)
First occurred: June 22, 2022 at 4:31:38 PM (68 occurrences)
Last logged: 7:12:14 PM

Update for sensor.flair_vent_workout_a202rssi fails
Update for sensor.flair_vent_living_room_00adrssi fails
Update for sensor.flair_vent_living_room_6750rssi fails
Update for sensor.flair_vent_laundry_a2cbrssi fails
Update for sensor.flair_vent_playroom_697frssi fails

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Try again

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f57cc0ec550>: Failed to establish a new connection: [Errno -3] Try again

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 440, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 785, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.flair.co', port=443): Max retries exceeded with url: /api/vents/d7e5aecc-de64-5101-0503-dd291520d7f1 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f57cc0ec550>: Failed to establish a new connection: [Errno -3] Try again'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 515, in async_update_ha_state
    await self.async_device_update()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 743, in async_device_update
    raise exc
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/config/custom_components/flair/sensor.py", line 591, in update
    self._vent.refresh()
  File "/usr/local/lib/python3.9/site-packages/flair/vents/vent.py", line 13, in refresh
    vent_state = self.api.refresh_attributes(VENTS, self.vent_id)
  File "/usr/local/lib/python3.9/site-packages/flair/flair_helper.py", line 159, in refresh_attributes
    return self.client.get(resource_type, id)
  File "/usr/local/lib/python3.9/site-packages/flair_api/client.py", line 221, in get
    requests.get(
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.flair.co', port=443): Max retries exceeded with url: /api/vents/d7e5aecc-de64-5101-0503-dd291520d7f1 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f57cc0ec550>: Failed to establish a new connection: [Errno -3] Try again'))

RobertD502 commented 2 years ago

So the newest release that I put out includes changes that I made to Flair's API wrapper that now uses a Session object - the version you are running right now uses their untouched wrapper. The difference: the changes I made results in reusing the connection to Flair's servers instead of creating a new connection for every request. Every request is still being made, but more efficiently. So, the errors you are showing are much less likely to occur with this new method. However, given that this is still a cloud-based API, we can never fully eliminate server connection problems and read timeouts.

Since you've been running it without the pihole for a while now, I think you're good to upgrade to the latest version of this integration and turn pihole back on. Let me know if the integration remains functional given the Session object changes that were introduced.
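
Editor's note: the difference between "a new connection for every request" and "reusing the connection" can be observed with a small, self-contained experiment. The sketch below swaps in the standard library (`http.server` and `http.client`) for `requests` so that it runs against a throwaway local server without any network access. All names in it (`CountingHandler`, `count_connections`) are made up for this illustration and have nothing to do with the integration's code. It counts how many TCP connections the server sees for the same number of GET requests.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow keep-alive on the server side
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"{}"
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_connections(reuse, n=5):
    """Send n GET requests and return how many connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if reuse:
            # One connection, many requests (what a Session makes possible).
            conn = http.client.HTTPConnection(host, port)
            for _ in range(n):
                conn.request("GET", "/api/vents/example")
                conn.getresponse().read()
            conn.close()
        else:
            # A fresh connection per request.
            for _ in range(n):
                conn = http.client.HTTPConnection(host, port)
                conn.request("GET", "/api/vents/example")
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

Comparing `count_connections(reuse=False)` with `count_connections(reuse=True)` shows the per-request style opening one connection per call, while the reused connection serves every call. In the tracebacks in this thread, `getaddrinfo` is called from `_new_conn`, so each new connection also involves a hostname lookup.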

bfenty commented 2 years ago

OK, I downloaded and installed the latest (just deleted/re-copied the flair folder into custom_components, then reloaded HA).

I'll report back if I continue to have any issues.

I know that remote APIs will always have the possibility of issues; I wish that Flair would release a local API option.

Thanks for your support.

RobertD502 commented 2 years ago

The potential is there, but who knows if they will. Their puck uses an ESP8266, if I remember correctly. The Bond hub (RF Fan, Fireplace, Shades, etc) also uses an ESP at its core and they implemented a local REST API on it.

bfenty commented 2 years ago

So far the new version is working with pihole. Maybe another 24 hours is necessary but it's already lasted longer than before.

RobertD502 commented 2 years ago

Great to hear! Keep me posted. I'll close this issue if it is still going strong then.

bfenty commented 2 years ago

Well, it ran all weekend with no issues. I think it's safe to call this one fixed. Well done.

RobertD502 commented 2 years ago

Thanks! Also, it was @fcfort that pointed out Flair's wrapper doesn't use a Session, so, he's the MVP here. PiHole is a bit needy - I've found other instances of integrations having trouble with PiHole which are only fixed by using a Session.