cdpuk / ha-bestway

Home Assistant integration for Bestway / Lay-Z-Spa hot tubs
MIT License

Data update and communication error #64

Closed FunkeyMonkey closed 4 months ago

FunkeyMonkey commented 5 months ago

Version of the custom_component

Bestway device

Airjet V01 wifi model

Describe the bug

I've been getting these error logs for the last few days now

Logs


Logger: custom_components.bestway.coordinator
Source: helpers/update_coordinator.py:344
integration: Bestway ([documentation](https://github.com/cdpuk/ha-bestway), [issues](https://github.com/cdpuk/ha-bestway/issues))
First occurred: 8:06:32 AM (2 occurrences)
Last logged: 8:11:40 AM

Error fetching Bestway API data: Error communicating with API: 502, message='Bad Gateway', url=URL('https://usapi.gizwits.com/app/bindings')

Logger: custom_components.bestway.coordinator
Source: custom_components/bestway/coordinator.py:36
integration: Bestway ([documentation](https://github.com/cdpuk/ha-bestway), [issues](https://github.com/cdpuk/ha-bestway/issues))
First occurred: 8:06:32 AM (2 occurrences)
Last logged: 8:11:40 AM

Data update failed
Traceback (most recent call last):
  File "/config/custom_components/bestway/bestway/api.py", line 76, in _raise_for_status
    api_error = await response.json()
                ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/aiohttp_client.py", line 79, in json
    return await super().json(*args, loads=loads, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 1176, in json
    raise ContentTypeError(
aiohttp.client_exceptions.ContentTypeError: 0, message='Attempt to decode JSON with unexpected mimetype: text/plain; charset=utf-8', url=URL('https://usapi.gizwits.com/app/bindings')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/config/custom_components/bestway/coordinator.py", line 36, in _async_update_data
    await self.api.refresh_bindings()
  File "/config/custom_components/bestway/bestway/api.py", line 140, in refresh_bindings
    device.device_id: device for device in await self._get_devices()
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/bestway/bestway/api.py", line 145, in _get_devices
    api_data = await self._do_get(f"{self._api_root}/app/bindings")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/bestway/bestway/api.py", line 436, in _do_get
    await _raise_for_status(response)
  File "/config/custom_components/bestway/bestway/api.py", line 78, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 1070, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 502, message='Bad Gateway', url=URL('https://usapi.gizwits.com/app/bindings')

alexkrishnan commented 4 months ago

I have also been seeing this issue occasionally during the same time period. Generally, any 5xx error from a server means "I am broken" rather than "your request is bad", so the most likely explanation is that Bestway/Gizwits is having server trouble or a bad deploy, and there's nothing to fix in this HA integration.
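To make the 5xx versus 4xx distinction concrete, here is a minimal sketch. The helper name `classify_status` and its return labels are made up for illustration and are not part of ha-bestway.

```python
from http import HTTPStatus


def classify_status(status: int) -> str:
    """Return a coarse category for an HTTP status code (hypothetical helper)."""
    if 500 <= status <= 599:
        # The server failed; the same request may succeed later.
        return "server_error"
    if 400 <= status <= 499:
        # The request itself is at fault; retrying unchanged will not help.
        return "client_error"
    return "ok_or_other"


print(classify_status(HTTPStatus.BAD_GATEWAY))  # prints "server_error"
```

The 502 in the logs lands in the first branch, which is why the fault most likely sits on the server side.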

Edit: it is also possible that this client is holding onto stale IPs or isn't refreshing DNS often enough, but a cursory dive through the code doesn't point to that. It also seems unlikely because the issue started without any corresponding change to ha-bestway.

alexkrishnan commented 4 months ago

Hmm, I am now consistently seeing these 502 errors every few minutes. Since the requests do eventually go through, I suspect that the Bestway servers are just having a hard time. I tried using the first-party iOS app a few times in the past few days and I notice that the app will frequently swallow my inputs, but as a new owner of this spa/new user of the app I don't know if it's just always been this bad or if it's a new phenomenon.

I guess it's a good excuse/motivator for me to learn this codebase and improve the HTTP error handling.
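One concrete candidate: in the traceback above, `_raise_for_status` calls `response.json()` on the 502 response, and aiohttp raises `ContentTypeError` because the body is `text/plain`. A rough sketch of decoding the error body only when it really is JSON follows. It uses only the standard library, and `parse_error_body` is a hypothetical helper, not something that exists in the integration.

```python
import json
from typing import Any, Optional


def parse_error_body(content_type: str, body: str) -> Optional[Any]:
    """Decode an error body as JSON only if it claims to be JSON and parses.

    Returns None for anything else, e.g. a plain-text gateway error page.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    mimetype = content_type.split(";", 1)[0].strip().lower()
    if mimetype != "application/json":
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


# The 502 in the logs carried a text/plain body, so nothing is decoded.
print(parse_error_body("text/plain; charset=utf-8", "Bad Gateway"))  # prints None
```

With a check like this, the caller could fall through to the plain status error without the extra "During handling of the above exception" noise in the log.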

Seaniau commented 4 months ago

I've been seeing these for a few weeks now. I thought I got around them by using Repeat Until building blocks in my automations and setting the action against the spa to continue on error. But this doesn't appear to be working either: the automations don't complete properly when this error is received.

If receiving the errors cannot be circumvented, could their handling be improved in the integration so that HA automations don't break?
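For illustration, one common way to ride out short server hiccups is to retry with exponential backoff and only surface the error once the retries run out. The sketch below is an assumption about what such handling could look like, not how ha-bestway works today. `TransientApiError` and `retry_transient` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientApiError(Exception):
    """Hypothetical marker for failures worth retrying, e.g. an HTTP 5xx."""


async def retry_transient(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Await call(), retrying on TransientApiError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await call()
        except TransientApiError:
            if attempt == attempts - 1:
                # Out of retries: let the caller see the real error.
                raise
            # Wait base_delay, then 2x, then 4x, ... before trying again.
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

This does not hide failures. If the server stays down for every attempt, the last error still propagates.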

cdpuk commented 4 months ago

Closing this as there's not much we can do about 5xx errors. The best bet would be a change of approach as described in #28, as that would allow us to communicate with the spa locally, but that's a fair chunk of work.

I don't think suppressing the errors is the right answer here. However, if there's a recognised design pattern or some precedent for a different approach, I'm open to suggestions as part of a separate discussion.

alexkrishnan commented 4 months ago

There are several things we can do here, in order from least effort to most effort (hopefully I will find some time in the coming weeks to chip away at the list):