tuya / tuya-iot-python-sdk

Tuya IoT Python SDK for Tuya Open API.
MIT License

Add exponential backoff to MQTT client to prevent loop breaking on API failure #57

Closed ZephireNZ closed 2 years ago

ZephireNZ commented 2 years ago

This fixes #56, where API errors cause the MQTT reconnect to fail and the loop to stop, after which the MQTT client no longer receives updates from the server.

I have implemented exponential backoff (capped at 60 seconds), but I am happy to take feedback on whether the cap should be lower or higher.
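
For readers unfamiliar with the technique, capped exponential backoff can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `backoff_delays`, `reconnect_with_backoff`, the 1-second base delay, and the doubling factor are all assumptions made for the example; only the 60-second cap comes from the description above.

```python
import time


def backoff_delays(base=1.0, cap=60.0, factor=2.0):
    """Yield delays that grow geometrically and never exceed `cap` seconds.

    `base` and `factor` are illustrative defaults, not values from the PR.
    """
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect_with_backoff(connect, sleep=time.sleep, base=1.0, cap=60.0,
                           max_attempts=None):
    """Call `connect` until it succeeds and return the number of attempts.

    `connect` is a hypothetical callable standing in for whatever fetches
    the MQTT config and reconnects. With `max_attempts=None` the loop keeps
    retrying instead of dying on the first API error.
    """
    attempts = 0
    for delay in backoff_delays(base, cap):
        attempts += 1
        try:
            connect()
            return attempts
        except Exception:
            if max_attempts is not None and attempts >= max_attempts:
                raise
            # Wait before the next attempt; the wait stops growing at `cap`.
            sleep(delay)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A production version would likely also add jitter so that many clients do not retry in lockstep.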

ZephireNZ commented 2 years ago

I am thinking of adding retries to the commands as well, since that is the other call that keeps failing.

However, that can get a bit trickier, as someone may manually retry the service call, resulting in two calls being made instead of one.

Thoughts?

For example:

2022-02-06 20:46:04 DEBUG (Thread-7) [tuya_iot] _on_log: Sending PINGREQ
2022-02-06 20:46:05 DEBUG (Thread-7) [tuya_iot] _on_log: Received PINGRESP
2022-02-06 20:47:05 DEBUG (Thread-7) [tuya_iot] _on_log: Sending PINGREQ
2022-02-06 20:47:05 DEBUG (Thread-7) [tuya_iot] _on_log: Received PINGRESP
2022-02-06 20:47:21 DEBUG (SyncWorker_0) [homeassistant.components.tuya] Sending commands for device 078443178caab57ec4c8: [{'code': <DPCode.SWITCH: 'switch'>, 'value': True}, {'code': <DPCode.MODE: 'mode'>, 'value': 'cold'}]
2022-02-06 20:47:21 DEBUG (SyncWorker_0) [tuya_iot] Request: method = POST,                 url = https://openapi.tuyaus.com/v1.0/devices/078443178caab57ec4c8/commands,                params = None,                body = {'commands': [{'code': <DPCode.SWITCH: 'switch'>, 'value': True}, {'code': <DPCode.MODE: 'mode'>, 'value': 'cold'}]},                t = 1644133641122
2022-02-06 20:47:26 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140328344721152] HTTPSConnectionPool(host='openapi.tuyaus.com', port=443): Max retries exceeded with url: /v1.0/devices/078443178caab57ec4c8/commands (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa0bd581190>: Failed to establish a new connection: [Errno -3] Try again'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Try again

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 358, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fa0bd581190>: Failed to establish a new connection: [Errno -3] Try again

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 440, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 785, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openapi.tuyaus.com', port=443): Max retries exceeded with url: /v1.0/devices/078443178caab57ec4c8/commands (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa0bd581190>: Failed to establish a new connection: [Errno -3] Try again'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 190, in handle_call_service
    await hass.services.async_call(
  File "/usr/src/homeassistant/homeassistant/core.py", line 1630, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1667, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 204, in handle_service
    await self.hass.helpers.service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 668, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 921, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 705, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/climate/__init__.py", line 470, in async_set_hvac_mode
    await self.hass.async_add_executor_job(self.set_hvac_mode, hvac_mode)
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/tuya/climate.py", line 275, in set_hvac_mode
    self._send_command(commands)
  File "/usr/src/homeassistant/homeassistant/components/tuya/base.py", line 271, in _send_command
    self.device_manager.send_commands(self.device.id, commands)
  File "/usr/local/lib/python3.9/site-packages/tuya_iot/device.py", line 488, in send_commands
    return self.device_manage.send_commands(device_id, commands)
  File "/usr/local/lib/python3.9/site-packages/tuya_iot/device.py", line 649, in send_commands
    return self.api.post(
  File "/usr/local/lib/python3.9/site-packages/tuya_iot/openapi.py", line 316, in post
    return self.__request("POST", path, None, body)
  File "/usr/local/lib/python3.9/site-packages/tuya_iot/openapi.py", line 266, in __request
    response = self.session.request(
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='openapi.tuyaus.com', port=443): Max retries exceeded with url: /v1.0/devices/078443178caab57ec4c8/commands (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa0bd581190>: Failed to establish a new connection: [Errno -3] Try again'))

frenck commented 2 years ago

> Thoughts?

I would not add retries to that one. Instead, catch those exceptions and wrap them in an exception specific to this library, e.g., a TuyaCommandError or something like that.

It is up to the implementer at that point to retry or not.
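
A rough sketch of what such wrapping could look like, under stated assumptions: `TuyaCommandError` is only the name floated above and is not shown to exist in the SDK, `post` is a hypothetical stand-in for the SDK's HTTP call (assumed here to take a path and a body), and the path mirrors the URL seen in the log. Catching `OSError` relies on the fact that the exceptions raised by `requests` derive from `IOError`/`OSError`.

```python
class TuyaCommandError(Exception):
    """Hypothetical library-specific error for a failed command call."""

    def __init__(self, device_id, commands, cause):
        super().__init__(
            f"Sending commands to device {device_id} failed: {cause}"
        )
        self.device_id = device_id
        self.commands = commands


def send_commands(post, device_id, commands):
    """Send commands once; wrap transport failures, never retry.

    `post(path, body)` is an assumed signature for the HTTP helper.
    """
    path = f"/v1.0/devices/{device_id}/commands"
    try:
        return post(path, {"commands": commands})
    except OSError as err:
        # Keep the original exception as __cause__ so callers can inspect it.
        raise TuyaCommandError(device_id, commands, err) from err
```

With this shape, a caller only needs to catch one exception type, and the original network error stays available through `__cause__` for logging.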

zlinoliver commented 2 years ago

Hi @ZephireNZ, thanks for the contribution; we will merge this PR. @frenck, your suggestion is also reasonable, but most of our integration users don't have a technical background and may not know how to handle these exceptions, so @ZephireNZ's solution is more suitable for now.

balloob commented 2 years ago

@zlinoliver Frenck's feedback was not on this PR, but on the idea of retrying commands. I agree with Frenck. When a command fails, don't retry; raise instead. The user won't see those exceptions, but HA or any other application integrating this SDK will. Let the application decide whether to retry. Otherwise, the application cannot distinguish between a long-running command and errors being retried under the hood.
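
To illustrate the application side of this argument, here is a minimal sketch of a caller that owns the retry policy. It assumes the SDK raises a single, hypothetical `TuyaCommandError` per failed call; `call_with_app_retry` and its defaults are likewise made up for the example.

```python
import time


class TuyaCommandError(Exception):
    """Hypothetical SDK error, assumed to be raised once per failed command."""


def call_with_app_retry(send, attempts=3, delay=0.5, sleep=time.sleep):
    """Application-side policy: the SDK fails fast, the caller decides.

    `send` is any zero-argument callable that issues the command once.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except TuyaCommandError as err:
            last_error = err
            if attempt < attempts:
                sleep(delay)
    # Every attempt failed: surface the last error to the application.
    raise last_error
```

Because the retry count lives in the application, it can report "failed after 3 attempts" instead of seeing only one unexplained slow call.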