upsert / lutron-caseta-pro

Custom Home Assistant Component for Lutron Caseta Smart Bridge PRO / RA2 Select
Apache License 2.0

Error when HA loses connection to Pro Bridge #1

Closed marthoc closed 6 years ago

marthoc commented 6 years ago

When HA loses its connection to the Pro Bridge (e.g. when the bridge is restarted or comes back after a power loss), the following error appears in the HA log:

Error doing job: Fatal error on transport TCPTransport (error status in uv_stream_t.read callback)
OSError: [Errno 113] Host is unreachable

When trying to control a device, the following traceback is thrown:

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/lib/python3.6/asyncio/tasks.py", line 182, in _step
    result = coro.throw(exc)
  File "/config/custom_components/lutron_caseta_pro.py", line 239, in _read_next
    read_response = yield from self._casetify.read()
  File "/config/custom_components/casetify.py", line 194, in read
    match = yield from self._read_until(CASETA_RE)
  File "/config/custom_components/casetify.py", line 184, in _read_until
    self._read_buffer += yield from self.reader.read(READ_SIZE)
  File "/usr/lib/python3.6/asyncio/streams.py", line 628, in read
    yield from self._wait_for_data('read')
  File "/usr/lib/python3.6/asyncio/streams.py", line 458, in _wait_for_data
    yield from self._waiter
  File "/usr/lib/python3.6/asyncio/futures.py", line 332, in __iter__
    yield self  # This tells Task to wait for completion.
  File "/usr/lib/python3.6/asyncio/tasks.py", line 250, in _wakeup
    future.result()
  File "/usr/lib/python3.6/asyncio/futures.py", line 245, in result
    raise self._exception
OSError: [Errno 113] Host is unreachable

So it appears that HA doesn’t try to re-establish the telnet connection once it breaks. I’ve found that only restarting HA makes the Caseta devices reachable again.

upsert commented 6 years ago

Confirmed on Home Assistant 0.61.1. I will look into it.

Just pulling the cable on the Lutron box didn't have the same effect. Curious.

upsert commented 6 years ago

Added some re-connection code. If it loses the connection it will attempt a new connection every 60 seconds.
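
For illustration only, a reconnect loop of the kind described above could look roughly like the sketch below. It is not the component's actual code: `read_forever`, `connect`, `handle` and `max_attempts` are hypothetical names, the sketch uses `async`/`await` rather than the `yield from` style seen in the traceback, and it skips the bridge login step entirely. The 60-second delay is the only detail taken from this comment.

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)

RECONNECT_DELAY = 60  # seconds between attempts, as described in the comment


async def read_forever(connect, handle, delay=RECONNECT_DELAY, max_attempts=None):
    """Read from the bridge and reconnect whenever the connection fails.

    connect: hypothetical coroutine function returning (reader, writer).
    handle: callback invoked with every chunk of bytes that is read.
    max_attempts: optional cap on connection attempts (None retries forever);
        it exists mainly so the sketch can be exercised in a test.
    Returns the number of connection attempts that were made.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        writer = None
        try:
            reader, writer = await connect()
            while True:
                data = await reader.read(1024)
                if not data:
                    # An empty read means the peer closed the connection.
                    raise ConnectionError("bridge closed the connection")
                handle(data)
        except OSError as err:
            # OSError covers "[Errno 113] Host is unreachable" as well as
            # ConnectionError and its subclasses.
            _LOGGER.warning("Connection lost (%s), retrying in %s s", err, delay)
        finally:
            if writer is not None:
                writer.close()
        await asyncio.sleep(delay)
    return attempts
```

With the standard library, `connect` could be something like `lambda: asyncio.open_connection("192.0.2.10", 23)`, where the address and port are placeholders. Because the exception is caught inside the loop, it no longer escapes the task, which is what would produce the "Task exception was never retrieved" message in the original report.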

I tested it by unplugging the bridge for a few minutes and plugging it back in. It works OK; the only downside is that device state is not updated after a reconnection. I'm not exactly sure how to access all the devices from that part of the code, so there is definite room for improvement.

I also found that if you unplug and re-plug within a minute or so, the timeouts may not occur because no exceptions are thrown, so a short disconnection may not trigger a reconnect. This only applies to a power cut to the bridge. A loss of network connection is different and usually recovers.
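
One general way to surface this kind of silent disconnect is TCP keepalive, which asks the operating system to probe an idle connection so that a dead or rebooted peer eventually shows up as a socket error. The thread does not say this was tried or adopted; the sketch below only shows the technique, and `enable_keepalive` and its default values are hypothetical.

```python
import socket


def enable_keepalive(sock, idle=30, interval=10, count=3):
    """Turn on TCP keepalive probes for an existing socket.

    idle: seconds of silence before the first probe is sent.
    interval: seconds between probes.
    count: unanswered probes before the OS reports the connection as dead.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options are platform-specific (all three exist on Linux),
    # so apply only the ones this Python build exposes.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

With standard asyncio streams the underlying socket is normally available through `writer.get_extra_info("socket")`. Whether the same holds under other event loop implementations is something to verify before relying on it.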