infinite loop if light unreachable

swbova commented 4 years ago

The light is wired to a wall switch. Apparently, when the switch is off, the IP address of the light is unreachable.
This appears to result in an infinite loop in the discover phase. After a sufficient amount of time, this seems to have crashed my HA instance.

I looked into it a little. Perhaps incrementing self._error_count around line 690, where the OSError exception is caught, would fix this? Here is the first error reported:

2020-07-04 21:19:08 ERROR (MainThread) [homeassistant.core] Error doing job: Fatal read error on socket transport
Traceback (most recent call last):
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 801, in _read_ready__data_received
    data = self._sock.recv(self.max_size)
OSError: [Errno 113] No route to host
2020-07-04 21:19:08 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connection lost
2020-07-04 21:19:08 WARNING (MainThread) [aiosenseme.device] Kitchen Light: Connection lost
2020-07-04 21:19:08 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Updater task cancelled
2020-07-04 21:19:09 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connecting
2020-07-04 21:19:12 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connect failed, try again in a minute
Traceback (most recent call last):
  File "/srv/homeassistant/lib/python3.7/site-packages/aiosenseme/device.py", line 739, in _listener
    PORT,
  File "/usr/lib/python3.7/asyncio/base_events.py", line 959, in create_connection
    raise exceptions[0]
  File "/usr/lib/python3.7/asyncio/base_events.py", line 946, in create_connection
    await self.sock_connect(sock, address)
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 464, in sock_connect
    return await fut
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 494, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
OSError: [Errno 113] Connect call failed ('192.168.1.202', 31415)

2020-07-04 21:20:12 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connecting
2020-07-04 21:20:15 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connect failed, try again in a minute

Eight hours later, still going.

2020-07-05 05:23:04 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connecting
2020-07-05 05:23:07 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connect failed, try again in a minute
Traceback (most recent call last):
  File "/srv/homeassistant/lib/python3.7/site-packages/aiosenseme/device.py", line 739, in _listener
    PORT,
  File "/usr/lib/python3.7/asyncio/base_events.py", line 959, in create_connection
    raise exceptions[0]
  File "/usr/lib/python3.7/asyncio/base_events.py", line 946, in create_connection
    await self.sock_connect(sock, address)
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 464, in sock_connect
    return await fut
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 494, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
OSError: [Errno 113] Connect call failed ('192.168.1.202', 31415)

mikelawrence commented 4 years ago

Can you explain what you mean by "this seems to have crashed my HA instance"? Also, how long did it take?

If you increment self._error_count where you indicated, the listener would eventually stop altogether and no longer work for that fan until Home Assistant is rebooted. I want it to keep trying every minute, forever if necessary. The user will see the fan as disabled while these errors are happening, and it will be re-enabled as soon as the listener reconnects.
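To make the trade-off above concrete, here is a minimal sketch of the two retry policies. It is not the aiosenseme code: `connect_with_retry`, `make_flaky_connect`, `retry_delay` and `max_errors` are hypothetical names chosen for illustration. With `max_errors=None` the loop keeps retrying indefinitely. With a limit it gives up after that many failures, which is roughly what counting these connection errors would lead to.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def connect_with_retry(
    connect: Callable[[], Awaitable[None]],
    retry_delay: float = 60.0,
    max_errors: Optional[int] = None,
) -> int:
    """Call connect() until it succeeds; return the number of failures seen.

    max_errors=None retries forever. With a limit, the last OSError is
    re-raised once that many failures have occurred.
    """
    errors = 0
    while True:
        try:
            await connect()
            return errors
        except OSError:
            errors += 1
            if max_errors is not None and errors >= max_errors:
                raise  # give up: the caller's listener would stop here
            await asyncio.sleep(retry_delay)


def make_flaky_connect(failures: int) -> Callable[[], Awaitable[None]]:
    """Return a fake connect() that fails `failures` times, then succeeds."""
    remaining = failures

    async def connect() -> None:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            raise OSError(113, "No route to host")

    return connect
```

For example, `asyncio.run(connect_with_retry(make_flaky_connect(3), retry_delay=0))` recovers after three failures, while adding `max_errors=2` raises the `OSError` on the second failure and never reconnects.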

These debug messages are only being printed because you have changed the default logging for the Senseme integration. This can result in a lot of logged messages. Are you seeing messages more frequently than once a minute?
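One way to check how often connection attempts occur is to measure the gap between consecutive "Connecting" lines in the log. The sketch below is only an illustration: `connect_intervals` is a hypothetical helper, it assumes every line starts with a 19-character `YYYY-MM-DD HH:MM:SS` timestamp, and the sample lines are copied from the log posted earlier in this issue.

```python
from datetime import datetime
from typing import List

# Sample lines copied from the log in the original report.
LOG = """\
2020-07-04 21:19:09 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connecting
2020-07-04 21:19:12 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connect failed, try again in a minute
2020-07-04 21:20:12 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connecting
2020-07-04 21:20:15 DEBUG (MainThread) [aiosenseme.device] Kitchen Light: Connect failed, try again in a minute
"""


def connect_intervals(log_text: str) -> List[float]:
    """Return the seconds between consecutive 'Connecting' log lines."""
    stamps = [
        datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        for line in log_text.splitlines()
        if line.endswith(": Connecting")
    ]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


print(connect_intervals(LOG))  # prints [63.0]
```

For the sample above the attempts are 63 seconds apart, which matches a retry roughly once a minute.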

swbova commented 4 years ago

Hi Mike

I meant that my instance on port 8123 became unreachable. I logged into my HA server and found that the process was no longer running, although the discovery loop in device.py was. After restarting HA manually and turning on the wall switch, everything was fine. I was able to reproduce this on July 5 by having the switch on, then turning it off. This again resulted in the HA process exiting abnormally.

In any case, since then I have updated to HA 0.112.3 and now I can no longer reproduce the error.

I'm sorry for the false alarm.

Looks like it was a bug in HA. The log has this error:

2020-07-04 21:19:08 ERROR (MainThread) [homeassistant.core] Error doing job: Fatal read error on socket transport
Traceback (most recent call last):
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 801, in _read_ready__data_received
    data = self._sock.recv(self.max_size)
OSError: [Errno 113] No route to host

There is an old open issue where this message has been reported in versions from 0.57.2 up to 0.111.4.

mikelawrence / senseme-hacs

infinite loop if light unreachable #11