JoDehli / PyLoxone

Python Loxone binding
Apache License 2.0
162 stars 40 forks source link

[Bug]: Lost connection to miniserver #221

Closed ztamas83 closed 5 months ago

ztamas83 commented 5 months ago

Describe the bug

Realized that I cannot control some entities, started to look into the logs and found some errors. Need to further triage this, just recording the logs and stuff so I can have a look into it.

Firmware of your Miniserver

14.2.6.16

HomeAssistant install method

HAOS on VM

Version of HomeAssistant

Core 2023.12.1 Supervisor 2023.11.6 Operating System 11.2 Frontend 20231208.2

Version of Pyloxone

0.4.8

Update pyloxone

Not yet, updating to 0.4.12 and will monitor

Log


2023-12-10 22:10:00.523 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1302, in close_connection
    await self.transfer_data_task
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 959, in transfer_data
    message = await self.read_message()
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1029, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1104, in read_data_frame
    frame = await self.read_frame(max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1161, in read_frame
    frame = await Frame.read(
            ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/framing.py", line 68, in read
    data = await reader(2)
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 732, in readexactly
    await self._wait_for_data('readexactly')
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 525, in _wait_for_data
    await self._waiter
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/config/custom_components/loxone/miniserver.py", line 189, in listen_loxone_send
    await self.api.send_websocket_command(device_uuid, value)
  File "/config/custom_components/loxone/api.py", line 386, in send_websocket_command
    await self._ws.send(command)
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 635, in send
    await self.ensure_open()
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 935, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: sent 1011 (unexpected error) keepalive ping timeout; no close frame received
JoDehli commented 5 months ago

@ztamas83 you need the newest pyloxone for ha 2023.12

ztamas83 commented 5 months ago

Thanks! Updated and monitoring now. I close this issue then and reopen should it still be a problem

ztamas83 commented 5 months ago

It looks like the problem is still there, somehow the connection is lost and never recovered, right now the only clue is the following log. I try to capture the situation when the connection actually brakes.

2023-12-13 06:03:16.920 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1302, in close_connection
    await self.transfer_data_task
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 959, in transfer_data
    message = await self.read_message()
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1029, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1104, in read_data_frame
    frame = await self.read_frame(max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1161, in read_frame
    frame = await Frame.read(
            ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/framing.py", line 68, in read
    data = await reader(2)
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 732, in readexactly
    await self._wait_for_data('readexactly')
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 525, in _wait_for_data
    await self._waiter
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/config/custom_components/loxone/miniserver.py", line 182, in listen_loxone_send
    await self.api.send_websocket_command(device_uuid, value)
  File "/config/custom_components/loxone/api.py", line 369, in send_websocket_command
    await self._ws.send(command)
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 635, in send
    await self.ensure_open()
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 935, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: sent 1011 (unexpected error) keepalive ping timeout; no close frame received
ztamas83 commented 5 months ago

PR #223 helps to recover from the error at least. Will continue looking for where the connection is lost and why it doesn't recover.

ztamas83 commented 5 months ago

Keepalive timeout. Interesting is that Tibber's websocket connection also breaks. Will investigate the virtual machine and the underlying ethernet connection. Does not seem to be a Loxone integration issue.

2023-12-13 22:55:13.673 DEBUG (MainThread) [custom_components.loxone.api.websocket] % sending keepalive ping
2023-12-13 22:55:13.673 DEBUG (MainThread) [custom_components.loxone.api.websocket] > PING 39 ff ce 63 [binary, 4 bytes]
2023-12-13 22:55:13.677 DEBUG (MainThread) [custom_components.loxone.api.websocket] < PONG 39 ff ce 63 [binary, 4 bytes]
2023-12-13 22:55:13.677 DEBUG (MainThread) [custom_components.loxone.api.websocket] % received keepalive pong
2023-12-13 22:55:13.690 DEBUG (MainThread) [custom_components.loxone.api.websocket] < BINARY 03 01 00 00 58 1b 00 00 [8 bytes]
2023-12-13 22:55:13.692 DEBUG (MainThread) [custom_components.loxone.api.websocket] < BINARY 1f 8b 08 00 00 00 00 00 00 13 cd 5d e9 6f e3 46 ... dd 93 65 39 c0 6c 00 00 [7000 bytes]
2023-12-13 22:55:33.680 DEBUG (MainThread) [custom_components.loxone.api.websocket] % sending keepalive ping
2023-12-13 22:55:33.681 DEBUG (MainThread) [custom_components.loxone.api.websocket] > PING ce e8 42 d0 [binary, 4 bytes]
2023-12-13 22:55:35.594 INFO (MainThread) [gql.transport.websockets] >>> {"type": "ping"}
2023-12-13 22:55:35.665 DEBUG (MainThread) [custom_components.loxone.api.websocket] > TEXT 'keepalive' [9 bytes]
2023-12-13 22:55:36.004 DEBUG (MainThread) [custom_components.loxone.api.websocket] > TEXT 'keepalive' [9 bytes]
2023-12-13 22:55:39.673 INFO (MainThread) [gql.transport.websockets] <<< {"type":"pong"}
2023-12-13 22:55:44.847 DEBUG (MainThread) [custom_components.loxone.api.websocket] > TEXT 'keepalive' [9 bytes]
2023-12-13 22:55:53.684 DEBUG (MainThread) [custom_components.loxone.api.websocket] ! timed out waiting for keepalive pong
2023-12-13 22:55:53.685 DEBUG (MainThread) [custom_components.loxone.api.websocket] ! failing connection with code 1011
2023-12-13 22:55:53.685 DEBUG (MainThread) [custom_components.loxone.api.websocket] = connection is CLOSING
2023-12-13 22:55:53.685 DEBUG (MainThread) [custom_components.loxone.api.websocket] > CLOSE 1011 (unexpected error) keepalive ping timeout [24 bytes]
2023-12-13 22:56:09.676 INFO (MainThread) [gql.transport.websockets] >>> {"type": "ping"}
2023-12-13 22:56:09.720 ERROR (MainThread) [tibber.home] Error in rt_subscribe
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 959, in transfer_data
    message = await self.read_message()
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1029, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1104, in read_data_frame
    frame = await self.read_frame(max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 1161, in read_frame
    frame = await Frame.read(
            ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/framing.py", line 68, in read
    data = await reader(2)
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 732, in readexactly
    await self._wait_for_data('readexactly')
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 525, in _wait_for_data
    await self._waiter
  File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 964, in _read_ready__get_buffer
    nbytes = self._sock.recv_into(buf)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/tibber/home.py", line 439, in _start
    async for data in self._tibber_control.realtime.sub_manager.session.subscribe(
  File "/usr/local/lib/python3.11/site-packages/gql/client.py", line 1064, in subscribe
    async for result in inner_generator:
  File "/usr/local/lib/python3.11/site-packages/gql/client.py", line 960, in _subscribe
    async for result in inner_generator:
  File "/usr/local/lib/python3.11/site-packages/gql/transport/websockets_base.py", line 388, in subscribe
    answer_type, execution_result = await listener.get()
                                    ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gql/transport/websockets_base.py", line 56, in get
    raise item
  File "/usr/local/lib/python3.11/site-packages/gql/transport/websockets_base.py", line 301, in _receive_data_loop
    answer = await self._receive()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tibber/websocker_transport.py", line 37, in _receive
    msg = await asyncio.wait_for(super()._receive(), timeout=self._timeout)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 489, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gql/transport/websockets_base.py", line 231, in _receive
    data: Data = await self.websocket.recv()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 568, in recv
    await self.ensure_open()
  File "/usr/local/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 935, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: no close frame received or sent