michaelarnauts / home-assistant-comfoconnect

Home Assistant integration for Zehnder ComfoAir trough a Zehnder ComfoConnect LAN C.
Other
26 stars 13 forks source link

Timeout while waiting for response from bridge #36

Closed roofburner closed 4 months ago

roofburner commented 6 months ago

Hi,

As advised by zollak I open a new issue.

I have automations, calling the service "fan.set_percentage" for entity_id "fan.comfoairq"

When home assistant start, all works well, but after "a while" the action/service seems to fail. If I delete the hub and readd it, all works well for a while until...

When the related automations are in mode single, I get warnings in the log the action is "Already running". I have switched the automation mode to restart. Now I get the below warnings. I can take a tcpdump, if needed.

Please advise...

2024-02-02 15:25:46.854 WARNING (MainThread) [aiocomfoconnect.bridge] Timeout while waiting for response from bridge
2024-02-02 15:25:46.854 WARNING (MainThread) [aiocomfoconnect.bridge] Timeout while waiting for response from bridge
2024-02-02 15:25:46.854 ERROR (MainThread) [homeassistant.helpers.entity] Update for select.comfoairq_balance_mode fails
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 500, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/bridge.py", line 168, in _send
    return await asyncio.wait_for(fut, TIMEOUT)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 502, in wait_for
    raise exceptions.TimeoutError() from exc
TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 898, in async_update_ha_state
    await self.async_device_update()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1216, in async_device_update
    await self.async_update()
  File "/config/custom_components/comfoconnect/select.py", line 199, in async_update
    self._attr_current_option = await self.entity_description.get_value_fn(
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/comfoconnect.py", line 244, in get_balance_mode
    result_06 = await self.cmd_rmi_request(bytes([0x83, UNIT_SCHEDULE, SUBUNIT_06, 0x01]))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/bridge.py", line 171, in _send
    raise AioComfoConnectTimeout from exc
aiocomfoconnect.exceptions.AioComfoConnectTimeout
2024-02-02 15:25:46.865 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 500, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/bridge.py", line 168, in _send
    return await asyncio.wait_for(fut, TIMEOUT)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 502, in wait_for
    raise exceptions.TimeoutError() from exc
TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/config/custom_components/comfoconnect/__init__.py", line 151, in send_keepalive
    await bridge.cmd_time_request()
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/bridge.py", line 171, in _send
    raise AioComfoConnectTimeout from exc
aiocomfoconnect.exceptions.AioComfoConnectTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/config/custom_components/comfoconnect/__init__.py", line 158, in send_keepalive
    await bridge.connect(entry.data[CONF_LOCAL_UUID])
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/comfoconnect.py", line 70, in connect
    await super().connect(uuid)
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/bridge.py", line 94, in connect
    await self.disconnect()
  File "/usr/local/lib/python3.11/site-packages/aiocomfoconnect/bridge.py", line 121, in disconnect
    await self._read_task
asyncio.exceptions.InvalidStateError: invalid state
roofburner commented 6 months ago

Forgot to mention:

FW hub: R1.11.0

zollak commented 6 months ago

hi, this log details are the same as I opened in the #33 issue. In this case it seems duplicated. You mentioned different error message in the comment there: https://github.com/michaelarnauts/home-assistant-comfoconnect/issues/33#issuecomment-1920947668

My solution was the following: -I updated it to the aiocomfoconnect==0.1.9 version (e.g. in the config/custom_components/comfoconnect/manifest.json) -reboot HA (not only restart) -reboot Zehnder ComfoAir Q350

The integration has been working without errors for about 6 days.

roofburner commented 6 months ago

Dear zollak,

This is very usefull info! I read about the aiocomfoconnet version, but did not know where to change it. I edited the manifest file, reooted the Comfoconnect LAN adapter and rebooted my HA server. Lets evaluate this... I will give feedback!

Thanks a lot.

michaelarnauts commented 6 months ago

I'm sorry for not replying more here. Unfortunately life is quite busy to find spare time that I'm willing to spend on this integration. I do notice the issue myself, but the occasional restart always solves it.

I also don't think this is fixed in a more recent version of the library. I think there is something wrong with my code that detects a disconnect. I have a feeling that there are some edge cases I didn't take into account.

roofburner commented 6 months ago

Dear Michael,

"Good news" you suffer the same issues as we do :-)

The solution zollak provide, did not work out for me.

Extra info:

When I manipulate the fan speed via the integration in home assistant, I see the fan speed change in my ComfoControl app. At that very moment I see the debug errors appearing in the home-assistant.log.

But: I don't see any messages anymore containing "[custom_components.comfoconnect.sensor] Handle update for sensor".

I there is anything I can do to help (test/tcpdump/...)

roofburner commented 6 months ago

I don't know who of you uses the same setup as I do? I use multiple Xiaomi Aqara sensors (WSDCGQ11LM) in combination with multiple automations to set the fan speed of my ComfoAirQ via the service fan.set_percentage. In the zigbee2mqtt log I saw that the MQTT bridge often publishes 3 messages for an aqara sensor at the same time. E.g.:

info  2024-02-05 08:25:19: MQTT publish: topic 'zigbee2mqtt/aqara_livingroom', payload '{"battery":100,"humidity":59.51,"linkquality":199,"power_outage_count":21527,"pressure":1018.7,"temperature":19.62,"voltage":3205}'
info  2024-02-05 08:25:19: MQTT publish: topic 'zigbee2mqtt/aqara_livingroom', payload '{"battery":100,"humidity":59.13,"linkquality":199,"power_outage_count":21527,"pressure":1018.7,"temperature":19.62,"voltage":3205}'
info  2024-02-05 08:25:19: MQTT publish: topic 'zigbee2mqtt/aqara_livingroom', payload '{"battery":100,"humidity":59.13,"linkquality":199,"power_outage_count":21527,"pressure":1019,"temperature":19.62,"voltage":3205}'

I suspect that this behaviour causes the "Already running" problem in Home Assistant, as the associated automation gets triggered 3 times.

This weekend, I did some tests. I ran the Comfoconnect integration in read-only mode. I.e. I disabled all automations calling fan.set_percentage. It seems the connection stays active. Based on that I made a couple of changes, still evaluating them. I see the changes as a best/better practice :-):

configuration.yaml:

input_select:
  comfoair_fan_speed:
    name: Comfoair fan speed
    options:
      -  0.00
      - 33.33
      - 66.66
      - 100.0
    initial: 33.33
    icon: mdi:fan

automation.yaml:

In every fan automation I use the service input_select.select_option:

- alias: 'xxx '
  initial_state: 'on'
  trigger:
  ...
  condition:
  ...
  action:
    - service: input_select.select_option
      target:
        entity_id: input_select.comfoair_fan_speed
      data:
        option: 0.00 or 33.33 or 66.66 or 100.0
  mode: queued

One automation manipulates the fan speed:

- alias: 'Fan - set speed'
  initial_state: 'on'
  trigger:
    - platform: state
      entity_id: input_select.comfoair_fan_speed
      for:
        seconds: 5
  action:
    - service: fan.set_percentage
      target:
        entity_id: fan.comfoairq
      data:
        percentage: '{{ states.input_select.comfoair_fan_speed.state }}'
    - delay: 30
  mode: restart
freemann commented 5 months ago

Any news here? I'm also have issues with the integration crashing and/or not sending commands to the ComfoConnect. Forked this repo and added the pull request to my own repo, but this isn't solving it. Now trying the "debounce" option.

michaelarnauts commented 5 months ago

No news. I got an issue a few days ago where the connection with the bridge was lost and it didn't reconnect as it should.

I've copied the stack trace, but haven't looked at it yet.

A restart of Home Assistant always solves it. I know that's a lousy solution, but it doesn't happen a lot here, so untill I find some time to fix this, it will do.

I think there is some exception happening that we don't catch when we poll for a keepalive. I think unplugging the network cable of the bridge, waiting a while (so it misses a keepalive) and reconnecting the cable could help to reproduce the issue. You might want to look into that if you are interested :)

michaelarnauts commented 5 months ago

Note to self, and maybe others. This was my stack trace.

Logger: homeassistant Bron: custom_components/comfoconnect/init.py:158 integratie: Zehnder ComfoAir Q First occurred: 11 maart 2024 om 18:49:14 (3316 gebeurtenissen) Laatst gelogd: 22:26:47

Error doing job: Task exception was never retrieved Traceback (most recent call last): File "/config/custom_components/comfoconnect/init.py", line 151, in send_keepalive await bridge.cmd_time_request() File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/bridge.py", line 138, in _send raise AioComfoConnectNotConnected() aiocomfoconnect.exceptions.AioComfoConnectNotConnected

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/config/custom_components/comfoconnect/init.py", line 158, in send_keepalive await bridge.connect(entry.data[CONF_LOCAL_UUID]) File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/comfoconnect.py", line 70, in connect await super().connect(uuid) File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/bridge.py", line 94, in connect await self.disconnect() File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/bridge.py", line 121, in disconnect await self._read_task ConnectionResetError: [Errno 104] Connection reset by peer

roofburner commented 5 months ago

My news: I have not experienced any issues since I made the changes mentioned in an earlier post... This maybe did not fix an existing problem, but only creates a work around...

freemann commented 5 months ago

Thanks for the comment! I've implemented the debounce and hopefully it survives the night.... until now its crashing multiple times a day.

freemann commented 5 months ago

The integration stopped several times this day, even with the debounce in place. My usecase; All my rooms have an Co2 sensor and I use them to control my Q350. I see that my automation set the Q350 to 100% around 07:15 because the Co2 value's are to high. Around 08:20-08:25 the value's were low enough to set the Q350 to 66% or even 33% but this didn't happen.

In the log's I see;

This error originated from a custom integration.

Logger: aiocomfoconnect.bridge
Source: custom_components/comfoconnect/fan.py:157
integration: Zehnder ComfoAir Q
First occurred: 08:14:31 (3 occurrences)
Last logged: 08:22:12

Timeout while waiting for response from bridge

And

Logger: homeassistant.components.automation.fan_set_speed
Source: helpers/script.py:485
integration: Automation (documentation, issues)
First occurred: 08:14:31 (3 occurrences)
Last logged: 08:22:12

Fan - set WTW speed: Error executing script. Unexpected error for call_service at pos 1:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/bridge.py", line 168, in _send
    return await asyncio.wait_for(fut, TIMEOUT)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 519, in wait_for
    async with timeouts.timeout(timeout):
  File "/usr/local/lib/python3.12/asyncio/timeouts.py", line 115, in __aexit__
    raise TimeoutError from exc_val
TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 485, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 723, in _async_call_service_step
    response_data = await self._async_run_long_action(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 685, in _async_run_long_action
    return long_task.result()
           ^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2319, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2356, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 905, in entity_service_call
    single_response = await _handle_entity_call(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 975, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/config/custom_components/comfoconnect/fan.py", line 157, in async_set_percentage
    await self._ccb.set_speed(speed)
  File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/comfoconnect.py", line 212, in set_speed
    await self.cmd_rmi_request(bytes([0x84, UNIT_SCHEDULE, SUBUNIT_01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x02]))
  File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/bridge.py", line 171, in _send
    raise AioComfoConnectTimeout from exc
aiocomfoconnect.exceptions.AioComfoConnectTimeout

and

Logger: homeassistant.components.automation.fan_set_speed
Source: components/automation/__init__.py:666
integration: Automation (documentation, issues)
First occurred: 08:14:31 (3 occurrences)
Last logged: 08:22:12

While executing automation automation.fan_set_speed
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/bridge.py", line 168, in _send
    return await asyncio.wait_for(fut, TIMEOUT)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 519, in wait_for
    async with timeouts.timeout(timeout):
  File "/usr/local/lib/python3.12/asyncio/timeouts.py", line 115, in __aexit__
    raise TimeoutError from exc_val
TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/automation/__init__.py", line 666, in async_trigger
    return await self.action_script.async_run(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 1600, in async_run
    return await asyncio.shield(run.async_run())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 435, in async_run
    await self._async_step(log_exceptions=False)
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 487, in _async_step
    self._handle_exception(
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 512, in _handle_exception
    raise exception
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 485, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 723, in _async_call_service_step
    response_data = await self._async_run_long_action(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 685, in _async_run_long_action
    return long_task.result()
           ^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2319, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2356, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 905, in entity_service_call
    single_response = await _handle_entity_call(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 975, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/config/custom_components/comfoconnect/fan.py", line 157, in async_set_percentage
    await self._ccb.set_speed(speed)
  File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/comfoconnect.py", line 212, in set_speed
    await self.cmd_rmi_request(bytes([0x84, UNIT_SCHEDULE, SUBUNIT_01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x02]))
  File "/usr/local/lib/python3.12/site-packages/aiocomfoconnect/bridge.py", line 171, in _send
    raise AioComfoConnectTimeout from exc
aiocomfoconnect.exceptions.AioComfoConnectTimeout

After the timeout the whole log is scattered with error's. I now had an automation to check if the outside temp. of exhaust flow is not change voor 30min. If so, I toggle the wallplug of the Q350 and reboot HASS. It's a nasty solution, but I need this to keep my home properly ventilated :)

Hopefully it can be fixed soon and if I can help with a debug/beta version; please let me know!

freemann commented 5 months ago

Last night HA restarted 8 times in around 10 hours. This morning I disabled a lot of unnecessary sensors:

My reboot automation didn't triggered for around 3,5 hours and the integration sensors are still updating.

[edit] If this is solving my issue, maybe the integration is overloading the comfoconnect box? Perhaps slowing things down could prevent the timeouts?

[edit_2] Around 12:30 HA rebooted and the previous reboot was at 07:30, so around 5 hours stable.

freemann commented 5 months ago

Update; I'm running with an minimum enabled sensors and for now the integration is "rock solid". The past 2 day's a enabled 1 sensor a day, lets see what going to happen the upcoming days...

With all sensors enabled it crashed sometimes >10 times a day...

litinoveweedle commented 4 months ago

I know it is very simple/stupid idea, but as it seems, that errors are related to the timeout exception, and it seems that decreasing number of polled sensors helped, would it be possible to increase timeout in the library?

freemann commented 4 months ago

``This evening I got the first reboot in days, after enabling several sensors for the past days. For now I stop enabling extra sensors and will monitor this for the upcoming days. If it keeps triggering my automation, I will disable one sensor to see if it's get stable again.

[edit 24-03-24 08:53] No reboots over night, so it's "stable" for now. Just enable another sensor.

[edit 24-03-24 09:54] My automation triggered, which means 4 sensors didn't update for more then 30 minutes. Looks like the integration is getting more troubles to update the sensors.

This is my trigger for detecting of the integration is updating my sensors. I use 4 sensors because I noticed that sometimes one or two sensors are not updating for XX minutes and for some reason the come back to life

{{
(
(now() - states.sensor.comfoairq_exhaust_airflow_2.last_changed >= timedelta(minutes=30)) 
and 
(now() - states.sensor.comfoairq_exhaust_temperature_2.last_changed >= timedelta(minutes=30))
and 
(now() - states.sensor.comfoairq_supply_airflow_2.last_changed >= timedelta(minutes=30)) 
and 
(now() - states.sensor.comfoairq_outside_temperature_2.last_changed >= timedelta(minutes=30))
)
}}

[edit 24-03-24 13:52] Another reboot...

[edit 24-03-24 19:57] Another reboot...

[edit 24-03-24 23:33] Another reboot...

freemann commented 4 months ago

Ok, that are "a lot" of reboots... Going to disable one sensor and that will be the first sensor that I enabled on 20-03-2024. Hopefully that will reduce the number of reboots and by disable the first sensor I enabled, will(hopefully) tell its the sensor or the sensor count that is giving troubles.

My excel sheet: image

[edit 25-03-24 14:48] Another reboot...

[edit 25-03-24 20:23] Another reboot...

[edit 25-03-24 21:46] Another reboot...

[edit 26-03-24 00:17] Another reboot...

[edit 26-03-24 01:24] Another reboot...

[edit 26-03-24 06:01] Another reboot...

[edit 26-03-24 08:40] Another reboot...

[edit 26-03-24 20:58] Another reboot...

Disabled "current RMOT sensor".

[edit 27-03-24 01:57] Another reboot...

[edit 27-03-24 06:42] Another reboot...

[edit 27-03-24 10:30] Another reboot...

Disabled "comfoairq_bypass_state_2"

michaelarnauts commented 4 months ago

Closing since I want to migrate all these issues in one issue. See https://github.com/michaelarnauts/home-assistant-comfoconnect/issues/43 for a possible solution and follow-up.