home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
72.67k stars 30.42k forks source link

radiotherm still losing connectivity but is it just how HA processes entity updates? #25701

Closed danbo closed 3 years ago

danbo commented 5 years ago

Home Assistant release with the issue: Home Assistant 0.96.5 Frontend version: 20190721.1 - latest

Last working Home Assistant release (if known): unknown

Operating environment (Hass.io/Docker/Windows/etc.): Docker version 19.03.1

Component/platform: https://www.home-assistant.io/components/radiotherm/

Description of problem:

Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):

climate:
  platform: radiotherm
  scan_interval: 120
  host:
    - a.b.c.d
  hold_temp: true

Traceback (if applicable):

2019-07-31 23:15:11 INFO (MainThread) [homeassistant.core] Starting Home Assistant
...
2019-08-01 06:00:00 WARNING (MainThread) [homeassistant.helpers.entity] Update of climate.thermostat01 is taking over 10 seconds,
2019-08-01 06:00:08 ERROR (MainThread) [homeassistant.helpers.entity] Update for climate.thermostat01 fails,
  File "/usr/local/lib/python3.7/urllib/request.py", line 1317, in do_open,
    encode_chunked=req.has_header('Transfer-encoding')),
  File "/usr/local/lib/python3.7/http/client.py", line 1244, in request,
  File "/usr/local/lib/python3.7/http/client.py", line 1290, in _send_request,
    self._send_request(method, url, body, headers, encode_chunked),
    self.endheaders(body, encode_chunked=encode_chunked),
  File "/usr/local/lib/python3.7/http/client.py", line 1239, in endheaders,
    self._send_output(message_body, encode_chunked=encode_chunked),
  File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output,
    self.send(msg),
  File "/usr/local/lib/python3.7/http/client.py", line 966, in send,
    self.connect(),
  File "/usr/local/lib/python3.7/http/client.py", line 938, in connect,
    (self.host,self.port), self.timeout, self.source_address),
  File "/usr/local/lib/python3.7/socket.py", line 727, in create_connection,
    raise err,
  File "/usr/local/lib/python3.7/socket.py", line 716, in create_connection,
    sock.connect(sa),
OSError: [Errno 113] No route to host,
During handling of the above exception, another exception occurred:,
,
Traceback (most recent call last):,
  File "/usr/src/app/homeassistant/helpers/entity.py", line 221, in async_update_ha_state,
    await self.async_device_update(),
  File "/usr/src/app/homeassistant/helpers/entity.py", line 380, in async_device_update,
    await self.hass.async_add_executor_job(self.update),
  File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run,
    result = self.fn(*self.args, **self.kwargs),
    data = self.device.tstat['raw'],
  File "/usr/src/app/homeassistant/components/radiotherm/climate.py", line 217, in update,
  File "/usr/local/lib/python3.7/site-packages/radiotherm/fields.py", line 45, in __get__,
    response = instance.get(self.url),
  File "/usr/local/lib/python3.7/site-packages/radiotherm/thermostat.py", line 35, in get,
    return request.urlopen(url),
  File "/usr/local/lib/python3.7/urllib/request.py", line 222, in urlopen,
    return opener.open(url, data, timeout),
  File "/usr/local/lib/python3.7/urllib/request.py", line 525, in open,
    response = self._open(req, data),
  File "/usr/local/lib/python3.7/urllib/request.py", line 543, in _open,
    '_open', req),
  File "/usr/local/lib/python3.7/urllib/request.py", line 503, in _call_chain,
    result = func(*args),
  File "/usr/local/lib/python3.7/urllib/request.py", line 1345, in http_open,
    return self.do_open(http.client.HTTPConnection, req),
  File "/usr/local/lib/python3.7/urllib/request.py", line 1319, in do_open,
urllib.error.URLError: <urlopen error [Errno 113] No route to host>,
    raise URLError(err)
...

2019-08-01 22:00:44 WARNING (MainThread) [homeassistant.helpers.entity] Update of climate.thermostat01 is taking over 10 seconds,
2019-08-01 22:02:35 WARNING (MainThread) [homeassistant.components.climate] Updating radiotherm climate took longer than the scheduled update interval 0:02:00,
2019-08-01 22:04:36 WARNING (MainThread) [homeassistant.components.climate] Updating radiotherm climate took longer than the scheduled update interval 0:02:00
...

2019-08-02 00:09:28 WARNING (MainThread) [homeassistant.components.climate] Updating radiotherm climate took longer than the scheduled update interval 0:02:00,
2019-08-02 00:11:29 WARNING (MainThread) [homeassistant.components.climate] Updating radiotherm climate took longer than the scheduled update interval 0:02:00
...

2019-08-02 00:16:38 INFO (MainThread) [homeassistant.core] Starting Home Assistant

...

2019-08-02 02:01:18 ERROR (MainThread) [homeassistant.helpers.entity] Update for climate.thermostat01 fails,
Traceback (most recent call last):,
  File "/usr/local/lib/python3.7/urllib/request.py", line 1317, in do_open,
    encode_chunked=req.has_header('Transfer-encoding')),
  File "/usr/local/lib/python3.7/http/client.py", line 1244, in request,
    self._send_request(method, url, body, headers, encode_chunked),
  File "/usr/local/lib/python3.7/http/client.py", line 1290, in _send_request,
    self.endheaders(body, encode_chunked=encode_chunked),
  File "/usr/local/lib/python3.7/http/client.py", line 1239, in endheaders,
    self._send_output(message_body, encode_chunked=encode_chunked),
  File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output,
    self.send(msg),
    self.connect(),
  File "/usr/local/lib/python3.7/http/client.py", line 938, in connect,
    (self.host,self.port), self.timeout, self.source_address),
  File "/usr/local/lib/python3.7/socket.py", line 727, in create_connection,
OSError: [Errno 113] No route to host,
    raise err,
  File "/usr/local/lib/python3.7/socket.py", line 716, in create_connection,
    sock.connect(sa),
,
During handling of the above exception, another exception occurred:,
  File "/usr/src/app/homeassistant/helpers/entity.py", line 221, in async_update_ha_state,
    await self.async_device_update(),
  File "/usr/src/app/homeassistant/helpers/entity.py", line 380, in async_device_update,
    await self.hass.async_add_executor_job(self.update),
  File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run,
    result = self.fn(*self.args, **self.kwargs),
  File "/usr/src/app/homeassistant/components/radiotherm/climate.py", line 217, in update,
    data = self.device.tstat['raw'],
  File "/usr/local/lib/python3.7/site-packages/radiotherm/fields.py", line 45, in __get__,
    response = instance.get(self.url),
  File "/usr/local/lib/python3.7/site-packages/radiotherm/thermostat.py", line 35, in get,
    return request.urlopen(url),
  File "/usr/local/lib/python3.7/urllib/request.py", line 222, in urlopen,
    return opener.open(url, data, timeout),
  File "/usr/local/lib/python3.7/urllib/request.py", line 525, in open,
    response = self._open(req, data),
  File "/usr/local/lib/python3.7/urllib/request.py", line 543, in _open,
    '_open', req),
  File "/usr/local/lib/python3.7/urllib/request.py", line 503, in _call_chain,
    result = func(*args),
  File "/usr/local/lib/python3.7/urllib/request.py", line 1345, in http_open,
    return self.do_open(http.client.HTTPConnection, req),
  File "/usr/local/lib/python3.7/urllib/request.py", line 1319, in do_open,
    raise URLError(err),
urllib.error.URLError: <urlopen error [Errno 113] No route to host>
...
2019-08-02 14:02:04 WARNING (MainThread) [homeassistant.helpers.entity] Update of climate.thermostat01 is taking over 10 seconds,
2019-08-02 18:01:47 WARNING (MainThread) [homeassistant.helpers.entity] Update of climate.thermostat01 is taking over 10 seconds,
2019-08-02 18:03:38 WARNING (MainThread) [homeassistant.components.climate] Updating radiotherm climate took longer than the scheduled update interval 0:02:00,
2019-08-02 18:05:39 WARNING (MainThread) [homeassistant.components.climate] Updating radiotherm climate took longer than the scheduled update interval 0:02:00
...

2019-08-04 22:47:14 WARNING (MainThread) [homeassistant.components.climate] Updating radiotherm climate took longer than the scheduled update interval 0:02:00
...

2019-08-04 22:54:52 INFO (MainThread) [homeassistant.core] Starting Home Assistant

Additional information:

For background, there were previous connectivity concerns: https://github.com/home-assistant/home-assistant/issues/8404

Also, I'm no python expert, but I was looking at the following code:

It seems that the async with mechnism will release the lock but what happens if it never comes back? Does it make sense to add timeout support where platforms can have an abort API method where they can handle cancellation for things like HTTP requests they may be making?

brendanm720 commented 4 years ago

I am having similar issues -- Should I open a new Issue, or add to this one?

ragd011 commented 4 years ago

I have the same issue. I'm on 0.101.3 running virtual environment on a RPI. I have a CT50 thermostat and it responds instantly to the /tsat/ url.

cytech commented 4 years ago

Same issue here. 0.102.3 in venv on RPI. CT80 thermostat. Once the log shows that an update is taking more than 10 seconds, all subsequent updates appear to fail until HA is restarted. Thermostat is reachable directly via http://host/tstat and responds with a json payload as expected.

kirbyschapman commented 4 years ago

I just updated to hass.io 0.104.3, and continue to have a similar issue. I get pages of Update of climate.thermostat_xxxxxxxx is taking over 10 seconds and then, at some in time, the front end reported temp for this CT50 Radio Thermostat is no longer reported correctly - it seems to freeze. As with @cytech and @ragd011, the thermostat responds fine when I access it through the radiotherm internet interface. Once I restart hass.io, it responds to temp changes and commands, as least for a while. Thoughts??

rjm12rjm12 commented 4 years ago

Since updating to 0.105 I cannot communicate with my CT50 Radio Thermostat anymore. At lease I don't see any information in the log and my entity is invalid. I will debug some more tonight.

kirbyschapman commented 4 years ago

Since updating to 0.105 I cannot communicate with my CT50 Radio Thermostat anymore. At lease I don't see any information in the log and my entity is invalid. I will debug some more tonight.

I was just getting ready to update to 0.105 - but now think I'll wait...

rjm12rjm12 commented 4 years ago

Not sure if there was a fix or not, but I just updated to 0.105.2 and my thermostats are back working.

DeviousVon commented 4 years ago

I'm on 0.105.4 and have the same issue. In addition to the 10 second errors, I'm also noticing unreachable errors.

File "/usr/local/lib/python3.7/urllib/request.py", line 1321, in do_open raise URLError(err) urllib.error.URLError: <urlopen error [Errno 113] Host is unreachable>

The thermostat is reachable via its app and web interface.

joshnorth commented 4 years ago

Attached log. Similar to previous comments except I'm on HA 0.106.2 and Python 3.8.

In troubleshooting, I find that the unit really is dropping link for a second. That part is ok, but it is not being handled gracefully.

ct50HAlog.txt

vinnyfuria commented 4 years ago

I ran some connection statistics against my CT50.

Count:    17121
Min:      2.238989
Max:      61.574224
Mean:     2.549727
dSquared: 27086.792889
Variance: 1.582080
Stdev:    1.257808

Note that 61s max is an outlier, but about 1 of every 20 hits against the CT50 took 7-9 seconds. With two queries in the update and no timeout, it's easy to see why the thermostat update fails to complete within 10 seconds sometimes. What's still unclear to me is why home-assistant handles this so poorly. However, we can fix this either in the home-assistant component by adding a timeout to calls to the radiotherm library. Even better would be to add a timeout exception that can be thrown by radiotherm library and caught within the update thread.

stale[bot] commented 4 years ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue now has been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

vinnyfuria commented 4 years ago

I confirmed that this is still an issue as of 0.114.3. Recent changes though, may have made this a more tractable problem. I suspect using try...catch around around the radiotherm library calls will fix the problem.

brendanm720 commented 4 years ago

I am also still having to restart Home Assistant periodically to fix this issue as of 0.114.3 (indeed this morning), however it has gotten less frequent over time. I agree with vinnyfuria that perhaps putting some error catching in would be beneficial.

vinnyfuria commented 4 years ago

I think I've figured out a fix for this. It requires this upstream change to the radiotherm library. Once that is merged, wrapping all this component's network calls in try...catch statements appears to fix the problem. With these fixes in place, I've yet to see to the component stall and I do see logging indicating that it would have.

I'll submit a PR to HASS once the upstream library is patched. 🤞

kirbyschapman commented 4 years ago

Thanks, @vinnyfuria!

I think I've figured out a fix for this. It requires this upstream change to the radiotherm library. Once that is merged, wrapping all this component's network calls in try...catch statements appears to fix the problem. With these fixes in place, I've yet to see to the component stall and I do see logging indicating that it would have.

I'll submit a PR to HASS once the upstream library is patched. 🤞

MickPBduece commented 4 years ago

I upgraded to 116.4 due to ONVIF issues at 116.3 and as usual the RadioTherm did not correct. I always end up needing to restart. Today however after several restarts - no climate.my_thermostat after several restarts

Logger: homeassistant.components.radiotherm.climate Source: components/radiotherm/climate.py:120 Integration: radiotherm (documentation, issues) First occurred: 3:51:43 PM (1 occurrences) Last logged: 3:51:43 PM

Unable to connect to Radio Thermostat: 192.168.1.28 Traceback (most recent call last): File "/usr/local/lib/python3.8/urllib/request.py", line 1350, in do_open h.request(req.get_method(), req.selector, req.data, headers, File "/usr/local/lib/python3.8/http/client.py", line 1255, in request self._send_request(method, url, body, headers, encode_chunked) File "/usr/local/lib/python3.8/http/client.py", line 1301, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "/usr/local/lib/python3.8/http/client.py", line 1250, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/usr/local/lib/python3.8/http/client.py", line 1010, in _send_output self.send(msg) File "/usr/local/lib/python3.8/http/client.py", line 950, in send self.connect() File "/usr/local/lib/python3.8/http/client.py", line 921, in connect self.sock = self._create_connection( File "/usr/local/lib/python3.8/socket.py", line 808, in create_connection raise err File "/usr/local/lib/python3.8/socket.py", line 796, in create_connection sock.connect(sa) OSError: [Errno 113] Host is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/components/radiotherm/climate.py", line 120, in setup_platform tstat = radiotherm.get_thermostat(host) File "/usr/local/lib/python3.8/site-packages/radiotherm/init.py", line 80, in get_thermostat model = initial.model.get('raw') File "/usr/local/lib/python3.8/site-packages/radiotherm/fields.py", line 45, in get response = instance.get(self.url) File "/usr/local/lib/python3.8/site-packages/radiotherm/thermostat.py", line 35, in get return request.urlopen(url) File "/usr/local/lib/python3.8/urllib/request.py", line 222, in urlopen return opener.open(url, data, timeout) File "/usr/local/lib/python3.8/urllib/request.py", line 525, in open response = self._open(req, data) File "/usr/local/lib/python3.8/urllib/request.py", line 542, in _open result = self._call_chain(self.handle_open, protocol, protocol + File "/usr/local/lib/python3.8/urllib/request.py", line 502, in _call_chain result = func(*args) File "/usr/local/lib/python3.8/urllib/request.py", line 1379, in http_open return self.do_open(http.client.HTTPConnection, req) File "/usr/local/lib/python3.8/urllib/request.py", line 1353, in do_open raise URLError(err) urllib.error.URLError: <urlopen error [Errno 113] Host is unreachable>

arch armv7l
chassis  
dev false
docker true
docker_version 19.03.13
hassio true
host_os Raspbian GNU/Linux 10 (buster)
installation_type Home Assistant Supervised
os_name Linux
os_version 5.4.51-v7l+
python_version 3.8.5
supervisor 247
timezone America/Chicago
version 0.116.4
virtualenv false
vinnyfuria commented 3 years ago

It's taken a while, but #45482 is now submitted to home-assistant core. I've been running a version of this code locally for about 6 months and have not seen the component stall in that time, so I have confidence that the mentioned PR will resolve this ticket.

brendanm720 commented 3 years ago

Thank you, Vinnie.

On Sat, Jan 23, 2021, 8:40 PM Vinny Furia notifications@github.com wrote:

It's taken a while, but #45482 https://github.com/home-assistant/core/pull/45482 is now submitted to home-assistant core. I've been running a version of this code locally for about 6 months and have not seen the component stall in that time, so I have confidence that the mentioned PR will resolve this ticket.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/home-assistant/core/issues/25701#issuecomment-766273155, or unsubscribe https://github.com/notifications/unsubscribe-auth/ANTUHCOYDXTS7GUO4QJUG4TS3N275ANCNFSM4IJHEMBQ .