home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Radio Thermostat Constantly Swings Between Working and Unavailable #76270

Open brendanm720 opened 2 years ago

brendanm720 commented 2 years ago

The problem

The climate entity (and the hold switch entity) both constantly flip back and forth between working and unavailable. It seems to stay in either status for fairly random intervals, but usually it's between 2-15 minutes.

The thermostats are, unfortunately, old and single threaded and don't always respond within Home Assistant's timeout.

Previous behavior before I upgraded was that Home Assistant would keep trying the thermostat, and would keep the entity status unless the thermostat was well and truly down.

What version of Home Assistant Core has the issue?

core-2022.8.0

What was the last working version of Home Assistant Core?

core-2022.6.7

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Radio Thermostat

Link to integration documentation on our website

https://www.home-assistant.io/integrations/radiotherm

Diagnostics information

config_entry-unifi-cd79fcd59fe9472a93f5838750383581.json.txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

Logger: homeassistant.components.radiotherm.coordinator
Source: helpers/update_coordinator.py:151
Integration: Radio Thermostat (documentation, issues)
First occurred: 6:32:48 PM (40 occurrences)
Last logged: 8:51:12 PM

Error fetching radiotherm Thermostat data: Thermostat (192.168.0.240) was busy (invalid value returned):
Error fetching radiotherm Thermostat data: Thermostat (192.168.0.240) connection error: timed out

Additional information

These log entries were also present in core-2022.6.7, but there the entities were not marked unavailable as soon as a timeout happened, only after some number of timeouts (unknown to me). (Sometimes things would hang if you wanted to update the temperature or change the mode.)

probot-home-assistant[bot] commented 2 years ago

radiotherm documentation radiotherm source (message by IssueLinks)

probot-home-assistant[bot] commented 2 years ago

Hey there @bdraco, @vinnyfuria, mind taking a look at this issue as it has been labeled with an integration (radiotherm) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

brendanm720 commented 2 years ago

I forgot to mention that I upgraded to 2022.7.7 and the issue was also present there.

I restored a backup (ESPHome was acting up too and I needed to fix that) to 2022.6.7 and radiotherm worked as expected.

I ran that way for about a week with no issue, and upgraded to 2022.8.0 today.

dieselrabbit commented 2 years ago

I am seeing the exact same issue (even the log errors), oddly though with only one of my two CT50 thermostats. Currently on 2022.8.2.

bdraco commented 2 years ago

These devices stop responding to commands and polling for more than a minute at a time if the wifi connection is poor.

I had one that did that but after locking it to a specific access point it started behaving

dieselrabbit commented 2 years ago

I'll try locking it to an access point and see if that helps. The thermostat I'm having a problem with does seem to prefer an AP slightly farther (3ft?) away.

That said, the pre-config-flow version of the integration seemed to just log, but largely ignore these connection issues. While that's not ideal, would it be valid to ignore N number of RadiothermTstatError before UpdateFailed is raised?
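
The "ignore N errors" idea could be sketched roughly as below. This is only an illustration in plain Python, not the integration's code: `FailureTolerance`, `max_failures`, and the threshold of 3 are hypothetical names and values, and the boolean passed to `record` stands in for whether a poll raised RadiothermTstatError.

```python
from dataclasses import dataclass


@dataclass
class FailureTolerance:
    """Hypothetical helper: report unavailable only after N consecutive failed polls."""

    max_failures: int = 3  # assumed threshold, not a value taken from the integration
    consecutive_failures: int = 0

    def record(self, success: bool) -> bool:
        """Record one poll result; return True while the entity should stay available."""
        if success:
            # Any good poll resets the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures < self.max_failures


tolerance = FailureTolerance(max_failures=3)
# Two failed polls, one good poll, then three failed polls in a row.
results = [tolerance.record(ok) for ok in (False, False, True, False, False, False)]
print(results)
```

With these inputs only the last poll, the third failure in a row, flips the result to unavailable. The trade-off raised in the replies still applies: while failures are being ignored, the entity looks available even though it may not accept commands.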

bdraco commented 2 years ago

I can't actually control the thermostat when it is in this state, so I think the availability reporting is accurate. I'm not so keen on changing it, as I think that's going to generate issues about why it can't be controlled.

bdraco commented 2 years ago

If it's something you feel strongly about changing, I'm also happy to review a PR if someone wants to submit one and sign up to be a code owner.

jaymemaurice commented 2 years ago

I like to bang at the CT80's PMA and UMA areas with Node-RED. I had to make it send messages serially and queue them. It worked okay with the old Home Assistant, but the new Home Assistant seems to poll so frequently that the thermostat hangs, reports or sets -1 °C, and causes all sorts of issues. Can we get a parameter to how frequently the polling happens?? Also, can we get the attributes like humidity and temperature checked for sane values?

I have resorted to logging from template sensors :(

    sensors:
      radiotherm:
        device_class: temperature
        unit_of_measurement: '°C'
        friendly_name: 'Indoor Temperature'
        value_template: >-
          {%- if states.climate.thermostat.attributes.current_temperature|float >= 5 and states.climate.thermostat.attributes.current_temperature|float <= 45 -%}
          {{ states.climate.thermostat.attributes.current_temperature| round(1) }}
          {%- else -%}
          nan
          {%- endif -%}
      radiotherm_set:
        device_class: temperature
        unit_of_measurement: '°C'
        friendly_name: 'Set Temperature'
        value_template: >-
          {%- if states.climate.thermostat.attributes.temperature|float >= 5 and states.climate.thermostat.attributes.temperature|float <= 45 -%}
          {{ states.climate.thermostat.attributes.temperature | round(1) }}
          {%- else -%}
          nan
          {%- endif -%}
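
For readers who would rather express the same range check outside a template, here is a rough Python equivalent of the filter above. It is a sketch, not part of the integration: the function name `sane_temperature` is hypothetical, and the 5 to 45 °C bounds are simply copied from the template.

```python
import math


def sane_temperature(raw, low=5.0, high=45.0):
    """Return the reading rounded to one decimal if it lies in [low, high] °C, else NaN."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        # Non-numeric states such as "unavailable" or None are treated as invalid.
        return math.nan
    if low <= value <= high:
        return round(value, 1)
    # Out-of-range readings (for example the -1 value mentioned above) are rejected.
    return math.nan


print(sane_temperature(21.5))
print(sane_temperature(-1))
```

Returning NaN mirrors the template's `nan` branch, so downstream consumers can tell a rejected reading from a real one.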

bdraco commented 2 years ago

Can we get a parameter to how frequently the polling happens??

You can already do this. Disable polling in the UI and then setup an automation to poll it as often as you like.

jaymemaurice commented 2 years ago

Can we get a parameter to how frequently the polling happens??

You can already do this. Disable polling in the UI and then setup an automation to poll it as often as you like.

I had discovered the disable polling but didn't find the method to kick off the polling manually. There didn't seem to be a service / method or anything in the documentation. I'll stop being lazy and read some code but I somewhat expected a thermostat.poll next to thermostat.hvac_mode...

bdraco commented 2 years ago

This one is for shades but should be easy to adapt

https://www.home-assistant.io/integrations/hunterdouglas_powerview/#force-update-shade-position

brendanm720 commented 2 years ago

TBH, I didn't know that the ability to disable polling was a thing. I disabled polling, and set up an automation to poll it, and it seems to be working better. I'm going to tweak the timeframes some, as five minutes does not seem to be often enough.

issue-triage-workflows[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

dsanner commented 1 year ago

It is still broken on latest HA. Disabling automatic polling and then implementing an automation to poll every minute fixes the issue. Is there a way to set automatic polling rates per device?

Cellivar commented 1 year ago

Seems like this is the line that sets the polling interval to 15 seconds: https://github.com/home-assistant/core/blob/7b2e743a6b0ce8dce5595eb54d089c3142cc6d94/homeassistant/components/radiotherm/coordinator.py#L18

I'm not familiar enough with HA to know how this could be made configurable; presumably adding it to the config flow's user input would allow overriding it without needing to set up a custom automation.

stefanhra commented 1 year ago

This is still an issue on the latest version of Home Assistant. I have disabled automatic polling and set up an automation to poll every 2 minutes. This has lessened the issue, but it still occurs.

There are several people patiently waiting for a fix for this so please stop trying to close this ticket due to inactivity.

The integration broke when it was changed from manual configuration via YAML to automatic configuration. Please either fix this issue or revert it back to manual configuration, because at least that worked, which IMO is preferable to the current state of this integration.

el0552t commented 1 year ago

I recently added Radio Thermostats to my Home Assistant and I'm experiencing the same issue. I have thirteen thermostats, all of them randomly switching between online and offline.

All the thermostats are showing excellent signal strength when viewing the stats on my wireless network. Access points were placed near all of these units way back when originally installed due to the poor range of the USNAP wifi modules used in these thermostats.

I am relatively new to Home Assistant. Can someone tell me where I can find the file referenced in this thread above? I have browsed through the files with File Editor and have not found it. I would like to apply this to see if it would help eliminate this issue. Otherwise, this is an excellent program.

I believe a lot of people are trying this avenue of communications to their Radio Thermostats since Energy Hub recently discontinued their online portal. Finding a solution to this problem would be very much appreciated.

vinnyfuria commented 1 year ago

To reiterate, there is a work around for this issue:

  1. Disable polling on the Radio Thermostat: Home Assistant -> "Settings" -> "Devices and Services" -> "Radio Thermostat" kebab menu -> "System Options" -> deselect "Enable polling for updates"

  2. Create a polling automation (note the below is untested, but should work): Home Assistant -> "Settings" -> "Automations and Scenes" -> "Automations" -> "Create Automation" -> "Create new automation" -> kebab menu -> "Edit in YAML"

    alias: Force RadioTherm Update
    description: 'Query radiotherm status when polling is disabled.'
    mode: single
    trigger:
      - platform: time_pattern
        minutes: 1
    action:
      - service: homeassistant.update_entity
        target:
          entity_id:
            - climate.<radiothermostat_entity_name>

Note: the Radio Thermostats have an issue where they occasionally do not respond to API requests. This is not a function of Wi-Fi signal strength and appears to be an inherent issue with the thermostats. There is also an issue where the thermostat can take longer to respond than HA's HTTP request timeout allows. In both situations, the only workaround is to re-query the device.

Edit 1: Fixed (2) replacing "seconds: 60" with "minutes: 1"

el0552t commented 1 year ago

Thank you for your response. This may be a stupid question, but your comment that the thermostat can take longer to respond than HA's HTTP request timeout allows seems like the most plausible explanation for what I'm seeing. Is there any way to change the HTTP request timeout?

brendanm720 commented 1 year ago

This is still an issue.

stefanhra commented 1 year ago

@vinnyfuria I have tried this workaround and it doesn't work for me. I don't know if it is because I am using a CT-80 model thermostat while most others seem to be using a CT-50 model. BTW, the YAML you posted throws an error: the "seconds:" parameter only accepts values between 0 and 59.

Can anyone else confirm or deny if this work around is working with their CT-80 thermostat? Also does this integration also work if the thermostat is configured to use z-wave? I have a z-wave module for my thermostat as well and am wondering if I can fix this issue by switching to z-wave.

I have tried to eliminate other possible causes of loss of communication, like switching the port that the Wi-Fi module is in and moving a Wi-Fi access point within 2-3 feet of the thermostat. No matter what I do, Home Assistant loses communication with the thermostat every 2-15 minutes. This is quite frustrating because I cannot use it in any automations, and it used to work fine when it was configured via YAML.

vinnyfuria commented 1 year ago

I updated the workaround to have minutes: 1 instead of seconds: 60. I was able to verify that the automation now has no syntax errors. Unfortunately, I only have a CT50, so I cannot help troubleshoot why this would not be working. I also do not have a Z-Wave module. I imagine using the Z-Wave module would mean this integration is not used (Z-Wave has a built-in generic thermostat component that would likely be used instead). I can't speak at all to how that performs.

No matter what I do Home Assistant loses communication with the thermostat every 2-15 minutes

This is likely a problem with the thermostat rather than a networking or Home Assistant issue. As mentioned, the API is inconsistently responsive, which we have not figured out how to work around. I have gathered plenty of evidence of this using networking tools (curl, Wireshark, etc.). You may be able to work around this by handling possible missing data in your automations (for example, by forcing a refresh or polling).

stefanhra commented 1 year ago

updated the workaround to have minutes: 1 instead of seconds: 60

@vinnyfuria according to this page

https://www.home-assistant.io/docs/automation/trigger/#time-pattern-trigger

"minutes: 1" will poll once an hour 1 minute after the hour. To get it to poll once per minute use either "minutes: *" or "seconds: <any value between 0 and 59>". Ie, "seconds: 0" will poll once per minute when the seconds column is equal to 0.

If I monitor it with Wireshark, is there something in particular I can look for to verify whether I am having the same issue or something else is happening? Or is there some other test I can do?
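
The time-pattern behaviour described in this comment can be checked with a small model. The sketch below is a simplified stand-in written for illustration, not Home Assistant's implementation; `field_matches`, `fires_at`, and `count_fires_in_hour` are hypothetical names, and the rule that unspecified smaller units default to 0 is an assumption based on the behaviour described on the linked documentation page.

```python
from datetime import datetime


def field_matches(pattern, value):
    """Match one time field: None or '*' means any, '/N' means divisible by N, else exact."""
    if pattern is None or pattern == "*":
        return True
    text = str(pattern)
    if text.startswith("/"):
        return value % int(text[1:]) == 0
    return value == int(text)


def fires_at(moment, hours=None, minutes=None, seconds=None):
    """Simplified model of a time_pattern trigger (not Home Assistant's code)."""
    # Assumption: when a larger unit is given, unspecified smaller units are 0.
    if minutes is None and hours is not None:
        minutes = 0
    if seconds is None and minutes is not None:
        seconds = 0
    return (
        field_matches(hours, moment.hour)
        and field_matches(minutes, moment.minute)
        and field_matches(seconds, moment.second)
    )


def count_fires_in_hour(**pattern):
    """Count how many seconds between 12:00:00 and 12:59:59 match the pattern."""
    return sum(
        fires_at(datetime(2023, 6, 1, 12, m, s), **pattern)
        for m in range(60)
        for s in range(60)
    )


print(count_fires_in_hour(minutes=1))
print(count_fires_in_hour(minutes="*"))
print(count_fires_in_hour(seconds=0))
```

Under this model, `minutes=1` matches once in the hour, while `minutes="*"` and `seconds=0` each match 60 times, which is consistent with the reading of the documentation given above.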

Cellivar commented 1 year ago

Still having this issue.

lizaoreo commented 1 year ago

My theory has always been that the thermostat can't respond to API calls while it's phoning home, not really relevant now, but I'm assuming it's still "trying" to talk to the cloud service even if that service doesn't talk back.

I really feel we need a way to add error handling (such as for a missed update or no response) for components that have poorer reception or whatever else causes the issues we see here. I think we should be able to configure a "Hey, if I get no response, try 2 more times and then flag it as unresponsive" setting on a per-component basis, as well as adjust the polling rate as mentioned here. I think components should also have customizable response timeouts; some things just take longer to respond, and my Home Assistant error logs are at times virtually useless because of this.
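
The "try 2 more times and then flag it as unresponsive" behaviour could look roughly like the sketch below. It is plain Python written for illustration and not an existing Home Assistant setting: `fetch_with_retries`, `attempts`, and `delay` are hypothetical names, and `TimeoutError` and `ValueError` merely stand in for the "timed out" and "invalid value returned" errors seen in the logs.

```python
import time


def fetch_with_retries(fetch, attempts=3, delay=0.0):
    """Call fetch() up to `attempts` times; re-raise the last error if every try fails."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except (TimeoutError, ValueError) as err:
            last_error = err
            # Pause between tries so a busy device is not hammered.
            if attempt < attempts - 1 and delay > 0:
                time.sleep(delay)
    raise last_error


# Stand-in for a thermostat that is busy twice and then answers.
calls = {"count": 0}


def flaky_thermostat():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("timed out")
    return {"temp": 21.5}


reading = fetch_with_retries(flaky_thermostat, attempts=3)
print(reading, calls["count"])
```

Here the third call succeeds, so the caller never sees the two earlier timeouts. Given the reports in this thread that frequent requests make the thermostat hang, a non-zero `delay` would probably be needed in practice.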

AdShea commented 1 year ago

At least you can stop the cloud phone-home by setting enable to zero and clearing the authkey:

    curl http://$THERMOSTAT_IP/cloud -d '{"enabled":0}' -X POST
    curl http://$THERMOSTAT_IP/cloud -d '{"authkey":""}' -X POST

That seems to help a bit on the dropouts.
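
For anyone who prefers to script this, the two curl calls above could be reproduced with Python's standard library along these lines. This is an untested-against-hardware sketch: the `/cloud` path and the two payloads come from the commands above, while `build_cloud_request`, `disable_cloud`, and the 10 second timeout are hypothetical choices.

```python
import json
import urllib.request


def build_cloud_request(host, payload):
    """Build (but do not send) a POST to the thermostat's /cloud endpoint."""
    return urllib.request.Request(
        url=f"http://{host}/cloud",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )


def disable_cloud(host, timeout=10.0):
    """Send both requests from the curl commands above, one after the other."""
    for payload in ({"enabled": 0}, {"authkey": ""}):
        request = build_cloud_request(host, payload)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()


# Building a request does not touch the network; 192.0.2.10 is a placeholder address.
example = build_cloud_request("192.0.2.10", {"enabled": 0})
print(example.get_method(), example.full_url)
```

Calling `disable_cloud("<thermostat IP>")` would send the requests; the thermostat's reply is read and discarded, so checking the result is left to the caller.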

There's also someone who has reverse engineered the cloud connection if it'd be easier to do a Local-Push type control https://github.com/ceesb/radiothermostat_cloud

laez commented 9 months ago

At least you can stop the cloud phone-home by setting enable to zero and clearing the authkey:

    curl http://$THERMOSTAT_IP/cloud -d '{"enabled":0}' -X POST
    curl http://$THERMOSTAT_IP/cloud -d '{"authkey":""}' -X POST

That seems to help a bit on the dropouts.

There's also someone who has reverse engineered the cloud connection if it'd be easier to do a Local-Push type control https://github.com/ceesb/radiothermostat_cloud

Do you just need to execute this once from a terminal, or as a part of an automation?

eftimg commented 8 months ago

once - from any terminal. helps only a bit. you will continue to see the issue.

Cellivar commented 5 months ago

This is still an issue.

rjm12rjm12 commented 3 months ago

Agree, this is still an issue for me also.

edwin053 commented 3 months ago

Also still an issue - although (at least for myself) - it seems to have been playing nicely for the most part for several releases (or at least for me, for several months) - until something interrupts the system, ip, ARP, etc... like HA decides to disappear for some reason, and then all bets are off for a while.

vinnyfuria commented 3 months ago

This cannot be fixed. The underlying problem is that the thermostat, on occasion, stops responding to requests from clients. The workaround of disabling polling only helps as you query the radiotherm less frequently and hence are less likely to run into one of these unresponsive timeout periods.

This change to the radiotherm config helps some, but does not eliminate the issue: https://github.com/home-assistant/core/issues/76270#issuecomment-1763230534

These are very, very old devices that never really worked all that well. If anyone has found other configuration changes that help, please suggest them.

Recommend that this be closed. Unfixable.

stefanhra commented 3 months ago

This cannot be fixed. The underlying problem is that the thermostat, on occasion, stops responding to requests from clients.

Back when this integration was configured via a YAML file, it didn't exhibit this behaviour. I'm not saying the thermostat didn't become unresponsive, but the dashboard display didn't change to "unavailable". When you want to see what the temperature is or what the thermostat is set to, it doesn't really matter if that info is stale because the last polling update failed. But when half the time you look at the dashboard the thermostat says "unavailable" and you have to wait a minute or several before it updates, that does matter.

I understand that it is probably much more difficult than I think, and that I am basically saying "why don't you just...", but it seems to me that this integration might be able to be written in a way that works around the shortcomings of this thermostat. For example, the integration could update only when it has received valid data. As long as the thermostat is unresponsive, it could display the last known good data. I suppose some way to buffer inputs, like changing the thermostat set point, would be good too.
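
To make the "last known good data" idea concrete, here is a minimal sketch in plain Python. It is an assumption-laden illustration rather than a proposal for the integration's actual design: `LastKnownGood`, `max_age_seconds`, and the 300 second cutoff are hypothetical, and the cutoff exists so that a thermostat that is truly down eventually stops showing old values.

```python
import time


class LastKnownGood:
    """Hypothetical cache: keep the last valid reading until it becomes too old."""

    def __init__(self, max_age_seconds=300.0, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds  # assumed cutoff, chosen for illustration
        self._clock = clock
        self._value = None
        self._updated_at = None

    def update(self, value):
        """Store a reading that has already been validated by the caller."""
        self._value = value
        self._updated_at = self._clock()

    def read(self):
        """Return the cached reading, or None if nothing valid arrived recently enough."""
        if self._updated_at is None:
            return None
        age = self._clock() - self._updated_at
        if age > self.max_age_seconds:
            return None
        return self._value


# A fake clock makes the behaviour easy to follow without waiting.
now = [0.0]
cache = LastKnownGood(max_age_seconds=300.0, clock=lambda: now[0])
cache.update(21.5)
now[0] = 120.0
print(cache.read())
now[0] = 301.0
print(cache.read())
```

Two minutes after the last good poll the cached reading is still shown; just past the five minute cutoff the cache returns None, which a caller could map to "unavailable". Buffering set-point changes, as suggested above, is not covered by this sketch.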

I don't expect anyone to spend what I can only imagine would be a considerable amount of time to implement this just because I asked so this is just my 2 cents on the topic. If I had the ability I would try but I don't.

jaymemaurice commented 3 months ago

Yeah I agree. We the community had polling working and then we broke it and don’t really have the effort to find why and fix it. It’s not the device. It still does what it always did and although most of us moved on to sexier products, I disagree with the assertion that it never worked well. I had my ct-80 reliably displaying random things and the schedule set from home assistant and it worked perfectly. It’s not secure, sexy or popular but it does work. It’s probably something like request headers, line endings or something silly.

dfprl commented 1 day ago

Still an issue as of 2024.10.4 .. I have two of these installed. When I initially commissioned these (I can't say what version of HA that might have been, but around Fall 2023), I wasn't seeing an issue. Only one of them does this. Both have solid network connections as far as I can tell.

It's a mild annoyance, but it's got to be an easy fix. I haven't looked at the integration code...or any HA code yet, but I'm going to assume it just needs some simple state debouncing to make it more robust. The log sort of implies it only tries once per 15? seconds.

2024-11-07 21:22:44.001 ERROR (MainThread) [homeassistant.components.radiotherm.coordinator] Error fetching radiotherm thermostat-1 data: thermostat-1 (-----) was busy (invalid value returned):
2024-11-07 21:23:57.750 ERROR (MainThread) [homeassistant.components.radiotherm.coordinator] Error fetching radiotherm thermostat-1 data: thermostat-1 (-----)) timed out waiting for a response: timed out

These are the two messages I observe. Invalid value more often than the timeout. Not sure what to make of that, the API is super dumb, it's odd that invalid value is being logged. I was expecting just timeouts.

I'm going to experiment a bit to see if there is a difference in config between the two that would cause it to behave this way and report back.

edwin053 commented 1 day ago

Also still an issue here - not adding a +1 or emoji - just to be clear. I have found that deleting and re-adding the devices under integrations seems to resolve it pretty consistently - BUT - that’s not a fix. And like @dfprl commented above - a little more logic in the state mechanism would seem to be straightforward- of course - I didn’t volunteer (yet) to write it - was hoping the maintainers would - but - I will be happy to test/validate any fixes.

That said, I'm adding a separate comment to help keep the issue open for now.