Closed LGH531 closed 1 year ago
I woke up this morning and saw steam coming off the pool. I angrily said "the darn kids left the pool heater on for a 4th day in a row?" I've been saying to them each day don't leave the pool heater on because I wake up each morning and it's back on, even though I turned it off. Leaving that heater running all night is probably $100 in natural gas.
Then I realized this morning it's very possibly this thread with the Home Assistant integration and iAqualink not communicating to each other. I had turned the pool heat on initially via Home Assistant before I realized there was a bug with this re-authentication issue. Thus I had left the pool heat on in Home Assistant. It's been that way for about 2 weeks. In that time I've just been logging in direct to iAqualink's web interface (which is awful and their authentication on that site is way too aggressive, so I miss Home Assistant.)
We had a friend in town, so we were running the Pool Heater constantly until 4 days ago when they left and then I turned the pool heat off. However, each morning it would be back on and I truly thought it was just a kid. So as much as the iAqualink in Home Assistant is broken, it seems to maybe be able to recommunicate each night to try to resync things???
The point of me posting this, is this bug (if it truly is from this) actually has cost me hundreds of dollars in natural gas. I write open source code too, and I get how hard it is to check your brain back into a project. I'm bad at wanting to go re-open code years later and update it due to a new bug that crept up and an active Github issues thread where I can feel the pressure from across the world of people waiting and hoping for me to update code.
I do think we should probably go edit the code instead of @flz. It's a community effort after all. I don't think it's fair to just keep pestering him. I wish Home Assistant made it easier to edit and test the code. Every time I try to write a plugin I feel like I'm walking around in the dark. I wish there were more good Youtube videos of how to write a real-world plugin. I wish it was easier to step through the code when running inside Home Assistant. I wish the documentation was better of the callback events for the plugins. I wish there were more examples of code for push/pull web API's and websocket API's. Then I could help more.
Wow… You certainly have new perspectives on this subject. Sorry to hear that. You mention good perspectives which the Home Assistant leadership team should take in strides. A failure of an integration that causes your automation(s) to fail as a result is a huge issue in terms of the resiliency/architecture of the software.
In regards to FLZ, I do not think that we are pestering him. We merely are asking him for help, and asking for communication in terms of when he might have time to address a fix. Let me know if you or anybody thinks that I have not been considerate or greatful for his awesome code. Because of it I have learned a lot about the Jandy cloud integration. I also learned that security for this integration is extremely important and am glad that Jandy is stepping up its security protocol by reducing the session token and requiring people to reauthenticate. I also figure that If FLZ is not able or not interested, then we can make a branch and code our own fix, correct? So then how to get the new code in Home Assistant?
Also again shows the weakness in cloud integrations. At any point in time said manufacturer can push out a code change and have negative impacts on your life. Let’s not even discuss the ramifications if the Jandy cloud was breached and the number of pool heaters or coolers it could activate and the downstream impacts of that.
As a lesson of your experience - Thank you! - , I will flip the breaker off because I do not intend to heat my pool unless for certain odd situations where I am happy to flip the breaker back…
Hi All,
I am in the process of testing the PR for https://github.com/flz/iaqualink-py/pull/19 currently. It seems to have fixed the HA issue for me for the last hour of testing. Here is a quick fix if anyone else wants to test it as well. Create a custom component from https://github.com/home-assistant/core/tree/dev/homeassistant/components/iaqualink and then update the manifest.json file with the following. Until the PR gets merged into the main python branch, this is the easiest fix.
{
"domain": "iaqualink",
"name": "Jandy iAqualink",
"codeowners": ["@flz"],
"config_flow": true,
"documentation": "https://www.home-assistant.io/integrations/iaqualink/",
"iot_class": "cloud_polling",
"loggers": ["iaqualink"],
"version": "0.0.1",
"requirements": ["git+https://github.com/jbronikowski/iaqualink-py@master#iaqualink", "h2==4.1.0"]
}
The two key components needed are
"version": "0.0.1",
"requirements": ["git+https://github.com/jbronikowski/iaqualink-py@master#iaqualink", "h2==4.1.0"]
@jbronikowski, thank you for this. I attempted to load this as a custom component as you suggested. Updated the manifest.json as suggested. In the logs it loaded the custom component, and operated correctly for just over an hour. Then the <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:961, additional_data:None>, began showing up in the logs and lost connection. It must have not picked up your changes from iaqualink-py. Going back through everything again to see what I missed. Any suggestions?
I'm back home now so I'll have some time to look into this.
The issue isn't to write a fix, it's to ensure that it is correct by being able to reproduce the issue (or lack thereof). On top of that, to ensure that the client has not changed (as in the mobile app), it's a bit more involved to set up mitmproxy and verify that the behavior is consistent with the library and HA.
With all that being said, if anybody's interested in submitting PRs for the library or the integration, you should feel empowered to do it. As was mentioned, this is all opensource.
Awesome! I will be able to test anything later today. The fix should be relatively straight forward. When the token expires then re-login. I think that I posted the json of the error message you get.
If needed I can give you my contact info and we can work together.
Sent from my iPhone
On Feb 26, 2023, at 1:01 PM, Florent Thoumie @.***> wrote:
I'm back home now so I'll have some time to look into this.
The issue isn't to write a fix, it's to ensure that it is correct by being able to reproduce the issue (or lack thereof). On top of that, to ensure that the client has not changed (as in the mobile app), it's a bit more involved to set up mitmproxy and verify that the behavior is consistent with the library and HA.
With all that being said, if anybody's interested in submitting PRs for the library or the integration, you should feel empowered to do it. As was mentioned, this is all opensource.
— Reply to this email directly, view it on GitHubhttps://github.com/home-assistant/core/issues/88033#issuecomment-1445421239, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ARBHCMGFPEM75QMXQRD4KM3WZOK5ZANCNFSM6AAAAAAU2XPYHU. You are receiving this because you were mentioned.Message ID: @.***>
I appreciate the help but I'm not convinced it's not a login issue (or credentials expiring).
There's an outstanding issue that seems related here. From a library customer's perspective, I shouldn't have to create a new client because a server closed a connection. A client is a higher level abstraction supporting connections to multiple endpoints. This does feel like a library bug.
In the meantime, we can re-create the client in the library since HA doesn't pass a shared httpx client object, which we used to do when using aiohttp.
Separately, I just checked the app and it looks like it auto-refreshes every minute and is able to reuse the previous connection, suggesting keepalive is set to some value equal to or greater than 60s. Setting the refresh interval to something lower than this (which is currently the case, we use 15s), does give us faster updates for entity status but it also means the server can easily differentiate "legitimate" clients from HA.
I'm hopeful I should have enough info by EOD to produce some fix. This will most likely mean a new version of the iaqualink library and a diff to pull it in HA.
Ok, i appreciate the challenge back as I could be wrong. I will then give you my proof or log when I get home. We can then sync our perspectives. M
Is the home assistance log sufficient or what will help you? Or do you want the mitm proxy log?
At this stage I will do anything to get this resolved as it is so silly that I could puke. I am very happy and appreciative that you’re willing to take this on.
Sent from my iPhone
On Feb 26, 2023, at 1:57 PM, Florent Thoumie @.***> wrote:
I appreciate the help but I'm not convinced it's not a login issue (or credentials expiring).
There's an outstanding issue that seems related herehttps://github.com/encode/httpx/discussions/2112. From a library customer's perspective, I shouldn't have to create a new client because a server closed a connection. A client is a higher level abstraction supporting connections to multiple endpoints. This does feel like a library bug.
In the meantime, we can re-create the client in the library since HA doesn't pass a shared httpx client object, which we used to do when using aiohttp.
Separately, I just checked the app and it looks like it auto-refreshes every minute and is able to reuse the previous connection, suggesting keepalive is set to some value equal to or greater than 60s. Setting the refresh interval to something lower than this (which is currently the case, we use 15s), does give us faster updates for entity status but it also means the server can easily differentiate "legitimate" clients from HA.
I'm hopeful I should have enough info by EOD to produce some fix. This will most likely mean a new version of the iaqualink library and a diff to pull it in HA.
— Reply to this email directly, view it on GitHubhttps://github.com/home-assistant/core/issues/88033#issuecomment-1445436089, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ARBHCMEN6X5WC7AFG7JNMSTWZORRDANCNFSM6AAAAAAU2XPYHU. You are receiving this because you were mentioned.Message ID: @.***>
I don't need you to test anything. Since I can't seem to be able to create a simple test repro, the only way I can gather data and test is by hacking my docker container, restart and wait for an hour.
Ok, not a proper fix since it will be overwritten by future upgrade but if you can modify homeassistant/components/iaqualink/__init__.py
locally, you can try this.
In _async_systems_update(), under the "Failed to refresh..." logging call, in the same block, add this:
await system.aqualink.close()
This will force recreation of the httpx client inside the library.
Please don't ask me how to edit this file, there are a lot of different ways to do it based on whether you're using baremetal, docker, HASS, ... Once the file has been edited, restart HA and it should fix itself automatically every hour.
Again, this is a temporary workaround, you cannot update HA or you'll lose this "fix" and exception raised when sending commands (vs polling for updates) still aren't caught properly.
thanks @flz. If folks want to try the temporary fix I would recommend they create a custom component - the process will be consistent for all HA installs and you can still update HA.
/config/custom_components
manifest.json
to add a version
entry (any number is fine):
"version": "1.0.0",
edit: if the custom component loads successfully, this message will show in the full logs:
We found a custom integration iaqualink which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
edit: @flz the temporary fix does not work for me - seeing the same errors as before.
I had an error in the manifest file, so the custom integration wasn't loading. Testing again...
@matthewhadley @flz The fix DOES seem to be working for me. I loaded a custom component like you recommended rather than making changes to the core file and it's been about 2 hours and I'm still getting updates and controls are operational.
@yuckypants good to hear. I edited my post to say I don't think the custom component was loading properly for me. I've fixed that and am waiting an hour or so to test.
Just confirming the temp fix is working for me
Temp fix has worked for me as well for the past 9 hours… devices are still functional overnight however the following error appears in my logs.
Bad logger message: Failed to refresh system %s state: %s (('XXXXXXXXXXX', RemoteProtocolError('<ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:963, additional_data:None>'), None))
@flz @matthewhadley @smshapiro85, unfortunately, I lost connection about 6 hours ago.
Sorry to hear that @yuckypants. Over here the temp fix still going strong, just the same error message I posted above. Looks like like flz is honing in on the issue tho based on his previous comment so that is good.
The temp fix has held for me for 24 hours, thank you!
Is there a PR for this fix into main?
Not sure what I got wrong here. I tried to create a custom component as descibed above and get an error on startup:
2023-03-01 15:07:53.763 WARNING (SyncWorker_4) [homeassistant.loader] We found a custom integration iaqualink which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant 2023-03-01 15:08:02.724 ERROR (MainThread) [homeassistant.loader] Unexpected exception importing component custom_components.iaqualink File "/config/custom_components/iaqualink/init.py", line 156 2023-03-01 15:08:02.728 ERROR (MainThread) [homeassistant.setup] Setup failed for custom integration iaqualink: Unable to import component: Exception importing custom_components.iaqualink
The updated function is:
async def _async_systems_update(now):
"""Refresh internal state for all systems."""
for system in systems:
prev = system.online
try:
await system.update()
except (AqualinkServiceException, httpx.HTTPError) as svc_exception:
if prev is not None:
_LOGGER.warning(
"Failed to refresh system %s state: %s",
system.serial,
svc_exception,
)
await system.aqualink.close()
else:
cur = system.online
if cur and not prev:
_LOGGER.warning("System %s reconnected to iAqualink", system.serial)
async_dispatcher_send(hass, DOMAIN)
async_track_time_interval(hass, _async_systems_update, UPDATE_INTERVAL)
return True
Did I get the syntax wrong somehow?
Thanks,
Ian
@smshapiro85 I restarted HA and it's good...until it's not. I wonder if there are conflicting issues with the currently installed iaqualink core integration.
As an aside, got the notification for 2013.3.0 today and was hopeful that the fix was included; alas, it's not.
@igoldsmith here is a screenshot of my init.py file. Also, did you update the manifest.json to include an entry for a version? Then confirm that you see this message in the logs: "We found a custom integration iaqualink which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant."
init.py
manifest.json
@yuckypants I am sorry to hear this. The temp fix has been working for me since I created the custom component Sunday night, with the exception of the error log entry I mentioned above, but it has not affected the functionality. Please make sure all of the steps and notes I mentioned to igoldsmith above are followed.
@smshapiro85 Thank you for providing screenshots - I had a missing comma. Updated and retesting.
Thanks, I didn't realized the await needed to be inside the logger call. Working now. Let's hope for stability..:-)
The await system.aqualink.close()
call shouldn't need to be inside the logger function itself. just at the same indent.
The
await system.aqualink.close()
call shouldn't need to be inside the logger function itself. just at the same indent.
It failed to start when I had it like this, worked when I moved it into the logger call.
Does this then still show the exception in the log?
So then this has to be redone every time there is a Home Assistant upgrade?
Sent from my iPad
The await system.aqualink.close() call shouldn't need to be inside the logger function itself. just at the same indent. [Screenshot 2023-03-02 at 09 44 38]https://user-images.githubusercontent.com/36013/222477603-ef43cd9a-6437-41fb-86e9-55fcd8d9ec76.png
— Reply to this email directly, view it on GitHubhttps://github.com/home-assistant/core/issues/88033#issuecomment-1452085486, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ARBHCMCCKHE6QD2MZQWVDQDW2C6F7ANCNFSM6AAAAAAU2XPYHU. You are receiving this because you were mentioned.Message ID: @.***>
@devMarcus I updated to 2023.3.0 yesterday and the integration is still working. The logs do indeed still show an exception error.
I have been running this as a custom component for a few days without issues.
Same, after I fixed the missing comma, things have been GREAT.
Yep. Been running without issue since for a week now. I did decide to completely remove the logger call to keep my logs nice and clean. Not ideal, but figured since it always makes that call to reset the connection in the trapped condition then I wasn't getting much out of the logged warning.
So big thanks to @flz. Hopefully this will buy enough time to get a more robust fix. This component is one of the most important to my setup but also one of the last that is still dependent on cloud. Maybe it is time to tackle this one too. My frustration level improved dramatically when I got out of the MyQ and Alarm cloud business. Historically, this integration has been solid. But I don't trust Jandy. The pool equipment vendors never seem to run out of creative ways for me to spend money.
I appreciate the work that everyone has done on this integration. It demonstrates the value of Home Assistant as a community built and maintained project.
I have had very little experience with Python programming, but was able to follow what was described here, and install the edited variant of the iAqualink integration as a custom component. So far, it is reliably maintaining the connection (after 24 hours, so far), at least if it's left alone.
I have discovered that if I reload the integration while it is running, it breaks and requires an HA restart. While I'm pleased to be able NOT to have to reload the integration, I know this is odd behavior, and I just wanted to bring it to the attention of those of you who are working on this repair.
I've submitted the temporary diff that I suggested a few days ago as a stop-gap. A proper fix in the library would have similar logic but I will have to add proper unit-testing and will require more time.
At least in the meantime, home-assistant instances will work again.
Thank you! We appreciate your efforts!
Unfortunately, I was not able to get the workaround working. So, very appreciative of your extra work. I think that I got everything packaged up correctly for the workaround, but after restarting HA the integration was unavailable. Same as if I restarted the integration every 30 minutes. Not going to worry about the workaround if the “fix” (even if it is a patch) is in process.
Just keep in mind that you’re not alone in this endeavor. Let us know how we can help!
Again, many thank you’s from this household!
The problem
Logger: homeassistant.components.iaqualink Source: components/iaqualink/init.py:151 Integration: Jandy iAqualink (documentation, issues) First occurred: 11:54:33 AM (189 occurrences) Last logged: 12:41:33 PM
Failed to refresh system XXXXXXXXXXX state: <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:961, additional_data:None>
(actual system ID has been replaced with "X"s.)
What version of Home Assistant Core has the issue?
2023.2.4
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
No response
Link to integration documentation on our website
No response
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
No response