Closed keteflips closed 5 months ago
Hey,
In the meantime, I have also come to believe that something is going on here with the CPU usage. After an HA restart, it is normal again at first, and then the CPU load increases again. I have not yet tried to find out the cause.
Hans Moggert
On 28.04.2024 at 13:27, keteflips @.***> wrote:
First of all, I love this integration, and this is not hate; my theory has strong foundations.
I have my HomeAssistant monitored with Zabbix, and I started suffering random crashes. All I can see is that around every hour the CPU consumption goes crazy and HA restarts...
image.png (view on web): https://github.com/fdebrus/hayward-ha/assets/22074069/2a83fba7-79c7-4056-b6b2-6b2681ebe744
I checked and tested everything, ran a lot of tests, and finally I can see in the logs that every time the CPU usage peaks, this component is refreshing the token.
2024-04-26 10:37:44.678 INFO (MainThread) [custom_components.aquarite.aquarite] Token expired, refreshing...
I disabled this component and the issue disappeared.
Thanks for your work on this integration.
I re-enabled the integration with the 0.0.8.b1 upgrade and now it works flawlessly.
I think something changed in the token renewal in this version that solved my problem :)
OK, you seem to be lucky.
The problem repeats itself for me. After a restart, the CPU load increases continuously to just under 40%. When I first noticed the problem, it even went up to 100%.
From: keteflips @.> Sent: Wednesday, 1 May 2024 19:13 To: fdebrus/hayward-ha @.> Cc: Hans1205 @.>; Comment @.> Subject: Re: [fdebrus/hayward-ha] HomeAssistant crash caused by this integration (Issue #8)
What I see is that approximately every 55 minutes there is an increase in requests, and both CPU and network usage increase in a step-like pattern.
Same problem here. CPU load and network usage increase a lot, eventually causing HA to crash and restart.
Thanks for the feedback. Every 55 minutes the token expires and needs to be refreshed. I do not know yet what could cause the CPU/network increase at that point in time. I will look into it.
I could not (yet) find a better way; I am open to ideas and suggestions.
When the token expires, I reconnect to Hayward and re-subscribe to the pool document in Firestore.
I suspect HomeAssistant is not happy with the coordinator being re-created with the new data source. I could not find a way to "update" the data source in Home Assistant.
This is my current thinking; I may be wrong, and I will keep reading the HA docs...
async def get_token_and_expiry(self):
    """Fetch token and expiry using Google API."""
    url = f"{GOOGLE_IDENTITY_REST_API}:signInWithPassword?key={API_KEY}"
    headers = {"Content-Type": "application/json; charset=UTF-8"}
    data = json.dumps({
        "email": self.username,
        "password": self.password,
        "returnSecureToken": True
    })
    resp = await self.aiohttp_session.post(url, headers=headers, data=data)
    if resp.status == 400:
        raise UnauthorizedException("Failed to authenticate.")
    self.tokens = await resp.json()
    self.expiry = datetime.datetime.now() + datetime.timedelta(seconds=int(self.tokens["expiresIn"]))
    self.credentials = Credentials(token=self.tokens['idToken'])
    self.client = Client(project="hayward-europe", credentials=self.credentials)
    if hasattr(self, 'handlers') and self.handlers:
        for pool_id, handler in self.handlers:
            _LOGGER.debug(f"Resubscribing to pool {pool_id}")
            await self.subscribe(pool_id, handler)

async def subscribe(self, pool_id, handler) -> None:
    doc_ref = self.client.collection("pools").document(pool_id)

    def on_snapshot(doc_snapshot, changes, read_time):
        """Handles document snapshots."""
        try:
            for change in changes:
                _LOGGER.debug(f"Received change {change.type} in firestore")
            for doc in doc_snapshot:
                try:
                    handler(doc)
                except Exception as handler_error:
                    _LOGGER.error(f"Error executing handler: {handler_error}")
        except Exception as e:
            _LOGGER.error(f"Error in on_snapshot: {e}")

    doc_ref.on_snapshot(on_snapshot)
    self.handlers.append((pool_id, handler))
    max_size = 10
    self.handlers = self.handlers[-max_size:]
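One possible reading of the snippet above, offered as a hedged sketch rather than a confirmed diagnosis: the re-subscribe loop iterates over `self.handlers` while `subscribe()` appends to it, so each refresh may register more listeners than the previous one. The simulation below only counts calls; `FakeClient` is a hypothetical stand-in, not the integration's real class, and it assumes old listeners are never detached.

```python
class FakeClient:
    """Hypothetical stand-in: only counts listener registrations."""

    def __init__(self):
        self.handlers = []
        self.listeners_opened = 0

    def subscribe(self, pool_id, handler):
        # Mirrors the posted subscribe(): open a listener, remember the handler.
        self.listeners_opened += 1
        self.handlers.append((pool_id, handler))
        max_size = 10
        self.handlers = self.handlers[-max_size:]

    def refresh(self):
        # Mirrors the re-subscribe loop at the end of get_token_and_expiry().
        for pool_id, handler in self.handlers:
            self.subscribe(pool_id, handler)


client = FakeClient()
client.subscribe("pool-1", print)  # initial subscription at startup
history = []
for _ in range(4):  # four simulated token refreshes
    client.refresh()
    history.append((len(client.handlers), client.listeners_opened))

# (handlers kept, listeners opened so far) after each refresh
print(history)  # [(3, 3), (7, 7), (10, 15), (10, 26)]
```

Under these assumptions, the number of listeners opened grows at every refresh, which would at least be consistent with the step-like pattern reported earlier in the thread.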
The Firestore documentation mentions detaching the listener so that the client stops using bandwidth to receive updates: https://firebase.google.com/docs/firestore/query-data/listen#detach_a_listener Maybe explicitly unsubscribing before re-subscribing would do it?
I was under the assumption that when the token expires, the connection to the document is broken and the listener is detached. I will investigate further. Thanks!
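A minimal sketch of the "unsubscribe before re-subscribe" idea, assuming the object returned by `on_snapshot()` exposes `unsubscribe()` as described in the linked Firestore page. `ListenerManager`, `FakeWatch` and `FakeDocRef` are hypothetical names made up for illustration; the fakes exist only so the example runs without Firestore.

```python
class ListenerManager:
    """Keeps at most one active listener per pool (hypothetical helper)."""

    def __init__(self):
        self._watches = {}  # pool_id -> object returned by on_snapshot()

    def resubscribe(self, pool_id, doc_ref, callback):
        old = self._watches.pop(pool_id, None)
        if old is not None:
            old.unsubscribe()  # detach the previous listener first
        self._watches[pool_id] = doc_ref.on_snapshot(callback)


# Stand-ins for the Firestore objects, for demonstration only.
class FakeWatch:
    def __init__(self):
        self.active = True

    def unsubscribe(self):
        self.active = False


class FakeDocRef:
    def __init__(self):
        self.watches = []

    def on_snapshot(self, callback):
        watch = FakeWatch()
        self.watches.append(watch)
        return watch


doc_ref = FakeDocRef()
manager = ListenerManager()
for _ in range(3):  # initial subscription plus two token refreshes
    manager.resubscribe("pool-1", doc_ref, lambda docs, changes, read_time: None)

active = sum(w.active for w in doc_ref.watches)
print(active)  # 1
```

Keying the stored watch by `pool_id` means a refresh replaces the listener instead of adding another one on top of it.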
Version 0.0.9 is out, give it a try. I could not fully test it, as we need to wait for token expiration (55 min).
I'm removing 0.0.9, it does not work as expected.
According to my reading, aiohttp_session should manage session opening, closing, etc. pretty well by itself. As I reuse the same session upon token refresh, it should be fine. So I turned my investigation towards Firestore. I have changed the routine that re-connects to a document after a token refresh. Validation is ongoing. I will update by end of day.
No luck, will need more time to troubleshoot and fix.
More testing is ongoing, but it has been stable and working for 24 hours... I will publish a new beta for you to validate on your setup.
2024-05-08 14:17:51.792 DEBUG (MainThread) [custom_components.aquarite.aquarite] Token expired, refreshing...
2024-05-08 14:17:52.002 DEBUG (MainThread) [custom_components.aquarite.aquarite] Unsubscribing old listener
2024-05-08 14:17:53.005 DEBUG (MainThread) [custom_components.aquarite.aquarite] Re-subscribing with new token
2024-05-08 14:17:53.012 DEBUG (MainThread) [custom_components.aquarite.aquarite] Subscribed with new listener for pool_id 05DE2D353837574E43180937
2024-05-08 14:17:53.185 DEBUG (Thread-ConsumeBidirectionalStream) [custom_components.aquarite.aquarite] Received change ChangeType.ADDED in firestore
2024-05-08 14:17:53.186 DEBUG (MainThread) [custom_components.aquarite.coordinator] Manually updated Aquarite data
I do get a warning from google.api, but it does not have any impact. I am looking into it.
Logger: google.api_core.bidi
Source: custom_components/aquarite/aquarite.py:110
Integration: Aquarite (documentation, issues)
First occurred: 3:55:42 PM (1 occurrences)
Last logged: 3:55:42 PM
Background thread did not exit.
Ran the 0.0.9 update overnight; it seems solid, with no errors and no runaway CPU usage. However, I am also getting the google.api_core.bidi warnings at each refresh.
2024-05-09 03:34:23.187 WARNING (MainThread) [google.api_core.bidi] Background thread did not exit.
2024-05-09 04:31:23.487 WARNING (MainThread) [google.api_core.bidi] Background thread did not exit.
2024-05-09 05:28:23.778 WARNING (MainThread) [google.api_core.bidi] Background thread did not exit.
2024-05-09 06:25:24.138 WARNING (MainThread) [google.api_core.bidi] Background thread did not exit.
2024-05-09 07:22:24.407 WARNING (MainThread) [google.api_core.bidi] Background thread did not exit.
2024-05-09 08:19:24.697 WARNING (MainThread) [google.api_core.bidi] Background thread did not exit.
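For anyone who wants to check the spacing of these warnings, here is a small sketch that computes the gaps between the timestamps quoted above. The timestamps are copied from the log lines; everything else is plain standard-library Python.

```python
from datetime import datetime

# Timestamps copied from the warnings in the comment above.
timestamps = [
    "2024-05-09 03:34:23.187",
    "2024-05-09 04:31:23.487",
    "2024-05-09 05:28:23.778",
    "2024-05-09 06:25:24.138",
    "2024-05-09 07:22:24.407",
    "2024-05-09 08:19:24.697",
]

times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S.%f") for t in timestamps]

# Gap between consecutive warnings, in minutes, rounded to one decimal.
intervals = [round((b - a).total_seconds() / 60, 1) for a, b in zip(times, times[1:])]
print(intervals)  # [57.0, 57.0, 57.0, 57.0, 57.0]
```

The warnings are about 57 minutes apart, which is in the same range as the roughly 55-minute token refresh mentioned earlier in the thread.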
Did the same and no issues with CPU usage so far.
The token refresh and CPU usage peak are resolved with 0.0.9. The Google API warning is tracked under https://github.com/fdebrus/hayward-ha/issues/10