DavidMStraub / homeconnect

MIT License
31 stars 15 forks source link

Integration stops working after several hours #3

Open darootler opened 4 years ago

darootler commented 4 years ago

Hi!

I am using Home Assistant 0.102.3 and Home Connect 0.2.3 (installed via HACS). For me this integration just stops updating the entities after several hours. If i restart HA the integration is working again for several hours.

I have set up an automation wich notifies me if the dishwasher finished the program, this automation just stops working because the sensor never updates:

image

Anything i am doing wrong here? Any suggestions?

Regards Richard

DavidMStraub commented 4 years ago

Is there anyting in the logs? If now, could you enable debug logging and check? Maybe the SSE listener drops connection and doesn't reconnect (which it should in principle) or maybe token refresh failed (which is now handled internally by Home Assistant).

When the sensor stops updating, can you try if you can still do things like starting programs or switching on/off? I'm asking because these trigger direct calls to the API rather than listening to a stream, so if they don't work either, it indicates a permission problem or the API being down.

darootler commented 4 years ago

Is there anyting in the logs? If now, could you enable debug logging and check? Maybe the SSE listener drops connection and doesn't reconnect (which it should in principle) or maybe token refresh failed (which is now handled internally by Home Assistant).

When the sensor stops updating, can you try if you can still do things like starting programs or switching on/off? I'm asking because these trigger direct calls to the API rather than listening to a stream, so if they don't work either, it indicates a permission problem or the API being down.

I will check that tomorrow. Enabling debug logging on my raspberry will lock the SD card down and make the whole system nearly unusable, maybe we can determine the problem without that.

Regards Richard

DavidMStraub commented 4 years ago

OK thanks in advance! (In principle I think one can enable debug logging for individual components, but I never actually tried.)

darootler commented 4 years ago

OK thanks in advance! (In principle I think one can enable debug logging for individual components, but I never actually tried.)

Yes, thats possible. If we need it i will try that.

Regards Richard

darootler commented 4 years ago

I just checked the entities and they are "out of sync" now:

image

image

Is there a way to help troubleshoot the problem? There are no warnings or errors according to home connect in the logs.

Regards Richard

DavidMStraub commented 4 years ago

Did you try if you can still switch power or start programs after they go out of sync? If no, does that lead to something in the log? If yes, I think it's defintely due to SSE dropping connection, that would be good to know.

darootler commented 4 years ago

Did you try if you can still switch power or start programs after they go out of sync? If no, does that lead to something in the log? If yes, I think it's defintely due to SSE dropping connection, that would be good to know.

I am able to start the dishwasher but that does not update the sensors. What is SSE?

Regards Richard

DavidMStraub commented 4 years ago

Thanks! That's indeed useful. SSE (server sent events) is a stream that allows the API to push events rather than requiring polling. Home Assistant opens a separate thread for each HC device and listens to events on these streams, but it seems the connection dies and is not restarted.

darootler commented 4 years ago

I see. Are you able to reproduce that behavior? Is there anything i can do?

Regards Richard

DavidMStraub commented 4 years ago

Yes, I think I have also run into this problem now. I will have to look at the SSE implementation in the homeconnect Python library and see if I can find a bug there. I'll update you when there is something new to test.

lancer73 commented 4 years ago

Now I am completely green with respect to this code (been using HA for two weeks now), but if nothing is going over the line, the connection will get killed by a router doing NAT.

DavidMStraub commented 4 years ago

Oops. As stated in the commit message, this is an attempted fix. I'll now update the HA component with this and would be grateful about testing.

darootler commented 4 years ago

Will this potential fix be part of a release or do i have to update the custom component manually?

Regards Richard

DavidMStraub commented 4 years ago

I will publish a release later today.

ghost commented 4 years ago

Still same issue when used in HA.

Br Reidar

lancer73 commented 4 years ago

I had the impression that before te latest changes, with HA 103.5 it was better. Everything was still working after 24H.

DavidMStraub commented 4 years ago

I don't see how it could have gotten worse. So what are you seeing now? Any errors in the log? Do all your appliance stop updating at the same time? (Each of them is listened to in a separate thread, so I would expect them not to fail simultaneously.)

darootler commented 4 years ago

I can confirm that this problem still persists.

Regards Richard

DirkSDT commented 4 years ago

With 104.3 HA stopps updating after 20 - 24 hours The Log shows this: homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/sseclient.py", line 72, in __next__ homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/api.py", line 217, in _listen homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/sseclient.py", line 81, in __next__ homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/sseclient.py", line 51, in _connect

DavidMStraub commented 4 years ago

Thanks! Does the traceback continue? I would be curious which exception it raises.

DirkSDT commented 4 years ago

homeassistant | Exception in thread Thread-3: homeassistant | Traceback (most recent call last): homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/sseclient.py", line 72, in next homeassistant | next_chunk = next(self.resp_iterator) homeassistant | StopIteration homeassistant | homeassistant | During handling of the above exception, another exception occurred: homeassistant | homeassistant | Traceback (most recent call last): homeassistant | File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner homeassistant | self.run() homeassistant | File "/usr/local/lib/python3.7/threading.py", line 870, in run homeassistant | self._target(*self._args, self._kwargs) homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/api.py", line 217, in _listen homeassistant | for event in sse: homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/sseclient.py", line 81, in next homeassistant | self._connect() homeassistant | File "/usr/local/lib/python3.7/site-packages/homeconnect/sseclient.py", line 51, in _connect homeassistant | self.resp = requester.get(self.url, stream=True, self.requests_kwargs) homeassistant | File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 546, in get homeassistant | return self.request('GET', url, **kwargs) homeassistant | File "/usr/local/lib/python3.7/site-packages/requests_oauthlib/oauth2_session.py", line 478, in request homeassistant | url, http_method=method, body=data, headers=headers homeassistant | File "/usr/local/lib/python3.7/site-packages/oauthlib/oauth2/rfc6749/clients/base.py", line 198, in add_token homeassistant | raise TokenExpiredError() homeassistant | oauthlib.oauth2.rfc6749.errors.TokenExpiredError: (token_expired)

DavidMStraub commented 4 years ago

Oooh. That might be why I wasn't able to understand this. It's not actually a problem with the SSE stream, it's a problem with auth! Thanks a lot for this hint!

Now the question is why the token is not refreshed. I thought that HA internally takes care of this but maybe I misunderstood. I will ask the core devs.

DavidMStraub commented 4 years ago

So here is my theory: Home Assistant refreshes the tokens (otherwise I wouldn't be able to switch on/off my devices several days after HA startup), but this is not propagated to the threads that listen to the SSE stream, since they are supplied with a copy of the Auth instance that becomes stale after 24h.

So a solution would probably either be catching the TokenExpiredError and restarting the thread and/or restarting the thread on token refresh.

If anyone has a more educated view on this I'd be grateful.

DavidMStraub commented 4 years ago

I just released homeconnect v0.40 with a possible fix and this is now used in the master branch of the Home Assistant integration. I just restarted my own instance, I'll know in 24h hours whether it works... would be curious about feedback from others as well.

darootler commented 4 years ago

I installed the "master" branch via HACS and unfortunately the problem still persists.

Regards Richard

DavidMStraub commented 4 years ago

Thanks, I can confirm 😕

DirkSDT commented 4 years ago

It might be a new issue. The log I posted in https://github.com/DavidMStraub/homeconnect/issues/3#issuecomment-581589643 happened 20 - 24 hours after a restart of homeassistant.( with version 104.3) Since I have updated to 105.x homeconnect stops updating between 50 to 120 minutes after restarting homeassistant - I can't see anything in the logs: But going back to 104.3 dosn't change anything - homeconnect stops updating 50- 120 minutes after restarting homeassistant.

dvbit commented 4 years ago

Same Issue over here unfortunately. Dishwasher and Hoven

DavidMStraub commented 4 years ago

That it stops after 50-120 minutes I find very surprising. Is this observed by others as well? I just tested on an otherwise empty installation of HA 0.105.5 and all the devices are still reporting state updates after more than 9 hours.

lancer73 commented 4 years ago

It changes, I’ve had cases where it was still working after 24h, but also cases where it stopped within 50 mins.

DirkSDT commented 4 years ago

Thats why I wrote that it might be a new issue. The switches work correct, I'm able to switch on the dishwasher, but the state reported in HA remains off. I will try to make a fresh installation with no other integrations.

DavidMStraub commented 4 years ago

OK thanks. I just added some more logging to the SSE client module, released this as v0.4.1 and bumped the dependency in the homeassistant-homeconnect master branch. Hopefully this makes debugging easier.

DavidMStraub commented 4 years ago

OK, I think we're making progress. With the new logging, I got

2020-02-27 08:46:38 ERROR (Thread-4) [homeconnect.sseclient] 
2020-02-27 08:46:38 ERROR (Thread-3) [homeconnect.sseclient] 
2020-02-27 08:46:38 ERROR (Thread-5) [homeconnect.sseclient] 
2020-02-27 08:46:39 INFO (Thread-4) [homeconnect.sseclient] Connecting ...
2020-02-27 08:46:39 INFO (Thread-3) [homeconnect.sseclient] Connecting ...
2020-02-27 08:46:39 INFO (Thread-5) [homeconnect.sseclient] Connecting ...
2020-02-27 08:46:39 ERROR (Thread-5) [homeconnect] Token expired in event stream for device Kochfeld
2020-02-27 08:46:39 ERROR (Thread-3) [homeconnect] Token expired in event stream for device Trockner
2020-02-27 08:46:39 INFO (Thread-5) [homeconnect.sseclient] Connecting ...
2020-02-27 08:46:39 ERROR (Thread-4) [homeconnect] Token expired in event stream for device Backofen
2020-02-27 08:46:39 INFO (Thread-3) [homeconnect.sseclient] Connecting ...
2020-02-27 08:46:39 INFO (Thread-4) [homeconnect.sseclient] Connecting ...

but nothing from the event stream after that. Later, when switching on/off, I got

2020-02-27 19:12:25 DEBUG (SyncWorker_15) [custom_components.homeconnect.switch] tried to switch on Backofen
2020-02-27 19:12:25 WARNING (SyncWorker_15) [homeconnect] Token expired.
2020-02-27 19:12:27 DEBUG (SyncWorker_15) [custom_components.homeconnect.api] Entity update triggered on <Entity Backofen: off>

From this I conclude that

Something is apparently wrong with multithreading & refreshing the tokens, but I still don't know what.

DavidMStraub commented 4 years ago

If anyone can confirm that this is the problem they are seeing too, that would be helpful.

DavidMStraub commented 4 years ago

OK, I'm optimistic that my last commit actually fixes the problem! But I didn't have time to test is myself yet. I released a new homeconnect version and bumped the homeassistant-homeconnect dependency to it in master. Please give it a try and let me know in 24h if it still works! :pray:

Krocko commented 4 years ago

@DavidMStraub can you make an update in hacs?

DavidMStraub commented 4 years ago

You can update by selecting master in HACS, right? I'd release a proper version once the fix is confirmed.

Krocko commented 4 years ago

Ah cool! Didn’t know that.

darootler commented 4 years ago

You can update by selecting master in HACS, right? I'd release a proper version once the fix is confirmed.

Yes, that‘s possible. I just installed the master branch and will post feedback here.

Regards Richard

Krocko commented 4 years ago

Same problem after 15 hours.

darootler commented 4 years ago

I am receviving the following errors:

Log Details (ERROR)
Logger: homeconnect.sseclient
First occured: 8:58:26 AM (2 occurences)
Last logged: 8:58:26 AM

Cannot refresh token in event stream.
Logger: homeconnect.sseclient
First occured: 8:58:25 AM (2 occurences)
Last logged: 8:58:25 AM

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/homeconnect/sseclient.py", line 86, in __next__
    next_chunk = next(self.resp_iterator)
StopIteration

Regards Richard

DavidMStraub commented 4 years ago

Thanks a lot, I see my mistake now. We're close.

DavidMStraub commented 4 years ago

OK, next try... https://github.com/DavidMStraub/homeassistant-homeconnect/commit/921a2e64cbd1c9e0fe04de7e61c6a2aa7d4e7c29

Krocko commented 4 years ago

Now I got this error:

Error setting up entry Configuration.yaml for homeconnect
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 685, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 425, in _error_catcher
    yield
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 752, in read_chunked
    self._update_chunk_length()
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 689, in _update_chunk_length
    raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 750, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 560, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 781, in read_chunked
    self._original_response.close()
  File "/usr/local/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 443, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 215, in async_setup
    hass, self
  File "/config/custom_components/homeconnect/__init__.py", line 77, in async_setup_entry
    await update_all_devices(hass, entry)
  File "/config/custom_components/homeconnect/__init__.py", line 111, in update_all_devices
    await hass.async_add_executor_job(device_dict["device"].initialize)
  File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/config/custom_components/homeconnect/api.py", line 93, in initialize
    self.appliance.get_settings()
  File "/usr/local/lib/python3.7/site-packages/homeconnect/api.py", line 318, in get_settings
    settings = self.get("/settings")
  File "/usr/local/lib/python3.7/site-packages/homeconnect/api.py", line 256, in get
    return self.hc.get("{}/{}{}".format(ENDPOINT_APPLIANCES, self.haId, endpoint))
  File "/usr/local/lib/python3.7/site-packages/homeconnect/api.py", line 78, in get
    res = self.request("get", endpoint)
  File "/usr/local/lib/python3.7/site-packages/homeconnect/api.py", line 69, in request
    return getattr(self._oauth, method)(url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests_oauthlib/oauth2_session.py", line 516, in request
    method, url, headers=headers, data=data, **kwargs
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 686, in send
    r.content
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 828, in content
    self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 753, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

Edit: Ok. A second restart of ha solves this.

Krocko commented 4 years ago

The problem persists for the on/off switch. The coffee program switches and the progress sensor are still working.

Edit: also the program switches and the sensor stopped working now.

DavidMStraub commented 4 years ago

Can you please post your log?

duftstenen commented 4 years ago

I can confirm same issue here. I would love to send some log to you, but how do I enable the debug logging that you need? My HA.log has different exception for other components, but none for your HC.

And what log-level will you prefer?

DavidMStraub commented 4 years ago

Are you sure you are using the most recent version? homeassistant-homeconnect from master commit 921a2e6 that was committed 26 hours ago and depends on homeconnect v0.4.3? It should emit errors in the log that are displayed even if you don't have debug logging.

I have this in my configuration.yaml in addition:

logger:
  default: info
  logs:
    homeassistant.components.homeconnect: debug
    custom_components.homeconnect: debug
duftstenen commented 4 years ago

Yup. HACS says - Installed: 921a2e6. And I restarted afterwards. I have added your logger example and restarting HA again. Thanks.

DavidMStraub commented 4 years ago

I can confirm that it stopped working also for me and there is no error in the log, so it cannot be due to the token this time. Now I'm really about to give up. There is no exception raised, the stream thinks it's still connected but doesn't read any events. How can this be worked around? I have no idea.