home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

2021.5.2 : Starting cloud, not everything will be available until it is finished.... First HassOS reboot only #50392

Closed pergolafabio closed 3 years ago

pergolafabio commented 3 years ago

The problem

I just rebooted my HassOS, and now my instance is unable to connect through Nabu Casa; I receive the errors below. Rebooting HassOS again doesn't help. When I then restart only HA (accessing it locally, without a HassOS reboot), it runs without an issue and is connected to the cloud.

EDIT1: After some more testing: when I roll back to 2021.4.6 (ESXi snapshot), I don't have the issue and can reboot HassOS every time without a glitch. When I update to 2021.5.2, I get the error again.

EDIT2: Some more testing: I went back to 2021.5.2, then downloaded the code from 2021.4.6 and copied its "cloud" folder into custom components. Now it loads correctly; there is no issue with the older "cloud" folder, so something must be going on with the new version.

EDIT3: I installed the cloud integration from 2021.5.2 again and loaded it as a custom integration; then it fails again. My guess is that it's related to nabucasa, so I changed only the manifest file to load the older version, and now it is OK. So I think the issue is with the new nabucasa, 0.43.0?

"requirements": ["hass-nabucasa==0.42.0"],

2021-05-11 11:40:48 WARNING (MainThread) [homeassistant.loader] You are using a custom integration cloud which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant

What version of Home Assistant Core has the issue?

core-2021.5.2

What was the last working version of Home Assistant Core?

core-2021.4.6

What type of installation are you running?

Home Assistant OS 5.13

Integration causing the issue

nabucasa

Link to integration documentation on our website

https://www.home-assistant.io/integrations/cloud/

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2021-05-10 11:42:54 ERROR (MainThread) [hass_nabucasa.remote] Unexpected error in Remote UI loop
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/remote.py", line 379, in _certificate_handler
    if not await self.load_backend():
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/remote.py", line 137, in load_backend
    resp = await cloud_api.async_remote_register(self.cloud)
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/cloud_api.py", line 16, in check_token
    await cloud.auth.async_check_token()
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/auth.py", line 172, in async_check_token
    await self._async_renew_access_token()
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/auth.py", line 199, in _async_renew_access_token
    await self.cloud.run_executor(cognito.renew_access_token)
  File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.8/site-packages/pycognito/__init__.py", line 636, in renew_access_token
    self._set_tokens(refresh_response)
  File "/usr/local/lib/python3.8/site-packages/pycognito/__init__.py", line 708, in _set_tokens
    self.verify_token(tokens["AuthenticationResult"]["IdToken"], "id_token", "id")
  File "/usr/local/lib/python3.8/site-packages/pycognito/__init__.py", line 254, in verify_token
    raise TokenVerificationException(
pycognito.exceptions.TokenVerificationException: Your 'id_token' token could not be verified.
2021-05-10 11:42:56 ERROR (MainThread) [hass_nabucasa.iot] Unexpected error
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/iot_base.py", line 108, in connect
    await self._handle_connection()
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/iot_base.py", line 147, in _handle_connection
    await self.cloud.auth.async_check_token()
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/auth.py", line 172, in async_check_token
    await self._async_renew_access_token()
  File "/usr/local/lib/python3.8/site-packages/hass_nabucasa/auth.py", line 199, in _async_renew_access_token
    await self.cloud.run_executor(cognito.renew_access_token)
  File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.8/site-packages/pycognito/__init__.py", line 636, in renew_access_token
    self._set_tokens(refresh_response)
  File "/usr/local/lib/python3.8/site-packages/pycognito/__init__.py", line 708, in _set_tokens
    self.verify_token(tokens["AuthenticationResult"]["IdToken"], "id_token", "id")
  File "/usr/local/lib/python3.8/site-packages/pycognito/__init__.py", line 254, in verify_token
    raise TokenVerificationException(
pycognito.exceptions.TokenVerificationException: Your 'id_token' token could not be verified.

And afterwards:

2021-05-11 09:32:42 ERROR (MainThread) [homeassistant.setup] Setup of cloud is taking longer than 300 seconds. Startup will proceed without waiting any longer
2021-05-11 09:32:42 ERROR (MainThread) [homeassistant.setup] Unable to prepare setup for platform cloud.binary_sensor: Unable to set up component.
2021-05-11 09:32:42 ERROR (MainThread) [homeassistant.setup] Unable to prepare setup for platform cloud.stt: Unable to set up component.
2021-05-11 09:32:42 ERROR (MainThread) [homeassistant.setup] Unable to prepare setup for platform cloud.tts: Unable to set up component.

Additional information

No response

cogneato commented 3 years ago

Are you now missing the cloud option on the configuration page? If so, can you remove the hidden .cloud directory within your config directory and restart HA?

pergolafabio commented 3 years ago

Indeed, when Nabu Casa is not working, I don't see the cloud options when I access it locally. If I then restart HA, sometimes it connects fine and sometimes not; it's quite random.

Maybe the .cloud directory is corrupt? Is it safe to delete?

cogneato commented 3 years ago

Yes, go ahead and delete it and restart HA. You should get the cloud card back on the configuration page.

pergolafabio commented 3 years ago

OK, but how come it sometimes starts up fine and sometimes doesn't? What could be the cause?

pergolafabio commented 3 years ago

Deleting the .cloud dir doesn't help. It seems the issue only occurs if I do a full reboot of my HassOS: on the first start I see the errors. If I then just restart HA via the restart service, they are gone.

pergolafabio commented 3 years ago

Now seeing this:

2021-05-10 19:57:35 ERROR (SyncWorker_7) [hass_nabucasa.acme] Can't connect to ACME server: urn:acme:error:serverInternal :: The server experienced an internal error :: The service is down for maintenance or had an internal error. Check https://letsencrypt.status.io/ for more details.
2021-05-10 19:57:46 ERROR (SyncWorker_17) [hass_nabucasa.acme] Can't connect to ACME server: urn:acme:error:serverInternal :: The server experienced an internal error :: The service is down for maintenance or had an internal error. Check https://letsencrypt.status.io/ for more details.
2021-05-10 19:57:58 ERROR (SyncWorker_10) [hass_nabucasa.acme] Can't connect to ACME server: urn:acme:error:serverInternal :: The server experienced an internal error :: The service is down for maintenance or had an internal error. Check https://letsencrypt.status.io/ for more details.
2021-05-10 19:58:09 ERROR (SyncWorker_10) [hass_nabucasa.acme] Can't connect to ACME server: urn:acme:error:serverInternal :: The server experienced an internal error :: The service is down for maintenance or had an internal error. Check https://letsencrypt.status.io/ for more details.
2021-05-10 19:58:20 ERROR (SyncWorker_0) [hass_nabucasa.acme] Can't connect to ACME server: urn:acme:error:serverInternal :: The server experienced an internal error :: The service is down for maintenance or had an internal error. Check https://letsencrypt.status.io/ for more details.
2021-05-10 19:58:31 ERROR (SyncWorker_17) [hass_nabucasa.acme] Can't connect to ACME server: urn:acme:error:serverInternal :: The server experienced an internal error :: The service is down for maintenance or had an internal error. Check https://letsencrypt.status.io/ for more details.

cogneato commented 3 years ago

OK, so it looks like Let's Encrypt is down for maintenance, which means you'll have to wait for those new certificate files: https://letsencrypt.status.io/

pergolafabio commented 3 years ago

OK, the certificate issue is gone now, but the original problem remains.

It seems to happen only when I do a full HassOS reboot; then it always fails. When I restart the HA service afterwards, it's OK.

What could be the issue?

cogneato commented 3 years ago

No idea what is happening when you do a reboot. Are you using wifi?

pergolafabio commented 3 years ago

No, everything is wired (LAN), running on an ESXi server. All other platforms load correctly, except the cloud. Other cloud integrations also work correctly.

pergolafabio commented 3 years ago

After some more testing: when I roll back to 2021.4.6, I don't have the issue and can reboot HassOS every time without a glitch. When I update to 2021.5.2, I get the errors on the first cold reboot of HassOS.

pergolafabio commented 3 years ago

Some more testing: I went back to 2021.5.2, then downloaded the code from 2021.4.6 and copied its "cloud" folder into custom components. Now it loads correctly; there is no issue with the older "cloud" folder, so something must be going on with the new version.

pergolafabio commented 3 years ago

@balloob, sorry to ping you, but you are the owner of the cloud integration, right? Can you help me with this case?

Thanks in advance!

pergolafabio commented 3 years ago

I installed the cloud integration from 2021.5.2 again and loaded it as a custom integration; then it fails again. My guess is that it's related to nabucasa, so I changed only the manifest file to load the older version, and now it is OK. So I think the issue is with the new nabucasa, 0.43.0?

"requirements": ["hass-nabucasa==0.42.0"],
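
To double-check which dependency versions a copied integration actually pins, the manifest can be read programmatically. The following is only a minimal sketch: the helper name `pinned_requirements` and the trimmed example manifest are made up for illustration, and only the `requirements` line comes from this thread.

```python
import json


def pinned_requirements(manifest_text: str) -> dict:
    """Map package name -> pinned version for every 'name==version' requirement."""
    manifest = json.loads(manifest_text)
    pins = {}
    for requirement in manifest.get("requirements", []):
        # Only exact pins ("==") are reported; other specifiers are skipped.
        name, sep, version = requirement.partition("==")
        if sep:
            pins[name] = version
    return pins


if __name__ == "__main__":
    # Hypothetical, trimmed manifest; a real manifest.json has more keys.
    example = '{"domain": "cloud", "requirements": ["hass-nabucasa==0.42.0"]}'
    print(pinned_requirements(example))
```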

pergolafabio commented 3 years ago

The errors are coming from pycognito, and that component has indeed changed:

pycognito.exceptions.TokenVerificationException: Your 'id_token' token could not be verified.

https://github.com/NabuCasa/hass-nabucasa/pull/230

github-actions[bot] commented 3 years ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

pergolafabio commented 3 years ago

Solved by HassOS 6.1, so HassOS now gets its time synced via NTP.