home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
73.03k stars 30.55k forks source link

"Authentication expired" messages #122643

Open LordShadowen opened 3 months ago

LordShadowen commented 3 months ago

The problem

Every so often "Repair" warnings for the bridges show up. "Authentication expired for [bridge name]"

I've observed that the locks seem to continue to operate normally and respond to commands from HA. I've also observed that the token is not in fact expired - simply resubmitting the stored token with no changes dismisses the message and things continue to work as before.

I suspect that this happens after firmware updates pushed by Tedee, but I can't confirm that (could be coincidental). Perhaps a communication errors occur during the update, which is interpreted by the integration as token expired, and no further retries are done to dismiss the warning. But I'm just guessing here....

What version of Home Assistant Core has the issue?

2024.7.3

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Container

Integration causing the issue

tedee

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

home-assistant[bot] commented 3 months ago

Hey there @patrickhilker, @zweckj, mind taking a look at this issue as it has been labeled with an integration (tedee) you are listed as a code owner for? Thanks!

Code owner commands Code owners of `tedee` can trigger bot actions by commenting: - `@home-assistant close` Closes the issue. - `@home-assistant rename Awesome new title` Renames the issue. - `@home-assistant reopen` Reopen the issue. - `@home-assistant unassign tedee` Removes the current integration label and assignees on the issue, add the integration domain after the command. - `@home-assistant add-label needs-more-information` Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue. - `@home-assistant remove-label needs-more-information` Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


tedee documentation tedee source (message by IssueLinks)

zweckj commented 3 months ago

@LordShadowen I heard about that multiple times now, but am not having this issue. The code is already retrying the calls, there is not much more I can do. I think it might be related to the time of the HA instance not being accurate. I also would need debug logs of when it happens (shortly before)

LordShadowen commented 3 months ago

I've enabled debug on homeassistant.components.tedee and will let it stay like that. If it happens again, I'll let you know what shows up on the logs.

LordShadowen commented 3 months ago

Ok, it happened again - there was a firmware update during the night, and this morning I have the "Authentication expired" warnings. At this point I'm fairly certain they are related.

But unfortunately, I have no logs at all :( For some reason, my home-assistant.log file seems to have stopped - the last line was written about 3 hours before the update happened (really bad luck). Since then nothing. Even now, HA is working fine, but nothing is getting logged.... I have no idea why - the file is about 25MB. It's big, but not huge. Is there some sort of max size on log files that I don't know about?

I could use some advice... I can restart HA and wait for it to happen again, but if the log will keep stopping then it won't help.

dxfoxd commented 3 months ago

I had the same issue and after some checks it seems that the Bridge API disabled itself. After switching it back on a reload of the integration nor a HA restart did not resolve the issue. The integration docs of Tedee did not include a description how to ck or update the IP or token of the bridge. I then installed the integration (without deleting the previous one) and I was able to fill in the IP and token. The new entities of both locks and the bridge appeared again. I had to delete the other locks manually. The only issue now is that my dashboard had a button to open the locks: button.lock_xxxx_frontdoor_unlatch and button.lock_xxxx_frontdoor_unlock_unlatch (redacted) but both entities did not reappear. I can open the door by clicking the default lock card but this means that now two mouse clicks are needed for the same event. How can i get the unlock buttton(s) back again?

Art

LordShadowen commented 2 months ago

I had it happened again last night - there was a firmware update, and this morning I have the repair warning. Unfortunately I didn't have the debug on, but this is what I got from the logs:

2024-08-08 03:34:01.990 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Error fetching tedee data: Error while updating data: Error while calling local API endpoint /lock.

Then nothing for a couple of hours....

And then:

2024-08-08 05:06:32.663 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Error fetching tedee data: Error while updating data: Error while calling local API endpoint /lock.
2024-08-08 05:10:21.679 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Error fetching tedee data: Error while updating data: Error while calling local API endpoint /lock.
2024-08-08 05:19:41.447 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Authentication failed while fetching tedee data: Authentication failed. Local access token is invalid
2024-08-08 05:37:27.317 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Authentication failed while fetching tedee data: Authentication failed. Local access token is invalid
2024-08-08 05:41:27.892 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Error fetching tedee data: Error while updating data: Error while calling local API endpoint /lock.
2024-08-08 05:46:41.685 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Authentication failed while fetching tedee data: Authentication failed. Local access token is invalid
2024-08-08 05:57:50.403 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Authentication failed while fetching tedee data: Authentication failed. Local access token is invalid
2024-08-08 06:05:22.440 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Authentication failed while fetching tedee data: Authentication failed. Local access token is invalid
2024-08-08 06:13:50.819 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Error fetching tedee data: Error while updating data: Error while calling local API endpoint /lock.
2024-08-08 06:35:30.808 ERROR (MainThread) [homeassistant.components.tedee.coordinator] Error fetching tedee data: Error while updating data: Error while calling local API endpoint /lock.

I will try to have debug on for next time

zweckj commented 2 months ago

tbh debug logs won't really help in that case. Somehow your bridge updating appear to mess up authentication, which is weird, because I do not have this issue. My update installed fine today, without crashing the integration. Also, I'm not really sure how to solve this tbh, as you can already see that the integration is retrying a couple of times before giving up.

LordShadowen commented 2 months ago

I've since updated to 2024.8.0 in the hopes that somehow it would fix it, but it has not.

This evening it happened again (and again there were firmware updates, so they are very, very likely related).

A couple of interesting observations:

LordShadowen commented 2 months ago

Still haven't given up on this one :)

Yesterday I got a notification from the Tedee app that there was a new firmware available. So I knew it would happen during the night, and I turned on Debug before going to bed, just in case.

Sure enough, around 4:24AM the update must have started, since I see a few exceptions of 'pytedee_async.tedee_client' (with the message Connect call failed). As expected, eventually they stopped and everything went back to normal (as you described).

However, very interestingly, hours before that, at around 1:16am, I found this in the debug log:

2024-08-20 01:16:52.141 DEBUG (MainThread) [pytedee_async.tedee_client] Syncing locks
2024-08-20 01:16:52.141 DEBUG (MainThread) [pytedee_async.tedee_client] Getting locks from Local API...
2024-08-20 01:16:52.579 DEBUG (MainThread) [homeassistant.components.tedee.coordinator] Manually updated tedee data
2024-08-20 01:16:53.823 DEBUG (MainThread) [homeassistant.components.tedee.coordinator] Manually updated tedee data
2024-08-20 01:16:54.357 DEBUG (MainThread) [pytedee_async.tedee_client] Local API authentication failed.

This Local API authentication failed message only appears this one time in the entire log file spanning the entire night.

So, if I had to guess, a few hours before the firmware happens, Tedee is doing something with the bridge, which makes it so that for a very a small amount time it refuses the authentication API call (basically, it returns HTTPStatus.UNAUTHORIZED). When this happens, it probably triggers that "Repair" message in HASS. On the very next retry, it works again, so the integration overall is fine, but once triggered, the Repair message stays in the UI.

Does this make sense?

I have no idea why the bridge returns UNAUTHORIZED, but it seems to be in that state for such a short amount of time that it would be quite easy to miss - it must happen around the same time the periodic check from the integration happens. I probably get this issue much more easily then you do because I have several bridges, so my odds go up a lot that at least one will be in that state at the right time.

zweckj commented 2 months ago

When the repair pops it is already out of retries and HA can’t automatically recover from that state. I still don’t really understand what’s happening and why it’s only happening to you. Did you already try resyncing your HA hosts clock?

LordShadowen commented 1 month ago

My HA runs on a docker container and the host is synchronizing with an NTP server (specifically Google's time server). A clock desync shouldn't be the issue...

Anything else I can check?

Btw, it has happened again since last time. And once more, it happened several hours before a firmware update. This reinforces my suspicion that something happens on the bridges in preparation for the firmware update, and while that happens, local API calls fail. I don't suppose you have a contact person from Tedee that you could check if this sounds plausible?

zweckj commented 1 month ago

Anything else I can check?

Check the bridge's logs is the only thing that comes to mind, but iirc that only allows you to collect logs over a very limited timespan...

I don't suppose you have a contact person from Tedee that you could check if this sounds plausible?

I do have a contact there, but without concrete repro steps it's a bit difficult.

LordShadowen commented 1 month ago

Check the bridge's logs is the only thing that comes to mind, but iirc that only allows you to collect logs over a very limited timespan...

Yeah, that's not feasible. You must be within BT range and for a maximum of 15 minutes, from what I read.

I do have a contact there, but without concrete repro steps it's a bit difficult.

Just ask them if they do some sort of "pre-check" before the firmware updates, or some sort of maintenance in the hours prior to it, that would affect API calls. That might ring a bell :)