gravitational / teleport


Headless watchers in auth service can get into bad state #42680

Open ravicious opened 3 months ago

ravicious commented 3 months ago

I ran into two issues with headless auth. The first one manifested as tsh hanging while waiting for approval, even after I had approved the headless auth request through the Web UI. The second one was Connect not showing a modal when a new request was created.

I figured that the issue was related to some problem with watchers in the auth service. @gzdunek reported running into the same problem. A minute later I got a message from him saying that restarting teleport helped. I restarted my cluster and everything started working again.

Searching for "headless" in the logs doesn't bring up anything interesting, though I didn't have debug logs enabled. Searching for "watcher" shows a ton of log lines like the following. idk if they're related to headless auth.

2024-06-10T12:11:15+02:00 WARN [REVERSE:M] Re-init the cache on error: watcher is closed: error reading from server: EOF cache/cache.go:1131
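
To see whether these warnings cluster around a particular moment, a rough sketch like the one below could tally them per minute and component. It assumes every line has the same shape as the single sample above (timestamp, level, bracketed component, message); `count_watcher_warnings` and `LINE_RE` are hypothetical names, not part of Teleport.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the one sample line above:
#   <ISO timestamp> <LEVEL> [<COMPONENT>] <message>
# The timestamp group keeps only minute precision so counts bucket per minute.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\S*\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<component>[^\]]+)\]\s+(?P<msg>.*)$"
)


def count_watcher_warnings(lines):
    """Tally 'watcher is closed' messages per (minute, component)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        # Lines that do not match the assumed shape are skipped silently.
        if m and "watcher is closed" in m.group("msg"):
            counts[(m.group("ts"), m.group("component"))] += 1
    return counts
```

Passing an open log file (any iterable of lines) and printing `counts.most_common()` would show which minutes and components produced the most re-inits, which may help line the warnings up with the time a headless request got stuck.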

Any further tips on how to debug this when it happens again?

ravicious commented 3 months ago

@gzdunek had a good point: could it just be related to the computer going to sleep and the watchers not resuming correctly on wake-up?