gravitational / teleport

Requiring new principals or new DNS names halfway through a CA rotation causes a restart loop #11149

Open espadolini opened 2 years ago

espadolini commented 2 years ago

Description

If the configuration changes while the cluster is undergoing a CA rotation, in such a way that the stored credentials need to be reissued with additional principals or DNS names (as per the checkServerIdentity call in (*TeleportProcess).rotate(), at time of writing c167bb46db0c9a25e269060be17d8fd5a03d7606), one of two things happens. If the cluster is in init (or in standby while the rotation is marked in_progress, which shouldn't really happen), the certificate is not reissued with the new names, with whatever consequences that causes, likely a failure in TLS routing. Otherwise, the process triggers a restart without updating the current identity with the required additional names; since the current identity was not updated, it keeps reissuing the certificates and restarting.
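The restart loop can be illustrated with a small, self-contained model. This is only a sketch and not Teleport's actual code: missingNames and simulateRestarts are hypothetical helpers, the host names are made up, and the real check lives in checkServerIdentity.

```go
package main

import (
	"fmt"
	"sort"
)

// missingNames returns the entries of required that are absent from have.
// Hypothetical stand-in for the comparison done by checkServerIdentity.
func missingNames(have, required []string) []string {
	present := make(map[string]struct{}, len(have))
	for _, n := range have {
		present[n] = struct{}{}
	}
	var missing []string
	for _, n := range required {
		if _, ok := present[n]; !ok {
			missing = append(missing, n)
		}
	}
	sort.Strings(missing)
	return missing
}

// simulateRestarts models a process that reissues certificates and restarts
// whenever the current identity lacks a required name. If updateCurrent is
// false, the current identity is never updated after the reissue, so the same
// names are found missing on every start. It returns the number of restarts
// performed, capped at limit.
func simulateRestarts(current, required []string, updateCurrent bool, limit int) int {
	// Work on a copy so the caller's slice is never modified.
	identity := append([]string(nil), current...)
	restarts := 0
	for restarts < limit {
		missing := missingNames(identity, required)
		if len(missing) == 0 {
			break // identity is complete, no further restart needed
		}
		if updateCurrent {
			identity = append(identity, missing...)
		}
		restarts++
	}
	return restarts
}

func main() {
	// Hypothetical names: the second one stands for a newly configured kube_listen_addr.
	current := []string{"proxy.example.com"}
	required := []string{"proxy.example.com", "kube.example.com"}

	fmt.Println("identity not updated:", simulateRestarts(current, required, false, 5))
	fmt.Println("identity updated:    ", simulateRestarts(current, required, true, 5))
}
```

With the identity left untouched the model restarts until it hits the cap; once the reissued names are recorded in the current identity, a single restart is enough.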

A more fundamental issue is that we have no way to ask for a certificate reissue against the old CA, which is what we'd actually need in order to reenter the update_clients state correctly. In that case the best we can do is probably to log a warning saying "additional principals/SANs required but the cluster is currently rotating credentials, services may malfunction until the rotation is completed or rolled back". The other states just need to handle the case in which we want to reissue certs for non-rotation-related reasons.
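One possible reading of that proposal, as a sketch only: the action type and the decide function below are hypothetical and do not exist in Teleport, and the phase is passed as a plain string for illustration. In update_clients with missing names the process only warns; in any other phase it reissues without treating the reissue as part of the rotation.

```go
package main

import "fmt"

// action is a hypothetical description of what the process should do.
type action int

const (
	actionNone    action = iota // identity already has every required name
	actionReissue               // reissue certs for non-rotation-related reasons
	actionWarn                  // cannot reissue against the old CA, warn only
)

// Warning text taken from the proposal in this issue.
const rotationWarning = "additional principals/SANs required but the cluster is currently rotating credentials, services may malfunction until the rotation is completed or rolled back"

// decide picks an action from the rotation phase and the list of names that
// the current identity is missing. Assumed policy, not Teleport's behavior.
func decide(phase string, missing []string) (action, string) {
	if len(missing) == 0 {
		return actionNone, ""
	}
	if phase == "update_clients" {
		// A reissue against the old CA is not possible, so avoid the
		// reissue-and-restart cycle and surface the problem instead.
		return actionWarn, rotationWarning
	}
	return actionReissue, ""
}

func main() {
	missing := []string{"kube.example.com"} // hypothetical SAN
	for _, phase := range []string{"standby", "init", "update_clients", "update_servers"} {
		act, msg := decide(phase, missing)
		fmt.Printf("%s: action=%d %s\n", phase, act, msg)
	}
}
```

Whether init should also reissue, as it does in this sketch, is an assumption; the issue only says that the other states need to handle reissuing for non-rotation-related reasons.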

Workarounds

Rolling back the rotation, or advancing from update_clients to update_servers (after restarting at least once in the update_clients state), stops the restarts and issues the correct certificate to the node.

Reproduction Steps

Run proxy_service without configuring kube_listen_addr, advance the cluster's CA rotation to update_clients, stop the proxy, change the configuration to include a kube_listen_addr, and start the proxy again.

espadolini commented 2 years ago

As experienced by @Valien, this issue also applies (in a slightly different way) when registering a new node to Teleport while a CA rotation is happening.