spiffe / spire

The SPIFFE Runtime Environment
https://spiffe.io
Apache License 2.0

CLI Option to delete SPIRE agent directory/reset existing creds #5442

Open faali1 opened 2 months ago

faali1 commented 2 months ago

We use SPIRE agents in our k8s clusters that connect to SPIRE servers. We have multiple trust domains and, sometimes, users create a cluster and put in the wrong trust domain (accidentally or because they were mistaken). To fix this, we have to perform multiple steps:

The current way we solve this is by doing a node scale down and then up. This resets the data for the SPIRE agent. I propose adding an option to the SPIRE agent CLI that essentially resets the data directory/resets the SPIRE agent so it can connect to the correct server.

Another benefit we can see: we started off by keeping our keys on disk. If/when we move to KMS to manage our keys, our root signing key will change and we will have to deal with the above error. It's a lot nicer to ask teams to run a CLI command (while keeping their other services running) than to do a full node scale down/up on all clusters.

faali1 commented 2 months ago

https://spiffe.slack.com/archives/CBNCC2V17/p1724959960812299

Some context: this is already possible for re-attestable attestors like k8s_psat using the new emptyDir config in the hardened Helm charts. However, for non-re-attestable attestors like aws_iid, this is not possible, as the spire-agent needs to be persistent.

azdagron commented 2 months ago

I agree that we need to provide at least documentation on the best way to wipe agent state in Kubernetes. If that becomes too hard, we'll consider adding a command as a last resort (we're worried about that command being invoked accidentally).