Open faali1 opened 2 months ago
https://spiffe.slack.com/archives/CBNCC2V17/p1724959960812299
Some context: this is already possible for re-attestable attestors like k8s_psat using the new emptyDir config in the hardened Helm charts. However, for non-re-attestable attestors like aws_iid, this is not possible, as the spire-agent needs to be persistent.
I agree that we need to provide at least documentation on the best way to wipe agent state in Kubernetes. If that becomes too hard, we'll consider adding a command as a last resort (we're worried about that command being invoked accidentally).
We use SPIRE agents in our k8s clusters that connect to SPIRE servers. We have multiple trust domains and, sometimes, users create a cluster and put it in the wrong trust domain (accidentally, by mistake, etc.). To fix this, we have to perform multiple steps:
x509svid: could not verify leaf certificate: x509: certificate signed by unknown authority
We currently solve this by scaling the nodes down and then back up, which resets the SPIRE agent's data. I propose adding an option to the SPIRE agent CLI that resets the data directory (in effect, resets the SPIRE agent) so it can connect to the correct server.
Another benefit we can see is that we started off by keeping our keys on disk. If/when we move to KMS to manage our keys, our root signing key will change and we will have to deal with the above error. It's a lot nicer to ask teams to run a CLI command (while keeping their other services running) rather than doing a full node scale down/up on all clusters.
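To make the proposal a bit more concrete, here is a rough sketch of what such a reset could do. This is not an existing SPIRE command. It assumes the agent keeps all of its state under a single directory (the one set by `data_dir` in the agent config) and that the agent is stopped before the wipe and restarted afterwards. The function name `wipe_agent_state` and the `confirm` parameter are hypothetical. The confirmation argument is there because of the concern about the command being invoked accidentally: the caller has to repeat the exact directory path before anything is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def wipe_agent_state(data_dir: Path, confirm: str) -> int:
    """Delete everything inside data_dir but keep the directory itself.

    Hypothetical helper, not part of SPIRE. `confirm` must equal the
    resolved path of data_dir, as a guard against accidental invocation.
    Returns the number of top-level entries removed.
    """
    data_dir = data_dir.resolve()
    if confirm != str(data_dir):
        raise ValueError("confirmation does not match data directory; refusing to wipe")
    if not data_dir.is_dir():
        raise NotADirectoryError(str(data_dir))

    removed = 0
    for entry in data_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: remove it and everything below it.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove just the entry.
            entry.unlink()
        removed += 1
    return removed


if __name__ == "__main__":
    # Demo against a throwaway directory, never a real agent data directory.
    with tempfile.TemporaryDirectory() as tmp:
        demo_dir = Path(tmp).resolve()
        (demo_dir / "state.bin").write_text("placeholder")
        (demo_dir / "sub").mkdir()
        (demo_dir / "sub" / "key").write_text("placeholder")
        count = wipe_agent_state(demo_dir, confirm=str(demo_dir))
        print(f"removed {count} entries from {demo_dir}")
```

Keeping the directory itself (rather than deleting and recreating it) means ownership and permissions on the mount point stay untouched, which matters when the data directory is a hostPath or volume mount. Whether the agent can then re-attest still depends on the attestor, as noted above for aws_iid.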