Open azdagron opened 2 years ago
Thank you for making this Andrew!
Our team is currently planning out the process of migrating our customers to use the new Agent ID path. Just wanted some clarifications on a few things if possible:
Thanks again!
Great questions @boyanguber!
When this becomes enforced in 1.4, does that mean that old agent IDs that don't conform to the new convention will not be able to attest/sync with SPIRE Server anymore? Or is it only applied to newly created agent ID paths?
I think all we plan to do as part of that release is ensure that newly attested nodes conform. We haven't discussed a timeline to enforce this for existing agents.
Will there be any auto-magic for nodes re-attesting under old formats to be auto-migrated to their new format?
This gets tricky. If we do decide to enforce this for existing agents, I can think of two mechanisms:
This is further complicated by plugins needing to be careful that they do TOFU checks on the old ID shape until agents have migrated. This complication exists independent of the choice to enforce the ID shape on existing agents.
I need to think about this a bit more. I think it's probably safe to say that, unless we come up with a really good migration story, we won't be enforcing this for existing agents any time soon, but I'll circle back with the maintainers and see what comes up.
In the meantime if you have any suggestions, we're all ears :)
So the maintainers huddled on this. I think this is the plan of action we'd like to follow:
We will also provide a migration guide for operators who have found themselves impacted by this enforcement. This will cover topics like TOFU, which is the most security sensitive outcome of this enforcement (i.e. having to check both the old and new ID shape for TOFU during node attestation until all agents are migrated).
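To make the TOFU concern concrete, here is a minimal sketch of a check that rejects attestation if the node was previously attested under either ID shape. It is not SPIRE's implementation: the `AttestedStore` interface, `mapStore`, and `CheckTOFU` are hypothetical names introduced for illustration, and how a plugin derives the old and new IDs for a node is left out.

```go
package main

import "fmt"

// AttestedStore is a hypothetical lookup of agent IDs that have already
// been attested. It is an assumption for this sketch, not a SPIRE API.
type AttestedStore interface {
	HasAttested(agentID string) bool
}

// mapStore is a trivial in-memory AttestedStore used for illustration.
type mapStore map[string]bool

func (m mapStore) HasAttested(agentID string) bool { return m[agentID] }

// CheckTOFU returns an error if the node has already been attested under
// either the old or the new ID shape. Checking only the new shape would
// let a node that attested under the old shape attest a second time.
func CheckTOFU(store AttestedStore, oldID, newID string) error {
	for _, id := range []string{oldID, newID} {
		if store.HasAttested(id) {
			return fmt.Errorf("TOFU violation: node already attested as %q", id)
		}
	}
	return nil
}
```

Once every agent has migrated, the old-shape lookup could be dropped, which is why the check takes both IDs explicitly rather than hiding the legacy case.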
As far as timeline, we're looking at roughly 6 months until 1.4.0 and another 3 months after that for 1.5.0.
Notes from discussion today:
I need to work on this in conjunction with #3527. Even if I complete the PR in the 1.8.1 timeframe, we should not land it until 1.9.0.
It is clear to me now that these will never be high enough priority among my competing work for me to complete them. Please re-label as help wanted. I apologize to the community for holding these so long and not completing them.
Hello @azdagron,
I would be happy to work on this feature after I finish the retry bootstrap one.
SPIRE has assumed that node attestors would produce agent IDs that conform to the following convention:
(e.g. spiffe://example.org/spire/agent/join_token/21B6D625-CCF3-49E1-8E7C-812B3F55B3CB)
Although this convention is not required for agent authorization to take place safely, it does aid in debuggability, auditability, and the ability to prevent misconfiguration (e.g. cannot accidentally assign a workload an agent ID since we can validate that the requested ID is not reserved). Being able to answer the question "was this ID produced by a node attestor" is also important for at-a-glance understanding of logs.
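As a rough illustration of the two checks described here, the sketch below tests whether an ID has the conventional agent shape and whether it falls in the reserved agent namespace. The shape `/spire/agent/<node attestor>/<suffix>` is an assumption inferred from the example ID, and the function names are hypothetical; this uses only the Go standard library and is not SPIRE's own validation code.

```go
package main

import (
	"net/url"
	"strings"
)

// agentPathPrefix is the namespace seen in the example agent ID.
// Treating it as the reserved prefix is an assumption of this sketch.
const agentPathPrefix = "/spire/agent/"

// IsReservedForAgents reports whether id is a SPIFFE ID whose path falls
// under the agent namespace, i.e. an ID a workload should not be given.
func IsReservedForAgents(id string) bool {
	u, err := url.Parse(id)
	if err != nil || u.Scheme != "spiffe" || u.Host == "" {
		return false
	}
	return strings.HasPrefix(u.Path, agentPathPrefix)
}

// IsConventionalAgentID reports whether id has the assumed shape
// spiffe://<trust domain>/spire/agent/<node attestor>/<suffix>.
// The suffix may itself contain further path segments.
func IsConventionalAgentID(id string) bool {
	if !IsReservedForAgents(id) {
		return false
	}
	u, _ := url.Parse(id) // already parsed successfully above
	rest := strings.TrimPrefix(u.Path, agentPathPrefix)
	parts := strings.SplitN(rest, "/", 2)
	return len(parts) == 2 && parts[0] != "" && parts[1] != ""
}
```

A registration path could call `IsReservedForAgents` to refuse workload entries in the agent namespace, while log tooling could use `IsConventionalAgentID` to answer "was this ID produced by a node attestor" at a glance.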
Unfortunately, SPIRE does not currently enforce this convention, and cannot due to backward-compatibility concerns. However, a deprecation warning is introduced with #2694. This should give us the ability to enforce this beginning with SPIRE 1.4 at the earliest (since 1.3 will be the first minor release to include the deprecation warning).
This issue tracks the enforcement of the agent ID convention.