Open ericsuhong opened 1 year ago
This is a must-have for multi-tenancy, where the cluster lifecycle is typically controlled by a platform team, but the federated credentials and managed identities are controlled by users/developer teams.
Without a static/predictable/recoverable OIDC issuer URL, if the platform team needs to recreate the cluster for any reason, the OIDC issuer URL would get rotated and cause a breaking change for users' workload identity federations.
Thanks for sharing your feedback and user scenario. There is a security risk with BYO (bring your own) OIDC issuer URLs. We are looking for a potential workaround.
Thanks for the attention, @CocoWang-wql. I don't think we necessarily need the ability to BYO issuer URL at cluster creation, as long as all federated credentials "find their way back" after a cluster recreation. I suppose there could be a couple of angles to approach this from; here are a few:
- [...] Microsoft.Authorization/OIDCIssuer that we can attach to an AKS cluster (or even better, attach it simultaneously to multiple AKS clusters). Needless to say, its lifecycle would need to be decoupled from the AKS cluster's lifecycle.
Thanks for the info. Would like to know more details:
From your description, I understand the pain point is that you need to update the OIDC URLs on all services after cluster re-creation.
The question here is: in the pod YAML, the only introduced parameter is the service account name.
IMO, after the OIDC URL changes, you only need to re-establish the federated identity credential; you don't need to update each pod YAML file, as the service account name doesn't change.
@illrill @ericsuhong
The problem is not with the Pod or Service Account. The problem is that the User-assigned Managed Identity, to which the Service Account is federated via a Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials resource, has a property issuer that requires the cluster's OIDC issuer URL (which is unpredictable).
{
"audiences": [
"api://AzureADTokenExchange"
],
"id": "/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<managed-identity>/federatedIdentityCredentials/<federated-credential>",
"issuer": "https://westeurope.oic.prod-aks.azure.com/<subscription>/<oidc-issuer-url>/",
"name": "<federated-credential>",
"resourceGroup": "<resource-group>",
"subject": "system:serviceaccount:<namespace>:<service-account>",
"systemData": null,
"type": "Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials"
}
Here's the scenario.
1. [...] federatedIdentityCredentials resource on their userAssignedIdentities resource. They have specified the current cluster's OIDC issuer URL. This enables the Service Account to assume the identity of the User-assigned Managed Identity when interacting with Azure. Everything is fine.
2. [...] federatedIdentityCredentials issuer property has now become outdated. The Service Account is no longer allowed to assume the identity of the Managed Identity, because there is no valid federatedIdentityCredentials anymore. In other words, the workload identity federation is broken.
3. [...] federatedIdentityCredentials resource's issuer property with the new OIDC issuer URL.
The practical result of this is that an AKS cluster must be treated like a "pet" that can never be recreated. If we do recreate it, we cause a breaking change for all users/developers: all workload identity federations stop working, and we need to call every developer/user and ask them to update their federatedIdentityCredentials issuer property.
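To make the failure mode concrete, here is a minimal sketch of the matching that the scenario describes. It is a simplified model, not the actual Azure token exchange: the `FederatedCredential` class, the `token_matches` function, the exact-string comparison, and the sample tenant, GUID, and service account values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedCredential:
    """Hypothetical stand-in for the issuer/subject/audiences fields shown above."""
    issuer: str
    subject: str
    audience: str = "api://AzureADTokenExchange"


def token_matches(claims: dict, cred: FederatedCredential) -> bool:
    """Return True only if iss, sub and aud all equal the credential's values.

    Assumption: plain string equality; real validation also checks signatures,
    expiry, and more.
    """
    return (
        claims.get("iss") == cred.issuer
        and claims.get("sub") == cred.subject
        and claims.get("aud") == cred.audience
    )


# Placeholder issuer URLs in the same shape as the example above.
old_issuer = "https://westeurope.oic.prod-aks.azure.com/tenant-a/guid-old/"
new_issuer = "https://westeurope.oic.prod-aks.azure.com/tenant-a/guid-new/"

cred = FederatedCredential(
    issuer=old_issuer,
    subject="system:serviceaccount:apps:billing",
)

# Token issued by the original cluster: issuer matches the credential.
token_before = {"iss": old_issuer, "sub": cred.subject, "aud": cred.audience}
# Token issued by the recreated cluster: same service account, new issuer.
token_after = {**token_before, "iss": new_issuer}

print(token_matches(token_before, cred))  # True
print(token_matches(token_after, cred))   # False
```

Nothing about the Pod or Service Account changed between the two tokens; only the issuer did, which is why the fix has to happen on the identity side.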
The pain point is having to re-establish the federated identity credential with the updated OIDC URL for every service. Imagine running 100+ services (each with a distinct MSI) in a cluster and having to update the OIDC URL for each.
Also feeling the pain of this issue. We have to coordinate the recreation of all managed identity federations.
If there was a way to programmatically find all federated credentials for a cluster (by tag name or something), then we could automate this, but currently we'd have to search through all managed identities to find matches.
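As a rough sketch of what that automation could look like once the credentials have been collected: the function below filters a list of federated credential records by issuer and produces retargeted copies. It assumes the records are plain dicts with the `issuer`, `name`, and `subject` keys from the JSON shown earlier; `retarget_credentials` and the sample data are hypothetical, and gathering the records across all managed identities (the hard part) is not covered here.

```python
def retarget_credentials(credentials, old_issuer, new_issuer):
    """Split credentials into (updated, untouched) without mutating the input.

    Records whose issuer equals old_issuer are copied with issuer replaced
    by new_issuer; everything else is returned unchanged.
    """
    updated, untouched = [], []
    for cred in credentials:
        if cred.get("issuer") == old_issuer:
            updated.append({**cred, "issuer": new_issuer})
        else:
            untouched.append(cred)
    return updated, untouched


# Placeholder data in the shape of the federated credential JSON above.
OLD = "https://westeurope.oic.prod-aks.azure.com/tenant-a/guid-old/"
NEW = "https://westeurope.oic.prod-aks.azure.com/tenant-a/guid-new/"
OTHER = "https://westeurope.oic.prod-aks.azure.com/tenant-a/guid-other/"

credentials = [
    {"name": "billing", "issuer": OLD, "subject": "system:serviceaccount:apps:billing"},
    {"name": "orders", "issuer": OLD, "subject": "system:serviceaccount:apps:orders"},
    {"name": "reports", "issuer": OTHER, "subject": "system:serviceaccount:bi:reports"},
]

updated, untouched = retarget_credentials(credentials, OLD, NEW)
print(len(updated), len(untouched))  # 2 1
```

The updated records would still need to be written back by whoever owns each identity, so this only helps where the platform team has that access.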
I guess we could experiment with storing this URL behind some reverse proxy, but that seems like a lot of experimentation for something that might not work.
+1 on this issue. This makes it a huge pain for platform teams that need to replace clusters. There should be a way to have a 'static' endpoint so we do not need to update the federations on all of the identities.
+1 on the issue. We don't have control over downstream configuration, which adds complexity and dependencies.
I have now paused migration to workload identity as this would make DR so much harder. Will stick with aad pod identity until resolved.
The inability to share OIDC issuer URIs between clusters is a pain point for workload identity adoption. We do blue/green Kubernetes cluster deployments to avoid potential issues during infrastructure updates. E.g.:
aks-cluster-dev-blue <- active ingress
aks-cluster-dev-green
The problem is that on every update cycle the new cluster gets a new issuer URI, and we have to keep track of and re-create every federation again (actually we keep two instances, because both clusters are online at the same time). This is something we have solved for our self-hosted clusters, where we can bring our own static issuer.
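A small sketch of the bookkeeping this implies: with one issuer per cluster, every service needs one federation per live cluster, whereas a shared issuer needs one per service. The `required_federations` helper and the `issuer.example` URLs are made up for illustration.

```python
from itertools import product


def required_federations(issuers, subjects):
    """One federation record per (issuer, subject) pair."""
    return [{"issuer": i, "subject": s} for i, s in product(issuers, subjects)]


subjects = [
    "system:serviceaccount:apps:billing",
    "system:serviceaccount:apps:orders",
]

# Blue and green each have their own issuer (placeholder URLs).
per_cluster = required_federations(
    ["https://issuer.example/blue/", "https://issuer.example/green/"],
    subjects,
)
# Hypothetical shared, static issuer used by both clusters.
shared = required_federations(["https://issuer.example/shared/"], subjects)

print(len(per_cluster), len(shared))  # 4 2
```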
Having an issuer as a separate object in Azure would be great, along with the ability to optionally specify one when creating/recreating a cluster. In our case the two clusters would simply point at the same issuer. In this scenario it doesn't matter what the URI is as long as it's static. The complexity of creating and rotating keys could also be abstracted away from the user.
EDIT: Reading this again I realized what I suggested above is exactly what @illrill suggested, I missed that somehow :)
+1 for this
A possible workaround would be to use Terraform to destroy/create an AKS cluster, and then a Terraform apply to update the identities based on the new cluster OIDC issuer URL. It would be great not to have to do this.
That's a potential workaround, but a bad one. As called out above, my team manages the "platform" and we have hundreds of services deployed on the cluster. We don't manage their identities, and in many cases we can't even see them.
We've engaged with Microsoft professional services (which is how I was linked to this thread), as we have an issue similar to what's described here. Our DR strategy is to replace the cluster if something goes catastrophically wrong. We also run our own on-prem clusters, which allow us to manage our own JWKS/OIDC endpoints; that is not feasible with Azure, since we have no read/write access to the service signing key or to cluster configuration at that level. We recently went through trials with GKE, and with their fleet (what used to be Anthos) there's a single endpoint that multiple clusters take on. That was part of our request as well, since we run many clusters per environment (effectively one federated credential for "prod" instead of one per cluster).
Will this be tracked/resolved by #2861?
@CocoWang-wql any updates in this topic?
Is your feature request related to a problem? Please describe.
Right now, when an AKS cluster is created with the OIDC issuer enabled, the OIDC issuer URL is generated randomly, such as:
https://westus2.oic.prod-aks.azure.com/[tenantId]/[random-guid]/
This poses a maintenance problem when we need to delete and recreate a cluster, because we need to ask the owners of all deployed services to update their identities' federated credentials with the new randomly generated OIDC URL.
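For reference, the URL shape in the example above can be split into its parts as follows. This is a sketch that assumes the `<region>.oic.prod-aks.azure.com/<tenantId>/<guid>/` layout from that single example holds; other clouds or future URL formats may differ. `AksIssuer`, `parse_aks_issuer`, and the placeholder IDs are hypothetical names.

```python
from typing import NamedTuple
from urllib.parse import urlparse

# Host suffix taken from the example URL above (an assumption beyond that).
HOST_SUFFIX = ".oic.prod-aks.azure.com"


class AksIssuer(NamedTuple):
    region: str
    tenant_id: str
    issuer_guid: str


def parse_aks_issuer(url: str) -> AksIssuer:
    """Split an issuer URL into region, tenant ID, and the random GUID."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if parsed.scheme != "https" or not host.endswith(HOST_SUFFIX):
        raise ValueError(f"unexpected issuer host: {url!r}")
    region = host[: -len(HOST_SUFFIX)]
    # Path looks like /<tenantId>/<guid>/ ; drop empty segments.
    parts = [p for p in parsed.path.split("/") if p]
    if not region or len(parts) != 2:
        raise ValueError(f"unexpected issuer path: {url!r}")
    return AksIssuer(region, parts[0], parts[1])


issuer = parse_aks_issuer(
    "https://westus2.oic.prod-aks.azure.com/tenant-id-placeholder/guid-placeholder/"
)
print(issuer.region)       # westus2
print(issuer.issuer_guid)  # guid-placeholder
```

Only the last segment changes when a cluster is recreated in the same region and tenant, which is the part the request asks to make deterministic.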
Describe the solution you'd like
An ability to specify the OIDC issuer GUID deterministically at cluster creation time, such as:
This would allow us to keep the same OIDC issuer URL even when clusters are destroyed and recreated, and would make the cluster recreation process transparent to deployed services, without having to ask them to update their federated credentials.