Azure / azure-workload-identity

Azure AD Workload Identity uses Kubernetes primitives to associate managed identities for Azure resources and identities in Azure Active Directory (AAD) with pods.
https://azure.github.io/azure-workload-identity
MIT License

Federated identity credentials support for wildcards #373

Open nclaeys opened 2 years ago

nclaeys commented 2 years ago

Is your feature request related to a problem? Please describe.

We are porting our product from AWS to Azure. In AWS you can use wildcards in the trust relationship between a service account and a role (similar to an Azure AD application in Azure), as follows:

  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    # StringLike permits the * wildcard in the namespace and service account name.
    condition {
      test     = "StringLike"
      variable = "<oidc_url>"
      values   = ["system:serviceaccount:environmentprefix-*:saprefix-*"]
    }

    principals {
      identifiers = [<oidc_arn>]
      type        = "Federated"
    }
  }

Is this something that you are considering as well? At the moment it is rigid to work with federated identity credentials in Azure:

Describe the solution you'd like

It would be great if the federated identity credential supported wildcards, for example to allow multiple environments or a dedicated service account for each pod. An example of a credential could be as follows:

  resource "azuread_application_federated_identity_credential" "app" {
    application_object_id = azuread_application.object_id
    display_name          = "uuid"
    audiences             = ["api://AzureADTokenExchange"]
    issuer                = var.oidc_issuer_url
    subject               = "system:serviceaccount:*:service_account_name-????"
  }

Wildcard support: * for any string, ? for any single character.
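To make the requested semantics concrete, here is a minimal Python sketch. It assumes shell-style matching as provided by the standard library's fnmatch module, where * matches any string and ? matches exactly one character. It only illustrates the proposal; it says nothing about how Azure AD evaluates subjects, and the sample subjects are made up.

```python
from fnmatch import fnmatchcase

# Pattern copied from the proposed credential above.
PATTERN = "system:serviceaccount:*:service_account_name-????"


def subject_matches(subject: str, pattern: str = PATTERN) -> bool:
    """Return True if the token subject matches the shell-style pattern.

    fnmatchcase is case-sensitive; '*' matches any run of characters
    (including ':'), and '?' matches exactly one character.
    """
    return fnmatchcase(subject, pattern)


if __name__ == "__main__":
    samples = [
        "system:serviceaccount:dev:service_account_name-ab12",   # four-character suffix
        "system:serviceaccount:dev:service_account_name-ab123",  # five-character suffix
        "system:serviceaccount:dev:other-ab12",                  # different account name
    ]
    for subject in samples:
        print(subject, subject_matches(subject))
```

Note that fnmatch also gives [ and ] a special meaning, so a real implementation would need to define its own pattern grammar rather than reuse this one as-is.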

Describe alternatives you've considered

The other path we are considering is managing the federated identity credentials with a Kubernetes operator. That way we can create the federated identity credential dynamically when an application is deployed to a new environment. Issues we see there are:

Additional context

ekristen commented 2 years ago

I really need this. I was going to go down the path of creating federated credentials, but the limit of 20 is going to kill me. I need my pods to use different service accounts, and sometimes these are created dynamically.

Anything you can do to add in support on AAD side to allow wildcards on this would be HUGE! Thanks.

ekristen commented 2 years ago

@aramase I see you added the label aad -- is there a chance this gets supported, especially since this has nothing to do with the Kubernetes webhook and everything to do with the AAD side of things in Azure? Thanks!

aramase commented 2 years ago

Thank you for the feedback. I've shared this issue with the AAD team and will update the issue here once I hear from them.

cc @udayxhegde

ekristen commented 2 years ago

@aramase any update from the AAD team? Thanks.

udayxhegde commented 2 years ago

@ekristen thanks for the feedback! We will consider this support in our future planning cycles. Right now, we are heads down on completing the work needed to allow customers to use this capability in production for both app registration and managed identities.

ekristen commented 2 years ago

@udayxhegde appreciate the information and that you are heads down. I think without supporting wildcards you are going to extremely limit people's ability to use this. Service Accounts are often created in larger quantities per workload even to help with permission restrictions and other needs, not being able to wildcard service accounts is very very limiting.

Has there been any further discussion in supporting this and if so any ETA? Thank you.

udayxhegde commented 2 years ago

Hi @ekristen : thanks for your feedback. In our initial release we will not be able to support wildcards unfortunately. The additional support in the form of wildcards or custom claims is indeed very important: but there's no ETA for that support.

ekristen commented 2 years ago

@udayxhegde alright, that's unfortunate. I'll have to fall back to using secrets or certificates. With the limit of 20 federated identities and no wildcards, this isn't usable even though it is the preferred way to authenticate. If there's someone I can escalate this to through my company to get it higher priority, please let me know. Thanks.

udayxhegde commented 2 years ago

@ekristen : are there no other options to consider here? Since each pod can only use one service account, what is causing you to intentionally use different service accounts or create them dynamically? I am sure there is a good reason why this is being done, just trying to understand it.

ekristen commented 2 years ago

Maybe there is, any chance the limit of 20 federated identities can be increased for an app? Like 250?

We are also using a service principal to target a couple different tenants as well.

We have a lot of automation and various workloads that need specific access to resources within the cluster; this all happens via automated means.

This ends up meaning that we have 20-100+ (this number will grow) different workloads with service accounts that can access specific resources like dedicated secrets or config maps.

This is why we can't use a single service account.

If the limit of 20 federated identities wasn't there I might be able to make this work.

I started down the path of dynamically editing the federated identities until I ran into the 20 limit.

udayxhegde commented 2 years ago

Changing the limit by that order of magnitude is not practical. Another alternative is to use multiple service principals, but that is not easy to manage either.

ekristen commented 2 years ago

The wildcard is the best approach and least amount of work for the Azure team by far. I'm unfortunately going to have to use certificates or shared secrets until wildcards are supported which is a huge bummer. Hopefully it won't take long to implement, it's not a very complex mechanism and is going to unlock you and your customers a ton to do more amazing integrations.

ekristen commented 2 years ago

@udayxhegde following up on this, I thought of another use case for this: teams that help manage or operate things in multiple tenants.

Use Case: I'm a company with a product to help people with Azure, our software is deployed and managed on k8s and needs to use a single service principal (or two) to talk to dozens if not hundreds of other tenants.

AZWI would be the best option from a security perspective. However, only the service account can change the tenant being targeted, wildcards aren't supported, and the limit on federated identities is 20, so it can't be used. Unfortunately that means hard-coded certificates or secrets. :(

ekristen commented 2 years ago

@udayxhegde any movement on this feature request?

smokedlinq commented 2 years ago

Just hit this after a few GitHub repos converted. What's the recommended best practice for a GitHub repo? One per repo? Was hoping one per team would be sufficient, especially when the team repos would be given the same permission to Azure and app resources.

ekristen commented 2 years ago

You need one federated identity per repo, limited to 20 per app; beyond that you have to create another app. Unfortunately this is a very painful feature to use.

ekristen commented 2 years ago

@udayxhegde is there anyone on the Azure side we can escalate this to and be put in direct contact with? To be very honest, this feature is useless at scale. Without wildcard support, it's just easier and better to use hard-coded client secrets/certificates, which is a shame.

salaxander commented 2 years ago

Hey @ekristen - would you maybe be willing to have a chat with me on this? I'm not on the Azure Identity team, but I am the PM for Workload ID at Microsoft, so I'd really love to be sure I've got your use case well understood and documented to maybe help push this forward.

If you're on Kubernetes Slack you're welcome to ping me there (@ Xander), otherwise xgrzywinski @ microsoft.com

ekristen commented 2 years ago

Absolutely! I'll reach out.

udayxhegde commented 2 years ago

Sorry @ekristen for the late reply... I recognize this capability is important to manage things at scale, but unfortunately, we don't have any updates on this yet.

kevinharing commented 1 year ago

This is holding us back as well. We deploy resources dynamically at request with a pre-existing managed identity. The namespace name is dynamic and is unique for each deployment. The service account is created in this namespace. It is really cumbersome to have to create/delete the federated identity credential for each deployment, and of course there's the limit of 20 credentials. Also, the delay in identity/credential propagation is exactly what we're trying to fix by moving away from AAD Pod Identity. A wildcard for the namespace name would fix this problem, as we would be able to have a pre-existing federated identity credential that can be reused.

ekristen commented 1 year ago

I had a call with a PM a few months back on this. I said match what AWS does, allow wildcards anywhere, or StringLike matching. This is still a HUGE pain point.

kevinharing commented 1 year ago

@salaxander So no plan for implementing this for now? Is there a workaround?

ekristen commented 1 year ago

@kevinharing I had a call with a product person from Microsoft to explain the situation and they seemed responsive. Unfortunately that was 6+ months ago. I unfortunately have taken the position that this will never be addressed.

salaxander commented 1 year ago

@kevinharing @ekristen Sorry for the confusion here folks! I did remove this from our roadmap project board because the limitation is on the AAD side, and so it's not an actionable feature item on our roadmap. I have been told by AAD that they will support wildcards though. I don't know a timeline, but I do know that they are planning to do it. I'll pin this issue on the repo though for visibility.

kevinharing commented 1 year ago

Thank you for the quick reply!

pockyhe commented 1 year ago

@kevinharing wildcards are very much needed. We currently have many (more than 20) namespaces in AKS, and workloads in all of them need to access different Azure resources. These namespaces share the same name prefix. We would like to assign the credentials in a single third-party app; we don't want to create multiple third-party apps, which would be very hard to manage.

jmyers82 commented 1 year ago

A recent feature release in Terraform now allows the use of OIDC auth with Azure. It requires a federated credential, and this limit of 20 has quickly become an issue. We need the wildcard capability to be prioritized quickly, as this is a huge holdup for our deployment of OIDC with Terraform Cloud. https://developer.hashicorp.com/terraform/cloud-docs/workspaces/dynamic-provider-credentials/azure-configuration

tkellen commented 1 year ago

Where can we elevate our request for AAD supporting wildcards and increasing the maximum credentials in flight per identity? My client wants to use user managed identities rather than application registrations due to access concerns (subscription owner level access is not sufficient to create applications and service principals for role assignment and group membership). The limitations discussed at length above are a blocker for us as well.

joshua-hancox commented 1 year ago

Facing this exact same issue, the absence of wildcards combined with the limit of 20 federated credentials per app registration makes this feature pretty useless at the minute. Would love an update on expected timeframe?

joshua-hancox commented 1 year ago

@udayxhegde @aramase do you guys know if this issue is in the right place/being looked at currently?

tkellen commented 1 year ago

Met with a Microsoft rep ~2 weeks ago who reported that they knew this solution is entirely unsuitable for real world use cases at even the smallest of scales and that we should continue beating the drum about how completely useless it is. There is no public roadmap to this being resolved and nothing internal either. Federated identities are not ready, just mint and rotate your own tokens using applications and service principals and move on.

nickludwig commented 1 year ago

Hey folks - I'm not sure who on our end reported that there wasn't an internal plan to solve this, but I'm going to assume they were mistaken. We're definitely aware of this limitation, and there is an internal plan to solve this (I am one of the folks directly involved). That said, the solution won't be to raise the limit of 20 federated identity credentials (which is what I'm assuming the rep @tkellen talked with was referring to).

Instead, there will be expression support that'll allow FICs to map in a 1:many manner between entities, instead of the current 1:1 mapping limitation affecting you all. We're nearing the finish line on a design for this that, alongside some other behavior, will allow you to configure a wildcard-like expression (we're using prefix matching) to match against the subject claim in incoming tokens from external IdPs. So, in the context of AKS, you'd be able to configure an expression such that any service account name or namespace/service account name combination would be authorized.

I can't really comment on the timeline as there are still some things for us to figure out, but I can assure you all that this is something we're actively working on. I'm also more than happy to discuss this further with anyone, so if you have interest in chatting more feel free to shoot me an email at ludwignick@microsoft.com.

tkellen commented 1 year ago

Thanks for the update @nickludwig. If the 20 token limitation isn't raised, would you agree that a workload with more than 20 instances running concurrently would need to mint multiple identical federated identities (in all but name)? Also that it will be the onus of the consumer to intelligently "load balance" which identity is used to account for this? For example, if you require 50 replicas of a given pod an operator will need 3 identical federated identities (first 20 replicas use the first identity, second 20 the second, and last ten the third with 10 tokens available on reserve).

timown commented 1 year ago

Why? You create a FIC per service account, so one workload needs one FIC. The number of replicas of the workload is irrelevant; they all use the same service account.
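A small Python sketch of why the replica count does not matter: the subject of a Kubernetes service account token has the form system:serviceaccount:<namespace>:<service account name>, so every pod that shares a service account presents the same subject. The Pod class, the helper names, and the sample values below are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Pod:
    """Hypothetical, minimal stand-in for a Kubernetes pod."""
    name: str
    namespace: str
    service_account: str


def subject_for(pod: Pod) -> str:
    """Build the subject claim carried by the pod's service account token."""
    return f"system:serviceaccount:{pod.namespace}:{pod.service_account}"


def required_subjects(pods: Iterable[Pod]) -> Set[str]:
    """Distinct subjects, i.e. how many exact-match credentials are needed."""
    return {subject_for(pod) for pod in pods}


# Fifty replicas of one workload, all using the same service account.
replicas = [Pod(f"web-{i}", "prod", "web-sa") for i in range(50)]

if __name__ == "__main__":
    print(required_subjects(replicas))
```

The set contains a single subject, so one federated identity credential covers all fifty replicas. The count only grows with the number of distinct namespace and service account pairs.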

tkellen commented 1 year ago

You're right of course, @timown. Got a bit turned around about the resources involved.

dghubble commented 1 year ago

@nickludwig Something like a wildcard would be very useful for GitHub OIDC federated identity setups that trigger workflows on push (rather than pull_request). Today, you have to create a federated identity credential for each branch. There is no way to allow a push to any branch to use the federated identity.

nickludwig commented 1 year ago

@dghubble yup, agreed. We intend to cover GitHub scenarios with this work.

artificial-aidan commented 1 year ago

@nickludwig thank you for your time the other day. Just wanted to make note of our issue with prefix matching, in case others have the same.

Prefix matching doesn't allow matching a service account across multiple namespaces, which is a requirement in our workload and something wildcard support would allow. Prefix matching only allows a wildcard-like match within a single namespace, or across a whole cluster.
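The difference can be sketched in Python. This assumes prefix matching behaves like a plain startswith check and wildcard matching like shell-style patterns from the standard library's fnmatch; the actual design was not published in this thread, and the namespaces and account names are made up.

```python
from fnmatch import fnmatchcase

subjects = [
    "system:serviceaccount:team-a:reader",
    "system:serviceaccount:team-b:reader",
    "system:serviceaccount:team-a:admin",
]

# Wildcard: the "reader" service account in any namespace.
wildcard = "system:serviceaccount:*:reader"
wanted = [s for s in subjects if fnmatchcase(s, wildcard)]

# Prefix option 1: pin a single namespace and account.
single_namespace_prefix = "system:serviceaccount:team-a:reader"
by_single = [s for s in subjects if s.startswith(single_namespace_prefix)]

# Prefix option 2: open up the whole cluster.
whole_cluster_prefix = "system:serviceaccount:"
by_cluster = [s for s in subjects if s.startswith(whole_cluster_prefix)]

if __name__ == "__main__":
    print("wildcard:      ", wanted)
    print("single prefix: ", by_single)
    print("cluster prefix:", by_cluster)
```

The wildcard selects exactly the two reader accounts. The narrow prefix misses team-b, and the broad prefix also admits team-a:admin, which is the gap described above.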

litvinolek commented 1 year ago

Just curious how Workload Identity can go GA when we have this limit of 20 FICs per MI and no wildcard support per namespace?

ilmax commented 1 year ago

@dghubble maybe off-topic, but you can configure what GitHub issues in the token. In my case I'm using one FIC per environment, and the environment (not the branch) is emitted in the token issued by GitHub. Works like a charm for me.

dghubble commented 1 year ago

@ilmax the comment was about using GitHub push events for workflow triggers. Other event types may or may not be suitable for a repo.

nickludwig commented 1 year ago

@dghubble just want to clarify my initial comment. We intend to cover GitHub Actions scenarios with this work, but what we're able to do is limited by what is returned in the subject claim in the incoming token from GitHub. That said, the scenario of push events (where the subject claim is based on branch, i.e., repo:contoso/contoso-repo:ref:refs/heads/branch-foo) would fit into this model.

dghubble commented 1 year ago

@nickludwig yeah, in my case, the subject claim from GitHub looks like repo:myorg/myrepo:ref:refs/heads/somebranch and I'd like to match against repo:myorg/myrepo:ref:refs/heads/*, so that a push to any branch can use a federated identity. Hopefully that fits the upcoming model.
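A pattern whose only wildcard is a trailing * is equivalent to a prefix check, which is why this case can fit a prefix-matching model. A minimal Python sketch, assuming the check is a plain startswith on the subject claim; the function name and the sample subjects other than the one quoted above are hypothetical.

```python
# Prefix equivalent to the desired pattern repo:myorg/myrepo:ref:refs/heads/*
BRANCH_PREFIX = "repo:myorg/myrepo:ref:refs/heads/"


def push_allowed(subject: str, prefix: str = BRANCH_PREFIX) -> bool:
    """Return True if the GitHub subject claim starts with the trusted prefix."""
    return subject.startswith(prefix)


if __name__ == "__main__":
    samples = [
        "repo:myorg/myrepo:ref:refs/heads/somebranch",  # a branch push
        "repo:myorg/myrepo:ref:refs/tags/v1.0",         # a tag, not a branch
        "repo:myorg/otherrepo:ref:refs/heads/main",     # a different repository
    ]
    for subject in samples:
        print(subject, push_allowed(subject))

    # A prefix that stops before a delimiter is looser than it looks:
    # "repo:myorg/myrepo" is also a prefix of another repository's subject.
    print(push_allowed("repo:myorg/myrepo-fork:ref:refs/heads/main", "repo:myorg/myrepo"))
```

Ending the prefix on a delimiter such as : or / keeps it from accidentally covering a differently named repository.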

Today, my GitHub pushes can be associated with a Google Cloud workload identity or AWS OIDC, but Azure has this caveat I'm hoping can disappear.

floge07 commented 1 year ago

Also want to add my use case here. Our application is automatically deployed (via ci tools using a helm chart) multiple times to different namespaces. For pull requests and different public versions.

Right now, using AadPodIdentity, the pods in all namespaces use the same binding. But with the current state of WorkloadIdentity, we can't easily migrate to it. Being able to put a wildcard for the namespace would allow us to keep the same infrastructure setup.

litvinolek commented 1 year ago

Exactly same situation

nbusseneau commented 1 year ago

@nickludwig Just want to voice support for implementing this. At Cilium we would also like to be able to have the subject claim match any branch, since we have cases where the CI needs to run against arbitrary references. Up to now this wasn't a blocker but due to recent CI changes we will have to move back to using client secrets until this is supported as the only available workaround would be adding bogus environments to all of our workflows' jobs and use that to match with an environment subject claim.

For what it's worth, before finding this issue I've tried to manually edit the subject identifier to the following values, as I expected I could somehow make it work (because what is the point of being able to edit the subject identifier otherwise 😅):

  • repo:cilium/cilium:ref:refs/heads/* (I assumed this would match any branch)
  • repo:cilium/cilium:ref:* (I assumed this would match any reference)
  • repo:cilium/cilium:ref (idem)
  • repo:cilium/cilium (I assumed this would match anything coming from the cilium/cilium repository)

nickludwig commented 1 year ago

@floge07 / @litvinolek - thanks for listing your scenarios, it's super helpful for us to see the different use cases. I do have a question for y'all - our current design is centered around prefix matching instead of wildcards. So, you'd be able to achieve flexible namespaces with prefix matching, but you'd also need to authorize any service account name as well. Essentially, you'd match against system:serviceaccount:, and then namespace/service account name could be anything.

Would this be a blocking issue for you all?

nickludwig commented 1 year ago

Hey @nbusseneau - thank you as well! This scenario is accounted for in our design.

floge07 commented 1 year ago

@nickludwig No, that would not be a blocking issue; it would work for us. By specifying system:serviceaccount: we would essentially trust all JWTs issued by that Kubernetes service, right? But what if I wanted to constrain the subject identifier a bit further: would the prefix logic also allow system:serviceaccount:myapp-? Or must it end with a colon?

Now that I think about how the "Subject identifier" gets kinda useless when matching everything, why is that field even mandatory on the Azure side?

If I add the URI of an OIDC Issuer that I say I trust, shouldn't that already be enough? I mean sure it's nice to optionally constrain it further to specific subject values with a prefix or equal match, but are there security concerns I'm not seeing right now? (I can only think of when using a shared kubernetes with multiple apps and rbac)

But why stop with the sub claim? Support for checking any arbitrary claim would probably be useful. After all, Kubernetes already adds its own claim there, which even includes the pod name.

There are certainly OIDC issuer services out there that will just put a random ID into sub and the potentially important information into other custom claims.

Ah well, just a bit of a stream of thoughts there at the end. This is probably even the wrong place for it.