Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

Azure Blob/File Storage as PVC: Support authorization via Workload Identities #3432

Open karlschriek opened 1 year ago

karlschriek commented 1 year ago

Describe the solution you'd like

We use Azure AD Workload Identities extensively in order to authorize various services on our AKS clusters to interact with other Azure services. We follow a zero-credential approach as far as possible, and workload identities are a pivotal part of making this possible.

(With workload identities you can federate an Azure AD Service Principal to a specific service account within a specific namespace, within a specific cluster. The cluster itself acts as an OIDC provider, and the service account acts as an identity associated with the cluster's OIDC provider. By federating the service principal's authentication to this identity - and only that identity! - it is possible to log in as that service principal without needing any credentials.)
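
As a concrete sketch of this setup: the ServiceAccount carries the client ID of the federated identity as an annotation, and pods opt in via a label. This follows the Azure Workload Identity documentation, but all names and the client ID below are placeholders:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service-account        # hypothetical name
  namespace: my-namespace
  annotations:
    # Client ID of the Azure AD application / user-assigned identity
    # that has a federated credential for this service account.
    azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: my-namespace
  labels:
    # Opt the pod in to workload identity token injection.
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: my-service-account
  containers:
    - name: app
      image: my-app:latest
```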

We would like to be able to also use this with PVCs that attach blob/file storage. Currently only storage account keys and SAS keys are supported.

For example, currently you would set up something like this (abbreviated):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-blob
  namespace: my-namespace
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  ...
  csi:
    ...
    nodeStageSecretRef:
      name: azure-secret  # <--- credentials need to be stored in this secret
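
For reference, the azure-secret referenced above would typically hold the storage account name and key — exactly the credential we want to eliminate. A sketch (values are placeholders; the key names follow the Azure file/blob CSI driver convention, but treat them as assumptions):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: azure-secret
  namespace: my-namespace
type: Opaque
stringData:
  azurestorageaccountname: mystorageaccount     # placeholder
  azurestorageaccountkey: "<account-key>"       # the credential we want to avoid storing
```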

We would instead like to attach a ServiceAccount to the PV provisioner here and have it authenticate using the Workload Identity, for example by specifying:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-blob
  namespace: my-namespace
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  ...
  csi:
    ...
    nodeStageServiceAccount:
      name: my-service-account

Describe alternatives you've considered

The only alternative is to fetch storage account keys or SAS keys into a Secret in the Namespace where you want to register the PVC. Our interim solution is to store these keys in AKV and fetch them into the cluster using external-secrets. However, that still requires the credentials to be exposed in plain form on the cluster itself, which we want to avoid.
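
The interim approach can be sketched with external-secrets along these lines (assuming a pre-configured Azure Key Vault SecretStore named azure-kv-store; all names here are illustrative):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: azure-storage-credentials
  namespace: my-namespace
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: azure-kv-store    # assumed, pre-configured AKV-backed store
    kind: SecretStore
  target:
    name: azure-secret      # the Secret referenced by nodeStageSecretRef
  data:
    - secretKey: azurestorageaccountkey
      remoteRef:
        key: storage-account-key   # name of the secret in Key Vault (placeholder)
```

This works, but the account key still ends up as a plain Secret in the cluster, which is exactly the exposure described above.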

Additional context

Not directly relevant, but here are some issues on major open-source projects where we've contributed to the adoption of workload identities. Some of the discussions there might be useful for context:

cert-manager (in order to enable interaction with Azure DNS Zones)

external-dns (in order to enable interaction with Azure DNS Zones)

andyzhangx commented 1 year ago

cc @cvvz

olsenme commented 1 year ago

We do have a WI for this one scheduled for this sprint. @andyzhangx to confirm

andyzhangx commented 1 year ago

Hi @karlschriek, the config below is actually not supported by the CSI driver spec. We are still working on workload identity support; it could take a different form. We will share more details with you when we have updates this month, stay tuned.

  csi:
    ...
    nodeStageServiceAccount:

cvvz commented 1 year ago

Hi @karlschriek, we are working on this feature. But there won't be any field like nodeStageServiceAccount added to the PVC, PV or storage class. In fact, a service account federated with a Managed Identity (or Service Principal) will be created and linked to the Blob/File CSI driver at the time the CSI driver is created and started. When the CSI driver is going to mount File/Blob storage, it will try to use that service account for authentication, so no exposed credentials will exist in the cluster. Is that OK for you?

karlschriek commented 1 year ago

If I understand you correctly, you would bind the ServiceAccount (which would have to reside in the kube-system namespace) with the WI to the driver Pods. These Pods run as a DaemonSet (one per node) and are responsible cluster-wide for provisioning PVs.

But that would defeat the purpose of using Workload Identities in the first place, wouldn't it? In this case it would be no different from using the node's Managed Identity!

We would need to be able to deploy a ServiceAccount in a specific Namespace and use its WI only for PVs rolled out there.

cvvz commented 1 year ago

So you would create a Managed Identity (Service Principal) for each Namespace? Since every ServiceAccount would federate with a Managed Identity (Service Principal).

karlschriek commented 1 year ago

A Managed Identity is something that is attached directly to an Azure Resource, such as for example the Node. The attractiveness of Workload Identities is that you can attach an identity to a specific workload, i.e. to a Pod. The way you do this, is by creating an identity that you attach to a specific ServiceAccount. If you attach that ServiceAccount to a Pod, the Pod now has that identity.

This allows us to do permissioning on an application level. Application A needs access to a specific database; Application B needs access to a specific key vault. If I were to use Managed Identities, I would need to create one identity (for the Node) that has access to both the key vault and the database. But with Workload Identities I can create two identities, attach them to two different ServiceAccounts, and start up Application A with one of those ServiceAccounts and Application B with the other.
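
The application-level split described above might look like this (the names and client IDs are illustrative placeholders):

```yaml
# Identity for Application A: the federated Azure identity behind this
# service account is granted access to the database only.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-a
  namespace: my-namespace
  annotations:
    azure.workload.identity/client-id: "00000000-0000-0000-0000-00000000000a"
---
# Identity for Application B: its federated identity is granted access
# to the key vault only.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-b
  namespace: my-namespace
  annotations:
    azure.workload.identity/client-id: "00000000-0000-0000-0000-00000000000b"
```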

Now take the exact same logic for Application C: it needs to provision a PVC that is backed by a storage account. This means when I deploy Application C, I want to be able to attach a specific ServiceAccount to the PVC so that only Application C gets access to the storage account.

cvvz commented 1 year ago

Got it. So only applications with a ServiceAccount (federated with an MSI/SP beforehand) that has permission to create and mount the storage account can trigger the CSI driver to provision and mount volumes for them. Otherwise the CSI driver would return a permission error. Is that correct?

karlschriek commented 1 year ago

I am not sure that is the correct way to look at it. At least, it doesn't really conform to how Workload Identities are supposed to work. It should work as follows:

  1. Create an identity
  2. Give the identity permissions (in this case read/write on a storage account)
  3. Attach the identity to a workload (i.e. start up a Pod with this identity)

There isn't supposed to be anything else that checks for permissions. In that sense, your very first suggestion was correct, i.e. we attach the WI to the driver Pod, which allows that Pod to connect to the storage account. This is exactly how a Workload Identity works (technically), but it unfortunately defeats the purpose of using it in the first place if you attach it to a cluster-wide resource.

How feasible would it be to create a driver that spawns additional drivers based on a PVC spec? So for example, if I specify this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-blob
  namespace: my-namespace
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  ...
  csi:
    ...
    nodeStageServiceAccount:
      name: my-service-account

Then a dedicated driver Pod is spun up in my-namespace with my-service-account attached to it, and is used only to provision that PVC?

cvvz commented 1 year ago

Hi @karlschriek , I think https://kubernetes-csi.github.io/docs/token-requests.html is what we are looking for. This feature allows CSI drivers to impersonate the pods that they mount the volumes for. Thus we can mount the volume using the Pod's token rather than the CSI driver's token.
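
For context, the token-requests feature linked above is enabled per driver via the CSIDriver object. A sketch of what that could look like for the blob driver follows; the audience value is the one commonly used for Azure AD workload identity federation, but treat the exact configuration as an assumption, not the driver's confirmed setup:

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: blob.csi.azure.com
spec:
  # Ask the kubelet to pass a service account token for the mounting
  # pod to NodePublishVolume, so the driver can act as that pod.
  tokenRequests:
    - audience: "api://AzureADTokenExchange"
  # Re-invoke NodePublishVolume periodically so expiring tokens
  # can be refreshed.
  requiresRepublish: true
```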

cvvz commented 1 year ago

Got it. So only applications with a ServiceAccount (federated with an MSI/SP beforehand) that has permission to create and mount the storage account can trigger the CSI driver to provision and mount volumes for them. Otherwise the CSI driver would return a permission error. Is that correct?

After reading the design docs of this feature, I think the final implementation would be like what I mentioned above. The CSI driver would use the token federated with the Pod's ServiceAccount to verify whether the Pod has permission to access the Blob/File storage account before actually mounting it to the Pod.

Anyway, I think this way applications with different ServiceAccounts could have different permissions to access storage. Does that make sense to you?

karlschriek commented 1 year ago

That sounds like a great solution!

pinkfloydx33 commented 1 year ago

We currently need to grant permissions to the cluster's/node's managed identity. It would definitely be interesting to leverage the WI of the application somehow. But from what I'm reading above, that would only account for the provisioning process. Once the PV has been provisioned using the credentials of Application A, it exists and would be available for Application B to claim and mount, regardless of the WI/SA of B's pods, right? Seems like a potential loophole.

cvvz commented 1 year ago

There are two steps for a Pod to use a volume: provisioning and mounting.

For the provisioning process, it is actually the CSI driver's responsibility to interact with the Azure storage API to create the storage resource in Azure. So the authorization check is to verify whether the CSI driver's WI/SA has permission to provision the volume.

For the mounting process, we need to make sure the Pod has permission to mount the volume: the CSI driver would get the Pod's WI/SA first, then decide whether the Pod has permission to mount the volume or not.

cvvz commented 1 year ago

@pinkfloydx33 Does that implementation make sense to you?

pinkfloydx33 commented 1 year ago

Yes if it can be controlled that way

AnthonyDewhirst commented 1 year ago

Hi, is there any update on when this is expected? I'm looking at this exact problem right now for the exact same reasons.

Thanks

Ottovsky commented 1 year ago

Hey, any update on when this functionality can be expected?

Thanks

jonasnorlund commented 1 year ago

Any update on this? It would be really nice to have this feature when you are working with multi-tenancy scenarios. Each team has their own WLI bound to a namespace, and Azure RBAC is used for auth against different Azure services. This works fine now, except for storage in AKS. This feature would be used for external storage such as an Azure file share or similar.

akhiljimmy7 commented 1 year ago

Hi @olsenme ,

Any updates on this? We have a similar requirement and would like to make use of this feature to ensure a zero-credential approach.

I understand blobfuse2 supports WLI via an OAuth token file, but I hope this feature might simplify the deployment.

cvvz commented 1 year ago

We are going to implement this feature. Below is the specific use case; could you please take a look and see if it satisfies your requirements? Any suggestions would be helpful, thanks! cc @karlschriek @pinkfloydx33 @AnthonyDewhirst @Ottovsky @akhiljimmy7

  1. Create a storage account and a WLI, and grant the WLI enough permissions to access the storage account.
  2. Create a Kubernetes StorageClass or PV that explicitly specifies the storage account and file share (for Azure Files) / container (for blob) you created; you do not have to specify the storage account's secret.
  3. Create a Kubernetes service account.
  4. Establish a federated identity credential between the service account and the WLI.
  5. Deploy a pod that references the service account created in the previous step, and a PVC that uses the PV/StorageClass you created.
  6. The pod starts up normally and mounts the volume successfully. Any other pod that does not reference the service account cannot start up, and the failure message is that it does not have permission to access the storage account.

No credentials are exposed on AKS in the above steps.
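
The steps above can be sketched as Kubernetes manifests. This is an illustrative sketch only: all names, IDs and the exact volumeAttributes keys are assumptions, not a confirmed API:

```yaml
# Steps 3/4: service account federated with the WLI (federation itself
# is configured on the Azure side and is not shown here).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: storage-sa
  namespace: my-namespace
  annotations:
    azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000001"
---
# Step 2: static PV pointing at a pre-created container, with no
# nodeStageSecretRef — i.e. no storage account secret in the cluster.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-blob-wli
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: blob.csi.azure.com
    volumeHandle: mystorageaccount_mycontainer   # placeholder
    volumeAttributes:
      storageAccount: mystorageaccount
      containerName: mycontainer
---
# Step 5: pod referencing the federated service account and a PVC
# (the PVC bound to pv-blob-wli is omitted here for brevity).
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: my-namespace
  labels:
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: storage-sa
  containers:
    - name: app
      image: my-app:latest
      volumeMounts:
        - name: blob
          mountPath: /mnt/blob
  volumes:
    - name: blob
      persistentVolumeClaim:
        claimName: pvc-blob-wli
```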

rouke-broersma commented 1 year ago

@cvvz how will you use WLI for Azure Files, when Azure Files AFAIK doesn't support Azure AD authentication, only access keys? Is the plan to create on-demand access keys using the WLI, or something similar? Or did I miss that Azure Files now supports Azure AD authentication?

cvvz commented 1 year ago

I'm not sure whether Azure Files can support WLI authentication now; I think not. But at least we can use the WLI to get the access key; that's the plan for now. WDYT @rouke-broersma

rouke-broersma commented 1 year ago

I'm not sure whether Azure Files can support WLI authentication now; I think not. But at least we can use the WLI to get the access key; that's the plan for now. WDYT @rouke-broersma

Sounds good. Would be better if azurefile would support wli directly 😁

akhiljimmy7 commented 1 year ago

We are going to implement this feature. Below is the specific use case; could you please take a look and see if it satisfies your requirements? Any suggestions would be helpful, thanks! cc @karlschriek @pinkfloydx33 @AnthonyDewhirst @Ottovsky @akhiljimmy7

  1. Create a storage account and a WLI, and grant the WLI enough permissions to access the storage account.
  2. Create a Kubernetes StorageClass or PV that explicitly specifies the storage account and file share (for Azure Files) / container (for blob) you created; you do not have to specify the storage account's secret.
  3. Create a Kubernetes service account.
  4. Establish a federated identity credential between the service account and the WLI.
  5. Deploy a pod that references the service account created in the previous step, and a PVC that uses the PV/StorageClass you created.
  6. The pod starts up normally and mounts the volume successfully. Any other pod that does not reference the service account cannot start up, and the failure message is that it does not have permission to access the storage account.

No credentials are exposed on AKS in the above steps.

Yes, this satisfies our requirement!

cvvz commented 1 year ago

Sounds good. Would be better if azurefile would support wli directly 😁

That's right, it would be more efficient and safer if azurefile/blob could support WLI authentication directly. But let me implement the feature first and refine it in the next stage.

AnthonyDewhirst commented 1 year ago

Sounds good

rudolphjacksonm commented 10 months ago

@cvvz the use-case/solution you mentioned (and the PRs I've seen merged above) indicates that this might be solved in a future release? Any idea of a timeline for when we could try this out?

ceilingfish commented 10 months ago

Hi folks, am I reading it correctly that #1220 added this support for statically provisioned storage accounts in release 1.23.3?

Forkey0 commented 5 months ago

Hi @cvvz any updates on this ?

cvvz commented 5 months ago

Hi @Forkey0, we have supported this feature in the azurefile and blob CSI drivers already; please refer to the documentation for more details: https://github.com/kubernetes-sigs/azurefile-csi-driver/blob/master/docs/workload-identity-static-pv-mount.md https://github.com/kubernetes-sigs/blob-csi-driver/blob/master/docs/workload-identity-static-pv-mount.md

Forkey0 commented 5 months ago

Hi @Forkey0, we have supported this feature in the azurefile and blob CSI drivers already; please refer to the documentation for more details: https://github.com/kubernetes-sigs/azurefile-csi-driver/blob/master/docs/workload-identity-static-pv-mount.md https://github.com/kubernetes-sigs/blob-csi-driver/blob/master/docs/workload-identity-static-pv-mount.md

Cheers @cvvz !!

miqm commented 5 months ago

Hello,

I'd like to get some clarification - does the Workload Identity need to read the storage account key, or does it need access to the provided blob container? In other words - does the driver use the identity to get the account key, or to perform operations on the container?

When I configured my pod according to the docs, assuming the identity is used to get the data rather than the account key (I would like to not use the account key at all), I get an authorization error.

mathieu-clnk commented 5 months ago

Hi @cvvz, thank you for pointing out this link. I was able to mount Blob storage by using volumeAttributes.clientID instead of volumeAttributes.AzureStorageIdentityClientID as indicated in the Microsoft documentation. Would it be possible to update the documentation accordingly?

Here is an example of a statically provisioned volume which is working:

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: blob.csi.azure.com
  name: pv-blob
  namespace: app1
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: storage-class-blob
  csi:
    driver: blob.csi.azure.com
    volumeHandle: storage01_container01_app1_fuse
    volumeAttributes:
      resourceGroup: resource-group-of-storage-account
      storageAccount: storage01
      containerName: container-name
      protocol: fuse
      clientID: "abc00000-0000-0000-0000-000000000001"
      #AzureStorageIdentityClientID: "abc00000-0000-0000-0000-000000000001"
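
A PVC binding to this PV might look like the following sketch (the names are assumed to match the PV above, and are otherwise illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-blob
  namespace: app1
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: storage-class-blob
  volumeName: pv-blob    # bind explicitly to the statically provisioned PV
  resources:
    requests:
      storage: 100Gi
```
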

rouke-broersma commented 5 months ago

Per our requirements, the PV needs to be on a file share. This is a hard requirement that cannot be changed, due to the supportability of another (terrible) app that can only use Azure Files and has no concept of Azure blobs.

Azure file shares themselves do not support Microsoft Entra ID authentication, so by extension they cannot support workload identity without a shared access secret: https://learn.microsoft.com/en-us/azure/storage/files/storage-files-active-directory-overview

This is a current limitation of the service.

You have to understand that Defender offers a general set of security guidelines that will not apply to everyone's scenario. You have to customize this to your specific use case, and sometimes this means being aware of, and accepting, a larger attack surface for the purpose of using features your solution requires.

Azure Blob Storage does support managed identity, so for blob storage it is a fair ask for the AKS team to use workload identity directly instead of only using it to load the SAS.

MedAnd commented 5 months ago

The following documentation would be of interest to those following the thread:

Can someone from AKS / Azure Storage teams clarify the following:

β€œEach pod has its own standalone blobfuse mount, but if multiple pods are present on a single node, it may lead to performance problems.”

For example, under what criteria and volume could performance problems be encountered? The current wording is vague and should be updated.

PixelRobots commented 5 months ago

@andyzhangx are you able to answer this question?

miqm commented 4 months ago

@cvvz when can we expect version 1.24.x of the blob CSI driver to be available in AKS 1.29? In the 2 most recent updates (May and June) it has not been updated (at least according to the release notes).

miqm commented 3 months ago

@cvvz @andyzhangx @miwithro anyone? Will the patched version 1.24.x be added to 1.29? https://github.com/Azure/AKS/blob/master/CHANGELOG.md#release-2024-04-28 says:

Upgraded Azure Blob CSI driver to v1.24.1 on AKS 1.28 and to v1.22.6 on AKS 1.27.

Is there any reason why it's not upgraded to 1.24.x on 1.29?

TBH I'm losing a bit of confidence in using managed add-ons, as they tend to lag behind upstream fixes.

envycz commented 1 month ago

Hello,

I'd like to get some clarification - does the Workload Identity need to read the storage account key, or does it need access to the provided blob container? In other words - does the driver use the identity to get the account key, or to perform operations on the container?

When I configured my pod according to the docs, assuming the identity is used to get the data rather than the account key (I would like to not use the account key at all), I get an authorization error.

+1. I can confirm that the volume cannot be mounted to the pod if the workload identity is not authorized to list the storage account keys. It would be great if the Storage Blob Data Owner role were sufficient when working with blob containers.

miqm commented 1 month ago

@envycz which version did you test on? I was checking this on v1.23.x - looking at the source code, I think this was fixed in 1.24, but my AKS (v1.29) didn't get the update of the blobfuse extension.

envycz commented 1 month ago

@miqm we are currently using v1.24.3, but it is deployed manually (not using the managed Blob CSI driver deployed via AKS).

miqm commented 1 month ago

@miqm we are currently using v1.24.3, but it is deployed manually (not using the managed Blob CSI driver deployed via AKS).

So it's not fixed... 😟

andyzhangx commented 1 month ago

Hello, I'd like to get some clarification - does the Workload Identity need to read the storage account key, or does it need access to the provided blob container? In other words - does the driver use the identity to get the account key, or to perform operations on the container? When I configured my pod according to the docs, assuming the identity is used to get the data rather than the account key (I would like to not use the account key at all), I get an authorization error.

+1. I can confirm that the volume cannot be mounted to the pod if the workload identity is not authorized to list the storage account keys. It would be great if the Storage Blob Data Owner role were sufficient when working with blob containers.

That's correct; even with workload identity today, the CSI driver still requires retrieving the account key (using workload identity auth), because the backend storage (e.g. SMB file share, blobfuse) does not support mounting with workload identity auth directly.

MedAnd commented 1 month ago

Hi @andyzhangx - if I understood your comment above correctly, it's the blobfuse driver that requires the Workload Identity to have a contributor role on the Azure Blob storage account?

miqm commented 1 month ago

So it's not true support for identity-based connections 😟

MedAnd commented 1 month ago

I wonder if this is what @andyzhangx means: Mount volume

Kubernetes needs credentials to access the Blob storage container created earlier, which are either an Azure access key or SAS tokens. These credentials are stored in a Kubernetes secret, which is referenced when you create a Kubernetes pod.

rouke-broersma commented 1 month ago

There are two issues:

Therefore, on the whole, the Azure CSI drivers cannot properly implement workload identity and have to resort to using access key authentication.

What adding workload identity support in this way does add is that the CSI driver checks whether the pod is authorized to load the volume, and only then fetches the access keys. So while not ideal, it does ensure that only pods using the correct service account can access the storage.

PatrickSpies commented 1 month ago

It seems the blobfuse driver migrated from ADAL to MSAL with v2.3.0: https://github.com/Azure/azure-storage-fuse/blob/main/CHANGELOG.md#230preview1-2024-04-04

Migrated from deprecated ADAL to MSAL through the latest azidentity SDK.

rouke-broersma commented 1 month ago

It seems the blobfuse driver migrated from ADAL to MSAL with v2.3.0: https://github.com/Azure/azure-storage-fuse/blob/main/CHANGELOG.md#230preview1-2024-04-04

Migrated from deprecated ADAL to MSAL through the latest azidentity SDK.

In that case, the Azure CSI driver may be able to implement this for blob storage without access keys once it updates to this version!