Closed zoli-opslogic closed 1 month ago
Can you please double-check that the AmazonEBSCSIDriverPolicy managed policy is actually attached to your role? This looks like a misconfiguration.
I just went through the following steps:
1. Create role and attach managed policy:
eksctl create iamserviceaccount --cluster dev --namespace kube-system --name ebs-csi-controller-sa --approve --role-only --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy --region us-east-1 --role-name EBSCSIDriverRole
2. Retrieve role ARN:
ROLE_ARN=$(aws iam get-role --query 'Role.Arn' --output text --role-name EBSCSIDriverRole)
3. Install snapshot controller:
eksctl create addon --name snapshot-controller --cluster dev --region us-east-1
4. Install driver:
eksctl create addon --name aws-ebs-csi-driver --cluster dev --service-account-role-arn "$ROLE_ARN" --region us-east-1
and was able to successfully create a volume from a snapshot using the example manifests, without having to modify the managed policy. I also enabled SDK logs to confirm that the driver correctly added the necessary tags in the CreateVolume request.
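One way to double-check the attachment from the command line is sketched here. It assumes the AWS CLI is configured for the right account; the helper name `policy_attached` is made up for illustration, and the role name should be whatever role your driver actually uses.

```shell
# Hypothetical helper: succeeds only if the AWS managed policy
# AmazonEBSCSIDriverPolicy is attached to the given IAM role.
policy_attached() {
  role_name="$1"
  policy_arn="arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
  # --output text prints the ARNs tab-separated on one line;
  # split them so grep can match one whole ARN per line.
  aws iam list-attached-role-policies --role-name "$role_name" \
    --query 'AttachedPolicies[].PolicyArn' --output text \
    | tr '\t' '\n' | grep -qxF "$policy_arn"
}

# Usage (role name is an assumption, adjust to your setup):
#   policy_attached EBSCSIDriverRole && echo "attached" || echo "NOT attached"
```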
If you continue to run into auth issues, taking a look at the decoded version of the authorization failure will help with debugging.
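A minimal sketch of decoding that message with the AWS CLI. The helper name `decode_auth_failure` is hypothetical, and the caller needs the `sts:DecodeAuthorizationMessage` permission.

```shell
# Hypothetical helper: decode the "Encoded authorization failure message"
# string that EC2 appends to an UnauthorizedOperation error.
decode_auth_failure() {
  aws sts decode-authorization-message \
    --encoded-message "$1" \
    --query DecodedMessage --output text
}

# Usage: paste the encoded string from the error event.
#   decode_auth_failure '<encoded message>'
```

The decoded JSON should show which action, resource, and condition context were evaluated, which usually makes it clear which policy statement failed to match.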
Oh, there is a snapshot-controller EKS addon, nice. At the time I started implementing this solution, the official AWS documentation pointed to installing this external snapshotter. I tested uninstalling the CRDs together with the external snapshot controller and installing the snapshot-controller EKS addon, but it seems it does not install the CRDs, and I also had to create a VolumeSnapshotClass in order to be able to use VolumeSnapshots and VolumeSnapshotContents. Is that accurate? Did you have those already in place when you ran your tests? Having the CRDs installed separately from here and the VolumeSnapshotClass works, and I do not get the IAM error I was getting before (yay!). Can you confirm that this setup is OK? I mean, mostly that the snapshot-controller EKS addon does not install the required CRDs and configure a default VolumeSnapshotClass (these have to be done separately, like I did), or am I still missing something? Thank you!
Having the CRDs installed separately from here and the VolumeSnapshotClass works and I do not get the IAM error I was getting before (yay!)
Glad to hear 👍
Can you confirm that this setup is ok? I mean, mostly that the snapshot-controller EKS addon does not install the required CRDs and configure a default VolumeSnapshotClass (these have to be done separately like I did) or am I still missing something?
The snapshot-controller EKS addon installs everything needed for snapshots to work, including the necessary CRDs, but you do need to manually create the VolumeSnapshotClass, as its configuration varies based on user requirements.
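For reference, a minimal sketch of creating a VolumeSnapshotClass for the EBS CSI driver. The helper name, the default class name `ebs-csi-snapclass`, and the `Delete` default are assumptions for illustration; pick the deletion policy (`Delete` or `Retain`) that matches your requirements.

```shell
# Hypothetical helper: print a VolumeSnapshotClass manifest.
#   $1 = class name (placeholder default), $2 = deletionPolicy (Delete or Retain)
render_snapshot_class() {
  cat <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ${1:-ebs-csi-snapclass}
driver: ebs.csi.aws.com
deletionPolicy: ${2:-Delete}
EOF
}

# Usage:
#   render_snapshot_class | kubectl apply -f -
#   render_snapshot_class my-retained-class Retain | kubectl apply -f -
```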
Re-tested this, and indeed the EKS addon also installs the required CRDs. I manually created the VolumeSnapshotClass and all seems fine. Thank you for your answers, I will close this.
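For anyone repeating this check, a quick way to see whether the three snapshot CRDs are present is sketched here. The helper name `snapshot_crds_present` is made up; the CRD names are the standard ones from the `snapshot.storage.k8s.io` API group.

```shell
# Hypothetical helper: succeeds only if all three snapshot CRDs exist
# in the cluster that the current kubectl context points to.
snapshot_crds_present() {
  kubectl get crd \
    volumesnapshotclasses.snapshot.storage.k8s.io \
    volumesnapshotcontents.snapshot.storage.k8s.io \
    volumesnapshots.snapshot.storage.k8s.io >/dev/null
}

# Usage:
#   snapshot_crds_present && echo "CRDs installed" || echo "CRDs missing"
```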
Issue nonexistent with the snapshot-controller EKS addon
/kind bug
What happened? I am using this procedure to import and restore a previously created AWS EBS snapshot. (Not sure if relevant, but the snapshot was created with this procedure in another namespace on the same EKS cluster, and that works perfectly.) My storageclass resource and volumesnapshotclass resource look like this:
When creating a PVC with a VolumeSnapshot as its source, I get the following error event and the PVC remains in a Pending state forever:
E0915 10:31:11.191835 1 driver.go:108] "GRPC error" err="rpc error: code = Internal desc = Could not create volume \"pvc-321ad319-b39d-654a-9710-406543984532\": could not create volume in EC2: operation error EC2: CreateVolume, https response error StatusCode: 403, RequestID: b3b4128b-6caa-49df-900d-762d342d0901, api error UnauthorizedOperation: You are not authorized to perform this operation. User: arn:aws:sts::<AWS ACC NR>:assumed-role/AmazonEKS_EBS_CSI_DriverRoleTesting/1321496256754713177 is not authorized to perform: ec2:CreateVolume on resource: arn:aws:ec2:us-east-2::snapshot/snap-001sdfda734438e64 because no identity-based policy allows the ec2:CreateVolume action. Encoded authorization failure message: ...."
Comment for the above situation: the VolumeSnapshotContent is pointing to a specific AWS EBS snapshot handle, and the VolumeSnapshot was previously created and is in the ReadyToUse state. The PVC error events can be seen in the logs from ebs-csi-controller pod -> ebs-plugin container. Yes, the WaitForFirstConsumer directive is also reconciled by scaling up the StatefulSet that spins up the pod that uses it.
The EBS CSI driver is installed via EKS addons - Amazon EBS CSI Driver - v1.34.0-eksbuild.1. It uses IRSA - arn:aws:iam::<AWS ACC NR>:role/AmazonEKS_EBS_CSI_DriverRoleTesting with the AWS-maintained AmazonEBSCSIDriverPolicy - Version 2 policy attached. There is no issue if I create a PVC without referencing a VolumeSnapshot. I double-checked that the SA is correctly configured using the procedure from https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html#csi-iam-role. I tried different solutions: restarted ebs-csi-controller, recycled all EKS nodes, added additional tagging via the storageclass, etc.
The only thing that worked was explicitly adding a policy to arn:aws:iam::<AWS ACC NR>:role/AmazonEKS_EBS_CSI_DriverRoleTesting that allows ec2:CreateVolume without conditions.
Although this workaround solves my issue, this should work with the rules in place in the attached AmazonEBSCSIDriverPolicy - Version 2 policy, as all the official documentation presents it. See below an excerpt from the policy to highlight lines related to ec2:CreateVolume:
What you expected to happen? To be able to create PVCs referencing VolumeSnapshots.
How to reproduce it (as minimally and precisely as possible)? Explained in the What happened? section.
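For readers without the original manifests, here is a minimal sketch of the kind of PVC that triggers this code path. These are not the reporter's actual resources: the helper name and every value passed to it (PVC name, VolumeSnapshot name, storage class, size) are placeholders.

```shell
# Hypothetical helper: print a PVC manifest that restores from a VolumeSnapshot.
#   $1 = PVC name, $2 = VolumeSnapshot name, $3 = StorageClass name, $4 = size
render_restore_pvc() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  storageClassName: $3
  dataSource:
    name: $2
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: $4
EOF
}

# Usage (all values are placeholders):
#   render_restore_pvc restored-pvc imported-snapshot ebs-sc 10Gi | kubectl apply -f -
```

With a WaitForFirstConsumer storage class, the CreateVolume call (and therefore any IAM error) only appears once a pod that uses the PVC is scheduled.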
Anything else we need to know?
Environment: AWS EKS
Kubernetes version (use kubectl version): Client Version: v1.29.7, Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3, Server Version: v1.29.7-eks-2f46c53