kubernetes-sigs / aws-ebs-csi-driver

CSI driver for Amazon EBS https://aws.amazon.com/ebs/
Apache License 2.0

CSI Driver PV creation failure #1140

Closed brickpattern closed 2 years ago

brickpattern commented 2 years ago

I followed the examples given in this repo to create a Pod "app" with a claim and a StorageClass.

This results in the error below, as seen in the PVC describe output and in the controller logs. Any suggestions on what I am missing?

Type     Reason                Age                   From                                                                                           Message
Normal   WaitForFirstConsumer  11m                   persistentvolume-controller                                                                    waiting for first consumer to be created before binding
Warning  ProvisioningFailed    11m                   ebs.csi.aws.com_ebs-csi-controller-56c69bcf65-d5vjw_a9ed942d-7fa7-4133-aa06-2be287a4969e  failed to provision volume with StorageClass "ebscsi": rpc error: code = Internal desc = Could not create volume "pvc-fe352e9e-5d8b-4fd1-a5ab-6fc2493cce37": failed to get an available volume in EC2: InvalidVolume.NotFound: The volume 'vol-036f02de0e81cbb95' does not exist. status code: 400, request id: f4fc598c-5822-4c5b-9719-40e3741f6349
Normal   Provisioning          3m16s (x11 over 11m)  ebs.csi.aws.com_ebs-csi-controller-56c69bcf65-d5vjw_a9ed942d-7fa7-4133-aa06-2be287a4969e  External provisioner is provisioning volume for claim "default/ebscsi"
Warning  ProvisioningFailed    3m16s (x10 over 11m)  ebs.csi.aws.com_ebs-csi-controller-56c69bcf65-d5vjw_a9ed942d-7fa7-4133-aa06-2be287a4969e  failed to provision volume with StorageClass "ebscsi": rpc error: code = AlreadyExists desc = Could not create volume "pvc-fe352e9e-5d8b-4fd1-a5ab-6fc2493cce37": Parameters on this idempotent request are inconsistent with parameters used in previous request(s)
Normal   ExternalProvisioning  101s (x42 over 11m)   persistentvolume-controller

brickpattern commented 2 years ago

This is using the latest v1.4.0-eksbuild.preview of the AWS EBS CSI Driver add-on.

gtxu commented 2 years ago

Hi @brickpattern, could you point me to the example you are trying to follow and share the details of the .yaml file (only the parts you modified)? Also, can you double-check that your AWS CLI credentials match the cluster owner's?

brickpattern commented 2 years ago

I repurposed this example; the only change I made is the zone/region I'm operating in. https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/examples/kubernetes/storageclass/specs/example.yaml

Yes, my AWS CLI is good, as I'm able to deploy other K8s in-tree storage pods/PVCs. Only the EBS CSI driver example is failing.

"ebscsi" is the StorageClass name in the error logs above. I had first tried "ebs-sc" and got the same error, so I renamed it to "ebscsi" for the reattempt.

brickpattern commented 2 years ago

Error logs from csi-plugin container

driver.go:119] GRPC error: rpc error: code = AlreadyExists desc = Could not create volume "pvc-0315dafb-245d-47e3-adef-74f782688d1c": Parameters on this idempotent request are inconsistent with parameters used in previous request(s)

Since the PV never got created, how can I check whether this volume ID truly exists on the EBS end? Searching in the EBS AWS console doesn't find any match.
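One way to check whether a volume ID from the logs exists on the EBS side is to ask EC2 directly. The sketch below is only an illustration: `volume_exists` is a hypothetical helper name, and it assumes a boto3 EC2 client is passed in (boto3 is not mentioned in this thread). It relies on `describe_volumes` raising an error whose `response["Error"]["Code"]` is `InvalidVolume.NotFound`, the same code that shows up in the events above.

```python
def volume_exists(ec2_client, volume_id):
    """Return True if EC2 reports the volume, False on InvalidVolume.NotFound.

    ec2_client is assumed to behave like boto3.client("ec2"); any other
    error (credentials, throttling, ...) is re-raised unchanged.
    """
    try:
        resp = ec2_client.describe_volumes(VolumeIds=[volume_id])
    except Exception as exc:
        # botocore's ClientError carries the AWS error code in exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "InvalidVolume.NotFound":
            return False
        raise
    return len(resp.get("Volumes", [])) > 0


# Example call (needs boto3 and AWS credentials; the region is a placeholder):
#   import boto3
#   volume_exists(boto3.client("ec2", region_name="us-east-1"),
#                 "vol-036f02de0e81cbb95")
```

The check has to run in the same region and account as the cluster, otherwise a missing volume tells you nothing.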

brickpattern commented 2 years ago

I deleted/uninstalled the CSI Driver add-on from the AWS console, suspecting a caching issue at the controller/csi-provisioner end, and then reinstalled the add-on. Still the same error. This leaves me with the API endpoint the CSI driver is talking to.

juergenz commented 2 years ago

@brickpattern I had the same/similar problem and error logs.

The IAM role used by aws-ebs-csi-driver had no permission to use the KMS CMK I had referenced in the StorageClass.

brickpattern commented 2 years ago

I defined the StorageClass with the parameter encryption: "false".

Wouldn't that make the KMS key and EBS encryption void?

brickpattern commented 2 years ago

@juergenz thanks for the tip.

Resolved!!!

I added this policy snippet to the role/policy attached to the CSI Driver, and it's now able to create the volume.

{
    "Sid": "MinimalEBSKMSCreateandAttach",
    "Effect": "Allow",
    "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKeyWithoutPlaintext",
        "kms:CreateGrant"
    ],
    "Resource": "*"
}

However, it's still curious to me: the StorageClass parameter encrypted: "false" did not mandate an unencrypted volume. It appears the parameter is only a qualifier. Anyway, this was the hard way.
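The statement above uses "Resource": "*". A narrower variant would list only the keys the driver needs. The sketch below generates such a statement; `ebs_kms_statement` is a hypothetical helper, the ARN is a placeholder, and whether these three actions are enough for every setup is an assumption based only on what worked in this thread.

```python
import json

# The three actions from the snippet posted in this thread.
KMS_ACTIONS = [
    "kms:Decrypt",
    "kms:GenerateDataKeyWithoutPlaintext",
    "kms:CreateGrant",
]


def ebs_kms_statement(key_arns):
    """Build the Allow statement from the thread, limited to the given key ARNs."""
    if not key_arns:
        raise ValueError("at least one KMS key ARN is required")
    return {
        "Sid": "MinimalEBSKMSCreateandAttach",
        "Effect": "Allow",
        "Action": list(KMS_ACTIONS),
        # De-duplicate and sort so the generated policy is stable.
        "Resource": sorted(set(key_arns)),
    }


if __name__ == "__main__":
    # Placeholder ARN for illustration only.
    arn = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
    policy = {"Version": "2012-10-17", "Statement": [ebs_kms_statement([arn])]}
    print(json.dumps(policy, indent=4))
```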

rayjanoka commented 2 years ago

Thanks, I found that one of my KMS key ARNs (kms_key_arn) was missing from the IAM policy for ebs-csi-plugin.
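A missing key ARN like this can be spotted by checking the policy document offline. The following is a rough sketch, not an IAM evaluator: `missing_kms_actions` is a hypothetical helper, it only looks at Allow statements in a single identity policy, it ignores Deny, Condition, NotAction and the key policy itself, and it approximates IAM wildcards with Python's `fnmatch`.

```python
from fnmatch import fnmatchcase

# Actions the thread found necessary for the driver role.
REQUIRED = (
    "kms:Decrypt",
    "kms:GenerateDataKeyWithoutPlaintext",
    "kms:CreateGrant",
)


def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def missing_kms_actions(policy, key_arn, required=REQUIRED):
    """Return the required actions that no Allow statement grants on key_arn."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    granted = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        resources = _as_list(stmt.get("Resource", []))
        if not any(fnmatchcase(key_arn, pattern) for pattern in resources):
            continue
        patterns = [p.lower() for p in _as_list(stmt.get("Action", []))]
        for action in required:
            # Action names are compared case-insensitively.
            if any(fnmatchcase(action.lower(), p) for p in patterns):
                granted.add(action)
    return [a for a in required if a not in granted]
```

An empty list means every required action is allowed on that key by this policy; anything else names the actions to add.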

sumanthkumarc commented 2 years ago

The fix in https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/1140#issuecomment-993711047 worked for me. I had to add the policy below to my driver policy. Just wondering why this isn't part of the official driver policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKeyWithoutPlaintext",
                "kms:CreateGrant"
            ],
            "Resource": "*"
        }
    ]
}

fheinecke commented 1 year ago

For anybody else running into this issue, this can also happen if you use a KMS key alias on the storage class. Essentially what is happening is that the ec2:CreateVolume API call reports a success, but fails behind the scenes due to the role lacking KMS permissions to encrypt the new volume with the listed key. The CSI controller fails to attach the volume (which doesn't actually exist), then tries repeatedly to create the volume using the same client token for idempotency. The API calls then fail because the token has already been used to create the volume, despite internally failing the first time.
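The sequence described above can be replayed with a toy model. This is not the real EC2 API or the driver's code: `ToyEC2`, `provision`, and both exception classes are made up for illustration, and the model encodes only what this comment claims (a create call that reports success, a volume that never materializes, and a token that cannot be reused).

```python
class InvalidVolumeNotFound(Exception):
    """Stands in for EC2's InvalidVolume.NotFound."""


class TokenAlreadyUsed(Exception):
    """Stands in for the idempotency failure on a reused client token."""


class ToyEC2:
    """Toy stand-in for the behaviour described above; not the real EC2 API."""

    def __init__(self, role_can_use_key):
        self.role_can_use_key = role_can_use_key
        self.used_tokens = set()
        self.volumes = set()
        self._next_id = 0

    def create_volume(self, client_token):
        if client_token in self.used_tokens:
            raise TokenAlreadyUsed(client_token)
        self.used_tokens.add(client_token)
        self._next_id += 1
        volume_id = f"vol-{self._next_id:04d}"
        # Without KMS permissions the volume never materializes,
        # yet the call still reports success and returns an ID.
        if self.role_can_use_key:
            self.volumes.add(volume_id)
        return volume_id

    def describe_volume(self, volume_id):
        if volume_id not in self.volumes:
            raise InvalidVolumeNotFound(volume_id)
        return {"VolumeId": volume_id}


def provision(ec2, pvc_name, attempts=3):
    """Retry loop that reuses one client token per claim."""
    outcomes = []
    for _ in range(attempts):
        try:
            volume_id = ec2.create_volume(client_token=pvc_name)
            ec2.describe_volume(volume_id)
            outcomes.append("created")
            break
        except InvalidVolumeNotFound:
            outcomes.append("InvalidVolume.NotFound")
        except TokenAlreadyUsed:
            outcomes.append("AlreadyExists")
    return outcomes
```

With `ToyEC2(role_can_use_key=False)` the first attempt ends in `InvalidVolume.NotFound` and every retry in `AlreadyExists`, the same order as the two ProvisioningFailed events at the top of this issue. With `role_can_use_key=True` the first attempt succeeds.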

headyj commented 1 year ago

@fheinecke how did you manage to solve this issue? On my side, all the permissions above are correct, and the KMS alias on the StorageClass is also correct and corresponds to the one specified in my role. The annotation is also correctly set on the ebs-csi-controller-sa service account.

luis-fnogueira commented 7 months ago

This error is still happening. It should be in the official documentation.