awslabs / mountpoint-s3-csi-driver

Built on Mountpoint for Amazon S3, the Mountpoint CSI driver presents an Amazon S3 bucket as a storage volume accessible by containers in your Kubernetes cluster.
Apache License 2.0

Can't mount s3 bucket. (Permission denied) #173

Closed PaveLGIL closed 2 months ago

PaveLGIL commented 3 months ago

/kind bug

The problem does not exist in Kubernetes version 1.27.

Hello! I have a service account with a role that contains the following policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:ListBucket",
            "Effect": "Allow",
            "Resource": [
                "bucket"
            ],
            "Sid": "S3ListBuckets"
        },
        {
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "bucket/*"
            ],
            "Sid": "S3CRUD"
        },
        {
            "Action": [
                "kms:ReEncrypt*",
                "kms:GetPublicKey",
                "kms:GenerateDataKey*",
                "kms:Encrypt",
                "kms:DescribeKey",
                "kms:Decrypt"
            ],
            "Effect": "Allow",
            "Resource": "key-arn",
            "Sid": "KMS"
        }
    ]
}
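
The Resource values above appear redacted; in a real policy they must be full ARNs: the bucket ARN for s3:ListBucket and the objects ARN for the object actions. A minimal sketch of the first two statements, with my-bucket as a placeholder bucket name:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ListBuckets",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::my-bucket"
        },
        {
            "Sid": "S3CRUD",
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::my-bucket/*"
        }
    ]
}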

My PV:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: {{ .Values.volume.pvcName }}-pv
spec:
  capacity:
    storage: 1200Gi # ignored, required
  accessModes:
    - ReadWriteMany # supported options: ReadWriteMany / ReadOnlyMany
  mountOptions:
    - allow-delete
    - region eu-central-1
  csi:
    driver: s3.csi.aws.com # required
    volumeHandle: s3-csi-driver-volume
    volumeAttributes:
      bucketName: {{ .Values.volume.versions.s3VersionsBucketName }}

My PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ .Values.volume.pvcName }}-claim
spec:
  accessModes:
    - ReadWriteMany # supported options: ReadWriteMany / ReadOnlyMany
  storageClassName: "" # required for static provisioning
  resources:
    requests:
      storage: 1200Gi # ignored, required
  volumeName: {{ .Values.volume.pvcName }}-pv

But I encounter this problem in the logs:

1 node.go:65] NodePublishVolume: req: volume_id:"s3-csi-driver-volume" target_path:"/var/lib/kubelet/pods/420753ac-d284-4e86-bc6f-4083ae1de68c/volumes/kubernetes.io~csi/<volume>/mount" volume_capability:<mount:<mount_flags:"allow-delete" mount_flags:"region eu-central-1" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > volume_context:<key:"bucketName" value:"bucket" > 
1 node.go:112] NodePublishVolume: mounting bucket at /var/lib/kubelet/pods/420753ac-d284-4e86-bc6f-4083ae1de68c/volumes/kubernetes.io~csi/<pv>/mount with options [--allow-delete --region=eu-central-1]
1 driver.go:96] GRPC error: rpc error: code = Internal desc = Could not mount "bucket" at "/var/lib/kubelet/pods/420753ac-d284-4e86-bc6f-4083ae1de68c/volumes/kubernetes.io~csi/<pv>/mount": Mount failed: Failed to start service output: Error: Failed to create S3 client

Caused by:
    0: initial ListObjectsV2 failed for bucket bucket in region eu-central-1
    1: Client error
    2: Forbidden: Access Denied

Error: Failed to create mount process
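
The request that fails is Mountpoint's initial ListObjectsV2 check. One way to separate an IAM problem from a driver problem is to reproduce that call with the AWS CLI under the same role (a sketch; "bucket" is the redacted placeholder from above):

# Run with the driver's role assumed, e.g. from a pod annotated with the same
# IRSA role. An Access Denied here points at IAM/KMS, not at the driver.
aws s3api list-objects-v2 --bucket bucket --region eu-central-1 --max-keys 1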

Could you please help me with this?

**Environment**
- Kubernetes version (use `kubectl version`): 1.28
- Driver version: 1.4.0 (same with 1.1.0)
happosade commented 3 months ago

Same problem with latest EKS, IPv6 and bottlerocket AMIs.

Thought it was related to IPv6 stack, but maybe not. https://github.com/awslabs/mountpoint-s3-csi-driver/issues/158#issuecomment-2022552227

ali-panahi commented 2 months ago

Same problem with latest EKS (1.29) and Bottlerocket AMIs. I suspect the service account's (s3-csi-controller-sa) annotated role is not being used during the API call to S3.

jjkr commented 2 months ago

I tried this on a Bottlerocket EKS cluster running both 1.28 and 1.29 (upgraded from 1.27). Everything seems to be working as expected on my clusters, so here are a few things to check:

  1. Ensure OIDC is enabled on the cluster. This command should produce output: `aws iam list-open-id-connect-providers | grep $(aws eks describe-cluster --name $MY_CLUSTER --query "cluster.identity.oidc.issuer" --output text | sed 's/.*\///')`
  2. Check that your service account is annotated properly: `kubectl describe sa s3-csi-driver-sa -n YOUR_NAMESPACE`. It should have an annotation like this: `Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/s3-csi-driver-role`
  3. Ensure the proper trust relationship is in that role. It should look something like this:
    {  "Version": "2012-10-17",
    "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:sub": "system:serviceaccount:SERVICE_ACCOUNT_NAMESPACE:SERVICE_ACCOUNT_NAME",
          "oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:aud": "sts.amazonaws.com"
        }
      }
    }
    ]
    }

I took these steps from this knowledge base, which has some more details: https://repost.aws/knowledge-center/eks-troubleshoot-oidc-and-irsa. Also the documentation for IRSA might be helpful: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

The original report says it was working on 1.27, so there might be something else going on here, but these are the first things to check. If that still doesn't fix the issue, try to verify that the IRSA credentials are getting into the driver container with a command like this: `kubectl exec s3-csi-node-XXXXX -i -t -n kube-system -- env` (you'll need to get the pod name from your cluster). The following environment variables should be present: `AWS_ROLE_ARN`, `AWS_WEB_IDENTITY_TOKEN_FILE`, and `AWS_STS_REGIONAL_ENDPOINTS`. If they are all present, there may be a driver issue getting that context down to the mount process.
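
For example (the pod name is a placeholder, and the app=s3-csi-node label selector is an assumption about the chart's labels):

# List the driver's node pods, then check that the IRSA env vars were injected
kubectl get pods -n kube-system -l app=s3-csi-node
kubectl exec -n kube-system s3-csi-node-XXXXX -- env | \
  grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE|AWS_STS_REGIONAL_ENDPOINTS'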

ali-panahi commented 2 months ago

Hello @jjkr, thanks, but as mentioned in the official example https://github.com/awslabs/mountpoint-s3-csi-driver/blob/main/examples/kubernetes/static_provisioning/static_provisioning.yaml, we expect this to work in all namespaces without creating a specific service account in a specific namespace and passing a service account option to the pod while using the s3-csi-driver.

Note: we are using the Mountpoint for Amazon S3 CSI Driver add-on, and the service account s3-csi-driver-sa has been deployed in the kube-system namespace.

PaveLGIL commented 2 months ago


@jjkr Thank you for the answer! But I have checked all of those things previously and unfortunately they all look good:

  1. OIDC is enabled
  2. The service account is annotated properly
  3. The trust relationship looks good too:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.eu-central-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:sub": "system:serviceaccount:kube-system:s3-csi-driver-sa"
                }
            }
        }
    ]
}

Environment variables are also present (screenshot attached).

myevit commented 2 months ago

Same issue

subaroon commented 2 months ago

Same issue. I tried the following driver versions in Kubernetes 1.29, but it didn't work.

PaveLGIL commented 2 months ago

So, guys, I want to say sorry: my issue is solved. I had an incorrect OIDC provider URL. I will close this bug. On version 1.28 it works as expected.
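
For anyone else hitting this: the issuer in the role's trust policy must match the cluster's OIDC issuer exactly (minus the https:// scheme). One way to compare the two (a sketch; $MY_CLUSTER and the role name are placeholders):

# The cluster's issuer...
aws eks describe-cluster --name $MY_CLUSTER \
  --query "cluster.identity.oidc.issuer" --output text
# ...must match the Federated principal in the role's trust policy
# (provider ARNs and condition keys drop the https:// prefix)
aws iam get-role --role-name s3-csi-driver-role \
  --query "Role.AssumeRolePolicyDocument.Statement[0].Principal.Federated" --output text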