kubernetes-sigs / aws-ebs-csi-driver

CSI driver for Amazon EBS https://aws.amazon.com/ebs/
Apache License 2.0

FSGroup and fsGroupChangePolicy not supported with ReadWriteOncePod AccessMode #1982

Open GDegrove opened 5 months ago

GDegrove commented 5 months ago

/kind bug

Hello,

I think I've found a bug with the new ReadWriteOncePod access mode in the latest EBS CSI driver.

What happened?

When deploying a StatefulSet, I realized that the ReadWriteOncePod access mode does not respect fsGroup and fsGroupChangePolicy. With that access mode, the disk is mounted with root:root ownership and the process cannot write to the disk.

What you expected to happen?

The volume is mounted into the pod with the right mode, and the process can write to the disk.

How to reproduce it (as minimally and precisely as possible)?


- check the /data mount point:

k exec -it busybox-0 -- sh -c 'ls -lah /data'
total 20K
drwxr-xr-x    3 root     root        4.0K Mar 22 09:59 .
drwxr-xr-x    1 root     root          75 Mar 22 10:03 ..
drwx------    2 root     root       16.0K Mar 22 09:59 lost+found


- change the securityContext

apiVersion: v1
kind: Service
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  ports:

k exec -it busybox-0 -- sh -c 'touch /data/test2 && ls -lah /data'
touch: /data/test2: Permission denied
command terminated with exit code 1

Same experiment with ReadWriteOnce:

---
apiVersion: v1
kind: Service
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  ports:
  - port: 8088
    name: api
  clusterIP: None
  selector:
    app: busybox
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  serviceName: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - name: busybox
        image: busybox
        args:
        - /bin/sh
        - -c
        - while true; do echo "hello"; sleep 2; done
        ports:
        - containerPort: 8088
          name: api  
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: 
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
k exec -it busybox-0 -- sh -c 'touch /data/test && ls -lah /data'
total 20K
drwxr-xr-x    3 root     root        4.0K Mar 22 10:14 .
drwxr-xr-x    1 root     root          63 Mar 22 10:14 ..
drwx------    2 root     root       16.0K Mar 22 10:14 lost+found
-rw-r--r--    1 root     root           0 Mar 22 10:14 test

update pod:

---
apiVersion: v1
kind: Service
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  ports:
  - port: 8088
    name: api
  clusterIP: None
  selector:
    app: busybox
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  serviceName: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 3000
        fsGroupChangePolicy: "OnRootMismatch"
      containers:
      - name: busybox
        image: busybox
        args:
        - /bin/sh
        - -c
        - while true; do echo "hello"; sleep 2; done
        ports:
        - containerPort: 8088
          name: api  
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: 
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi

Check:

k exec -it busybox-0 -- sh -c 'touch /data/test2 && ls -lah /data'
total 20K
drwxrwsr-x    3 root     3000        4.0K Mar 22 10:15 .
drwxr-xr-x    1 root     root          63 Mar 22 10:15 ..
drwxrws---    2 root     3000       16.0K Mar 22 10:14 lost+found
-rw-rw-r--    1 root     3000           0 Mar 22 10:14 test
-rw-r--r--    1 1000     3000           0 Mar 22 10:15 test2

Anything else we need to know?:

Environment

ConnorJC3 commented 5 months ago

Hi @GDegrove - the EBS CSI Driver is not responsible for managing or applying the pod's fsGroup settings; that is done by Kubernetes (specifically, by the kubelet). I am going to leave this issue open for our team to investigate, but I would suggest also reporting this issue to Kubernetes itself (edit: this appears to be intentional, see my next comment below).

ConnorJC3 commented 5 months ago

Upon further inspection, it appears this behavior in Kubernetes is intentional when the CSIDriver's fsGroupPolicy is set to ReadWriteOnceWithFSType:

https://github.com/kubernetes/kubernetes/blob/95a6f2e4dcc2801612933707b05d31609744ada7/pkg/volume/csi/csi_mounter.go#L474-L476
https://github.com/kubernetes/kubernetes/blob/95a6f2e4dcc2801612933707b05d31609744ada7/pkg/volume/csi/csi_util.go#L131-L142
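
For readers who do not want to open the links, the check can be paraphrased roughly as the shell function below. This is a simplified sketch and not the kubelet's actual Go code: the name `fsgroup_applied` and its argument order are made up for illustration, and it assumes the pod sets an fsGroup and the volume is mounted read-write.

```shell
#!/bin/sh
# Simplified paraphrase of the kubelet's decision on whether to apply
# fsGroup to a CSI volume. fsgroup_applied is a hypothetical helper.
# Usage: fsgroup_applied POLICY FSTYPE ACCESS_MODE...
# Prints "yes" or "no".
fsgroup_applied() {
    policy="$1"
    fstype="$2"
    shift 2

    case "$policy" in
        None) echo no; return ;;   # driver opts out of fsGroup handling
        File) echo yes; return ;;  # always applied, whatever the access mode
    esac

    # ReadWriteOnceWithFSType: a filesystem type must be set ...
    if [ -z "$fstype" ]; then
        echo no
        return
    fi

    # ... and the access modes must contain exactly ReadWriteOnce.
    for mode in "$@"; do
        if [ "$mode" = "ReadWriteOnce" ]; then
            echo yes
            return
        fi
    done
    echo no
}

fsgroup_applied ReadWriteOnceWithFSType ext4 ReadWriteOnce     # prints "yes"
fsgroup_applied ReadWriteOnceWithFSType ext4 ReadWriteOncePod  # prints "no"
fsgroup_applied File ext4 ReadWriteOncePod                     # prints "yes"
```

The second call corresponds to the situation reported in this issue: ReadWriteOncePod is compared against the literal ReadWriteOnce mode, does not match, and fsGroup is skipped.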

We did change the default back because of issue https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/1365 - but this change has not been applied to the EKS Addon version of the driver due to a limitation in the EKS Addons service.

Thus, you should be able to resolve this issue by either:

  1. Switching to the Helm version of the EBS CSI Driver, which has the fsGroupPolicy set to File by default
  2. Manually deleting and recreating the ebs.csi.aws.com CSIDriver object of the EKS Addon installation to set fsGroupPolicy to File
    Delete the existing CSIDriver object and apply one that looks like this (the fsGroupPolicy field is immutable, so the CSIDriver has to be recreated in order to change it):
    apiVersion: storage.k8s.io/v1
    kind: CSIDriver
    metadata:
      name: ebs.csi.aws.com
      labels:
        app.kubernetes.io/component: csi-driver
        app.kubernetes.io/managed-by: EKS
        app.kubernetes.io/name: aws-ebs-csi-driver
        app.kubernetes.io/version: 1.28.0
    spec:
      attachRequired: true
      podInfoOnMount: false
      fsGroupPolicy: File
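
The second option could be scripted along these lines. This is only a sketch: `recreate_csidriver` is a hypothetical helper name, the manifest path is whatever file you saved the CSIDriver YAML above into, and it assumes `kubectl` is already pointed at the affected cluster.

```shell
#!/bin/sh
# Sketch of workaround 2. recreate_csidriver is a hypothetical helper;
# its argument is the path to a file containing the CSIDriver manifest.
recreate_csidriver() {
    manifest="$1"

    # Print the current policy so the change can be confirmed afterwards.
    kubectl get csidriver ebs.csi.aws.com \
        -o jsonpath='{.spec.fsGroupPolicy}' || return 1
    echo

    # fsGroupPolicy is immutable, so the object is deleted and re-applied
    # instead of being patched in place.
    kubectl delete csidriver ebs.csi.aws.com || return 1
    kubectl apply -f "$manifest" || return 1
}
```

It would be called as `recreate_csidriver csidriver.yaml`; running the `kubectl get` line again afterwards should then report File.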

We're aware this is a subpar experience for EKS Addon users and are looking to change the default on the EKS Addons version of the driver too, but I don't have any ETA on that at the moment.

mugdha-adhav commented 4 months ago

@ConnorJC3 based on k8s docs for pod security context, since k8s v1.26 the process of setting file ownership and permissions based on the fsGroup specified in the securityContext will be performed by the CSI driver instead of Kubernetes.

Is this not the case yet and are we still depending on Kubernetes for managing fsGroup settings?

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

AndrewSirenko commented 1 week ago

/remove-lifecycle stale

/status lifecycle/frozen

gnufied commented 1 week ago

This seems like a bug in k8s. Even with the default ReadWriteOnceWithFSType policy, using ReadWriteOncePod should result in the volume being modified according to the specified fsGroup. This is likely an oversight from when ReadWriteOncePod was implemented.

/transfer kubernetes

gnufied commented 1 week ago

Filed - https://github.com/kubernetes/kubernetes/issues/127170 for fixing this in k8s.