qnap-dev / QNAP-CSI-PlugIn

Many non-root deployments fail with permission denied when trying to write to QNAP volumes #16

Open pkerwien opened 1 week ago

pkerwien commented 1 week ago

Third-party applications such as mariadb-operator, cloudnative-pg and the bitnami/mariadb Helm chart all fail to deploy when they use a PVC on the QNAP NAS.

This happens because the CSI driver does not change the volume's ownership and permissions at mount time when fsGroup is set in the manifest. That change is what allows the non-root container user to write to the filesystem.

I tested the following demo deployment with both Longhorn and QNAP-CSI:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mariadb
spec:
  selector:
    matchLabels:
      app: mariadb
  template:
    metadata:
      labels:
        app: mariadb
    spec:
      securityContext:
        fsGroup: 999
        fsGroupChangePolicy: Always
      containers:
        - name: mariadb
          image: mariadb:11.4.3
          ports:
            - containerPort: 3306
          env:
            - name: MARIADB_ROOT_PASSWORD
              value: t0ps3cr3t
          command: ["sleep", "9999999"]
          volumeMounts:
            - mountPath: /var/lib/mysql
              name: db-data
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            seccompProfile:
              type: RuntimeDefault  
            runAsUser: 999
            runAsGroup: 999
      volumes:
        - name: db-data
          persistentVolumeClaim:
            claimName: mariadb

With Longhorn CSI, the result is:

$ kubectl exec -it mariadb-6d88bd45c4-kwrmz -- ls -ldn /var/lib/mysql
drwxrwsr-x 3 0 999 4096 Sep  1 10:36 /var/lib/mysql

With QNAP-CSI:

PVC:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mariadb
  annotations:
    trident.qnap.io/ThinAllocate: "true"
    trident.qnap.io/threshold: "80"
    # QuTS-hero features
    trident.qnap.io/Deduplication: "false"
    trident.qnap.io/Compression: "true"
    trident.qnap.io/FastClone: "true"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: standard

$ kubectl exec -it mariadb-5576bb9b88-29tbh -- ls -ldn /var/lib/mysql
drwxr-xr-x 3 0 0 4096 Sep  2 09:20 /var/lib/mysql

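Since the container in this example runs as UID and GID 999, it can write to the Longhorn volume, which is group-owned by GID 999 and group-writable, but not to the QNAP-CSI volume, which is owned by 0:0 and writable only by its owner.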
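
To make the comparison of the two `ls -ldn` lines explicit, here is a minimal sketch that reads the mode, owner and group from such a line and reports whether a given UID/GID may write. `can_write` is a hypothetical helper written only for this illustration. It assumes plain POSIX permission bits and ignores root's override, capabilities, ACLs and any supplementary groups beyond the single GID passed in.

```shell
# can_write LS_LINE UID GID
# Returns 0 if the owner/group/other write bit that applies to UID/GID is set.
# Hypothetical helper for illustration; simplified POSIX check only.
can_write() {
  line="$1"; uid="$2"; gid="$3"
  # Fields of `ls -ldn`: mode, link count, owner UID, group GID, rest
  read -r mode links owner group rest <<EOF
$line
EOF
  if [ "$uid" = "$owner" ]; then
    bit=$(printf '%s' "$mode" | cut -c3)   # owner write bit
  elif [ "$gid" = "$group" ]; then
    bit=$(printf '%s' "$mode" | cut -c6)   # group write bit
  else
    bit=$(printf '%s' "$mode" | cut -c9)   # other write bit
  fi
  [ "$bit" = "w" ]
}

# The two directory listings from this report, checked for UID/GID 999:
can_write "drwxrwsr-x 3 0 999 4096 Sep  1 10:36 /var/lib/mysql" 999 999 \
  && echo "Longhorn: writable"
can_write "drwxr-xr-x 3 0 0 4096 Sep  2 09:20 /var/lib/mysql" 999 999 \
  || echo "QNAP-CSI: not writable"
```

For the Longhorn listing the group matches and the group write bit is set. For the QNAP-CSI listing neither owner nor group matches, so only the "other" bits apply, and they do not include write.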

Another example is using the mariadb-operator to deploy mariadb databases. The database pods will fail with:

2024-09-02 09:32:17+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:11.4.3+maria~ubu2404 started.
2024-09-02 09:32:17+00:00 [Note] [Entrypoint]: Initializing database files
2024-09-02  9:32:17 0 [Warning] Can't create test file '/var/lib/mysql/mariadb.lower-test' (Errcode: 13 "Permission denied")
2024-09-02  9:32:17 0 [ERROR] mariadbd: Can't create/write to file './ddl_recovery.log' (Errcode: 13 "Permission denied")
2024-09-02  9:32:17 0 [ERROR] DDL_LOG: Failed to create ddl log file: ./ddl_recovery.log
2024-09-02  9:32:17 0 [ERROR] Aborting

Installation of system tables failed!  Examine the logs in
/var/lib/mysql/ for more information.

The problem could be conflicting information in an external
my.cnf files. You can ignore these by doing:

    shell> /usr/bin/mariadb-install-db --defaults-file=~/.my.cnf

You can also try to start the mariadbd daemon with:

    shell> /usr/sbin/mariadbd --skip-grant-tables --general-log &

and use the command line tool /usr/bin/mariadb
to connect to the mysql database and look at the grant tables:

    shell> /usr/bin/mariadb -u root mysql
    MariaDB> show tables;

Try '/usr/sbin/mariadbd --help' if you have problems with paths.  Using
--general-log gives you a log in /var/lib/mysql/ that may be helpful.

The latest information about mariadb-install-db is available at
https://mariadb.com/kb/en/installing-system-tables-mysql_install_db
You can find the latest source at https://downloads.mariadb.org and
the maria-discuss email list at https://launchpad.net/~maria-discuss

Please check all of the above before submitting a bug report
at https://mariadb.org/jira

Please add necessary fsGroup support into the CSI driver so all these non-root applications can be deployed using QNAP volumes.

My setup is:

My storage class for QNAP PVCs is:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: csi.trident.qnap.io
parameters:
  selector: "performance=standard"
allowVolumeExpansion: true

davidcheng0716 commented 1 week ago

@pkerwien Thank you for the report. The following steps may help:

(1) kubectl edit csidriver csi.trident.qnap.io
(2) Change fsGroupPolicy from "ReadWriteOnceWithFSType" to "File"
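
The same change can probably be made without an interactive editor. The following is a sketch only: `set_fsgroup_policy` is a hypothetical wrapper name, not something shipped with the QNAP plugin, and it assumes the cluster allows `spec.fsGroupPolicy` to be modified on an existing CSIDriver object (the interactive edit in this thread suggests it does here).

```shell
# set_fsgroup_policy DRIVER POLICY
# Sets spec.fsGroupPolicy on a CSIDriver object with a JSON merge patch.
# Hypothetical helper for illustration; not part of the QNAP CSI plugin.
set_fsgroup_policy() {
  driver="$1"   # CSIDriver object name, e.g. csi.trident.qnap.io
  policy="$2"   # None, File or ReadWriteOnceWithFSType
  kubectl patch csidriver "$driver" --type merge \
    -p "{\"spec\":{\"fsGroupPolicy\":\"$policy\"}}"
}
```

It would be called as `set_fsgroup_policy csi.trident.qnap.io File`. Since fsGroup ownership is applied when a volume is mounted, pods that were already running before the change most likely need to be recreated to pick it up.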

pkerwien commented 1 week ago

Thanks! This looks promising. The mariadb-operator has now managed to deploy a DB cluster. I will do more testing later.

pkerwien commented 1 week ago

@davidcheng0716 All previously failing deployments work now! Can this change be made during installation of QNAP CSI (to avoid patching), or will it become the default in a future release?

pkerwien commented 1 week ago

@davidcheng0716 I just discovered that the Longhorn CSI driver uses fsGroupPolicy: ReadWriteOnceWithFSType (same as QNAP CSI before patching). And in the longhorn storage class, I can see fsType: ext4. Perhaps that is the reason fsGroup works as expected when using Longhorn. From https://kubernetes-csi.github.io/docs/support-fsgroup.html:

"ReadWriteOnceWithFSType: Indicates that volumes will be examined to determine if volume ownership and permissions should be modified to match the pod's security policy. Changes will only occur if the fsType is defined and the persistent volume's accessModes contains ReadWriteOnce."

In my QNAP storage class, there is no such fsType parameter. I am not sure whether I can add one, or whether that would make fsGroup work without having to patch the CSIDriver.
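
One way to check this theory is to look at what the cluster actually records: the policy on the CSIDriver object and the fsType on the PersistentVolume bound to the claim. The sketch below uses only standard `kubectl get` with JSONPath output. The function names are hypothetical, and the default namespace `default` is an assumption.

```shell
# Print the fsGroupPolicy currently set on a CSIDriver object.
get_fsgroup_policy() {
  kubectl get csidriver "$1" -o jsonpath='{.spec.fsGroupPolicy}'
}

# Print the fsType recorded on the PV bound to a PVC (empty if none is set).
# Usage: get_pvc_fstype PVC_NAME [NAMESPACE]
get_pvc_fstype() {
  pvc="$1"; ns="${2:-default}"
  # Resolve the bound PV name first, then read its CSI fsType field.
  pv=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}') || return 1
  kubectl get pv "$pv" -o jsonpath='{.spec.csi.fsType}'
}
```

For example, `get_fsgroup_policy csi.trident.qnap.io` and `get_pvc_fstype mariadb` could be run against both the Longhorn-backed and the QNAP-backed claim. If the QNAP-backed PV shows an empty fsType while the Longhorn one shows ext4, that would be consistent with the quoted description of ReadWriteOnceWithFSType.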

LeonaChen2727 commented 1 week ago

We appreciate your suggestion and will consider setting it as the default in a future version.