openebs-archive / dynamic-nfs-provisioner

Operator for dynamically provisioning an NFS server on any Kubernetes Persistent Volume. Also creates an NFS volume on the dynamically provisioned server for enabling Kubernetes RWX volumes.
Apache License 2.0

nfs rwx folder has 000 as permission #124

Open mbu147 opened 3 years ago

mbu147 commented 3 years ago

Describe the bug: I created many RWX NFS shares with nfs-provisioner. As the backend StorageClass I use openebs-jiva. Sometimes every NFS share mount gets permission 000. The mount folder then looks like this:

root@nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6-fb8f5fd66-jfz6b:/ # ls -la
d---------   17 xfs      xfs           4096 Oct 29 14:15 nfsshare

in the nfs-pvc pod. Because of that, the nginx container where the nfs-pvc is mounted also sees 000 on the folder and cannot read the files within.

The files in the mount folder have the correct permissions.

Expected behaviour: Default mount permission 755 or something similar

Steps to reproduce the bug: Just create a new NFS RWX PVC and wait. After some time the running nginx container cannot read the folder anymore.
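
For reference, a minimal RWX claim along these lines (claim name and size are just illustrative) is enough to hit it:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-rwx-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: openebs-rwx
  resources:
    requests:
      storage: 5Gi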

The output of the following commands will help us better understand what's going on:

Anything else we need to know?: Jiva and NFS were installed via the Helm charts at https://github.com/openebs/jiva-operator/tree/develop/deploy/helm/charts and https://github.com/openebs/dynamic-nfs-provisioner/tree/develop/deploy/helm/charts. Helm config:

nfs-provisioner:
    rbac:
        pspEnabled: false
    podSecurityContext:
        fsGroup: 120

storage class:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-rwx
  annotations:
    openebs.io/cas-type: nfsrwx
    cas.openebs.io/config: |
      - name: NFSServerType
        value: "kernel"
      - name: BackendStorageClass
        value: "openebs-jiva-csi-default"
      #  LeaseTime defines the renewal period(in seconds) for client state
      - name: LeaseTime
        value: 30
      #  GraceTime defines the recovery period(in seconds) to reclaim locks
      - name: GraceTime
        value: 30
      #  FSGID defines the group permissions of NFS Volume. If it is set
      #  then non-root applications should add FSGID value under pod
      #  Supplemental groups
      - name: FSGID
        value: "120"
      - name: NFSServerResourceRequests
        value: |-
          cpu: 50m
          memory: 50Mi
      - name: NFSServerResourceLimits
        value: |-
          cpu: 100m
          memory: 100Mi
provisioner: openebs.io/nfsrwx
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
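
Per the FSGID comment in the StorageClass above, a non-root application pod is supposed to add the same GID under supplementalGroups. My consumers look roughly like this (pod name, image and claim name are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: nginx-rwx-example
spec:
  securityContext:
    supplementalGroups:
      - 120          # matches the FSGID value in the StorageClass
  containers:
    - name: nginx
      image: nginx:stable
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-rwx-pvc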

Environment details:

Do I have a misconfigured setup, or is this a bug?

Thanks for the help!

mittachaitu commented 3 years ago
image: openebs/provisioner-nfs:0.7.1 and image: openebs/jiva-operator:3.0.0

Hi @mbu147, I followed the same steps as mentioned in the description and observed that the permissions of the nfsshare directory are 755. Here is the output:

root@nfs-pvc-e155220f-63b7-4882-9104-98575910d9c9-69c97df57d-wrxq2:/ # ls -la
total 88
drwxr-xr-x    1 root     root          4096 Nov  2 06:42 .
drwxr-xr-x    1 root     root          4096 Nov  2 06:42 ..
drwxr-xr-x    3 root     root          4096 Nov  2 06:42 nfsshare
...
...
...

Steps followed to provision NFS volume:

StorageClass outputs:

 kubectl get sc
NAME                       PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
openebs-device             openebs.io/local      Delete          WaitForFirstConsumer   false                  140m
openebs-hostpath           openebs.io/local      Delete          WaitForFirstConsumer   false                  140m
openebs-jiva-csi-default   jiva.csi.openebs.io   Delete          Immediate              true                   140m
openebs-kernel-nfs         openebs.io/nfsrwx     Delete          Immediate              false                  140m

Did I miss anything? I'm not sure how you are getting d--------- 17 xfs xfs 4096 Oct 29 14:15 nfsshare, i.e. 000 permissions.

One more observation:

root@nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6-fb8f5fd66-jfz6b:/ # ls -la
d---------   17 xfs      xfs           4096 Oct 29 14:15 nfsshare

Can you help with the following outputs (maybe they will help to understand further)?
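
For example, outputs along these lines would help (object names follow the ones above; adjust the namespace to your install):

kubectl get pvc -A | grep nfs-pvc-3da88edc
kubectl describe deploy nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6 -n openebs
kubectl exec -n openebs deploy/nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6 -- stat /nfsshare
kubectl logs -n openebs deploy/nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6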

mbu147 commented 3 years ago

Hi @mittachaitu, thanks for your reply and testing process!

It looks like I'm doing the same, except that I'm not using the "global" Helm chart. I'll switch to the same chart as you and have a look.

I noticed that it apparently only occurs when a node is under high IO load, so that it needs to "reconnect" the mount points.

> ls -la shows a different owner & group? Are there any manual edits made on ownership? Usually, it should be root root by default...

In a fresh new PVC the folder is owned by root root with r-xr-xr-x. My nginx and php-fpm containers have UID and GID 33; in the PVC container, UID 33 is the xfs user. I think I did a chown nginx:nginx -R <nfs mount> after copying my data from the old PVC.

Thanks!

mittachaitu commented 3 years ago

> I noticed that it apparently only occurs when a node is under high IO load, so that it needs to "reconnect" the mount points.

Hmm, the system might be going into an RO state; if the Jiva volume is turning RO, then the d--------- permissions would make sense (AFAIK).
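
One quick check (a sketch using the deployment name from your output): look at the mount flags inside the NFS server pod, and at the JivaVolume CR status if the jiva-operator CRDs are installed; an ro flag on the /nfsshare mount would confirm the backend went read-only.

kubectl exec -n openebs deploy/nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6 -- sh -c "mount | grep nfsshare"
kubectl get jivavolumes -n openebs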

> My nginx and php-fpm containers have UID and GID 33; in the PVC container, UID 33 is the xfs user. I think I did a chown nginx:nginx -R after copying my data from the old PVC.

Yeah, currently nfs-provisioner only allows setting the FSGID, but there is an issue to support configuring the UID (which is being worked on), so that once the volume is provisioned the user does not need to run chown commands explicitly (anyway, this is a different problem).
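
Until that is available, a rough workaround on the application side (a sketch; names, image and the UID/GID 33 are illustrative, and it assumes the export lets root change ownership) is an initContainer that fixes ownership once on the mounted volume:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-rwx-chown
spec:
  initContainers:
    - name: fix-perms
      image: busybox:1.35
      # runs once as root before the app starts; adjust UID:GID to your app user
      command: ["sh", "-c", "chown -R 33:33 /data && chmod 755 /data"]
      volumeMounts:
        - name: data
          mountPath: /data
  containers:
    - name: nginx
      image: nginx:stable
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-rwx-pvc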

mbu147 commented 3 years ago

Okay, I understand... has no one had this issue besides me?

You asked for more information; I forgot to add this to my last post: kubectl get sc openebs-jiva-csi-default -o yaml

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"labels":{"argocd.argoproj.io/instance":"openebs"},"name":"openebs-jiva-csi-default"},"parameters":{"cas-type":"jiva","policy":"openebs-jiva-default-policy"},"provisioner":"jiva.csi.openebs.io","reclaimPolicy":"Delete","volumeBindingMode":"Immediate"}
  creationTimestamp: "2021-10-29T11:16:06Z"
  labels:
    argocd.argoproj.io/instance: openebs
  name: openebs-jiva-csi-default
  resourceVersion: "21060613"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/openebs-jiva-csi-default
  uid: f9275268-9a2b-4f20-a59b-ebdaede5b8e3
parameters:
  cas-type: jiva
  policy: openebs-jiva-default-policy
provisioner: jiva.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

kubectl get deploy nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6 -n openebs -o yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2021-10-29T12:23:27Z"
  generation: 1
  labels:
    openebs.io/nfs-server: nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6
  name: nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6
  namespace: openebs
  resourceVersion: "22196180"
  selfLink: /apis/apps/v1/namespaces/openebs/deployments/nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6
  uid: 4b9de5e8-a1b5-4ea1-b905-67ec358dc015
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      openebs.io/nfs-server: nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        openebs.io/nfs-server: nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6
    spec:
      containers:
      - env:
        - name: SHARED_DIRECTORY
          value: /nfsshare
        - name: CUSTOM_EXPORTS_CONFIG
        - name: NFS_LEASE_TIME
          value: "90"
        - name: NFS_GRACE_TIME
          value: "90"
        image: openebs/nfs-server-alpine:0.7.1
        imagePullPolicy: IfNotPresent
        name: nfs-server
        ports:
        - containerPort: 2049
          name: nfs
          protocol: TCP
        - containerPort: 111
          name: rpcbind
          protocol: TCP
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /nfsshare
          name: exports-dir
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: exports-dir
        persistentVolumeClaim:
          claimName: nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2021-10-29T12:23:27Z"
    lastUpdateTime: "2021-10-29T12:24:58Z"
    message: ReplicaSet "nfs-pvc-3da88edc-2b97-4165-93ba-49bc54056cc6-fb8f5fd66" has
      successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2021-10-31T10:33:15Z"
    lastUpdateTime: "2021-10-31T10:33:15Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1