siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

NFS Ganesha does not work with Talos 1.3.7 #7424

Closed VMAthreyas closed 3 months ago

VMAthreyas commented 1 year ago

Bug Report

Some of our users run NFS Ganesha on our Talos-based clusters (https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner/blob/master/deploy/kubernetes/deployment.yaml). It works fine with Kubernetes 1.23.14 and Talos 1.2.6, but it does not work with Kubernetes 1.24.13 and Talos 1.3.7. We have tested Kubernetes 1.24 with another provisioner (RKE) and it works fine there, so it looks like something changed between Talos 1.2.6 and 1.3.7.

Description

We use a PVC as the root disk for the in-cluster NFS server and try to mount it as /export on other pods. With Talos 1.3.7, however, a PVC that uses the StorageClass backed by this NFS server is never provisioned.

          volumeMounts:
            - name: export-volume
              mountPath: /export
      volumes:
        - name: export-volume
          persistentVolumeClaim:
            claimName: nesc-nfs-root

Logs

Error:

E0628 11:38:35.874156       1 controller.go:908] error syncing claim "6c2243e6-3297-4c7d-a382-1c9d28079b42": failed to provision volume with StorageClass "example-nfs": error creating export for volume: error exporting export block
EXPORT
{
        Export_Id = 1;
        Path = /export/pvc-6c2243e6-3297-4c7d-a382-1c9d28079b42;
        Pseudo = /export/pvc-6c2243e6-3297-4c7d-a382-1c9d28079b42;
        Access_Type = RW;
        Squash = no_root_squash;
        SecType = sys;
        Filesystem_id = 1.1;
        FSAL {
                Name = VFS;
        }
}
: error calling org.ganesha.nfsd.exportmgr.AddExport: 0 export entries in /export/vfs.conf added because (invalid param value) errors. Details:
Config File (/export/vfs.conf:43): 1 validation errors in block EXPORT
Config File (/export/vfs.conf:43): Errors found in configuration block EXPORT
I0628 11:38:50.885170       1 provision.go:450] using service SERVICE_NAME=nfs-provisioner cluster IP 10.255.38.166 as NFS server IP
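
To compare provisioner logs between the working and the failing cluster, the relevant fields can be pulled out of the error above with a small script. This is only a minimal sketch and not part of the provisioner: the function name `summarize_provision_failure` and the regular expressions are assumptions derived solely from the log excerpt above, and `SAMPLE` is a trimmed copy of that excerpt.

```python
import re

# These patterns are assumptions based on the log excerpt in this report,
# not a documented log format of the provisioner or of NFS Ganesha.
CLAIM_RE = re.compile(r'error syncing claim "(?P<claim>[0-9a-f-]+)"')
CLASS_RE = re.compile(r'StorageClass "(?P<sc>[^"]+)"')
CONFIG_RE = re.compile(r'Config File \((?P<path>[^:)]+):(?P<line>\d+)\): (?P<msg>.+)')


def summarize_provision_failure(log_text):
    """Collect the claim UID, StorageClass name and Ganesha config errors."""
    claim = CLAIM_RE.search(log_text)
    storage_class = CLASS_RE.search(log_text)
    config_errors = [
        (m.group("path"), int(m.group("line")), m.group("msg").strip())
        for m in CONFIG_RE.finditer(log_text)
    ]
    return {
        "claim": claim.group("claim") if claim else None,
        "storage_class": storage_class.group("sc") if storage_class else None,
        "config_errors": config_errors,
    }


# Trimmed copy of the log above; the EXPORT block itself is left out.
SAMPLE = '''E0628 11:38:35.874156       1 controller.go:908] error syncing claim "6c2243e6-3297-4c7d-a382-1c9d28079b42": failed to provision volume with StorageClass "example-nfs": error creating export for volume: error exporting export block
: error calling org.ganesha.nfsd.exportmgr.AddExport: 0 export entries in /export/vfs.conf added because (invalid param value) errors. Details:
Config File (/export/vfs.conf:43): 1 validation errors in block EXPORT
Config File (/export/vfs.conf:43): Errors found in configuration block EXPORT
'''

if __name__ == "__main__":
    print(summarize_provision_failure(SAMPLE))
```

Running the same function over logs from the Talos 1.2.6 cluster and the Talos 1.3.7 cluster would show whether the Ganesha validation errors appear only on the newer version.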

Deployment file:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-provisioner
---
kind: Service
apiVersion: v1
metadata:
  name: nfs-provisioner
  labels:
    app: nfs-provisioner
spec:
  ports:
    - name: nfs
      port: 2049
    - name: nfs-udp
      port: 2049
      protocol: UDP
    - name: nlockmgr
      port: 32803
    - name: nlockmgr-udp
      port: 32803
      protocol: UDP
    - name: mountd
      port: 20048
    - name: mountd-udp
      port: 20048
      protocol: UDP
    - name: rquotad
      port: 875
    - name: rquotad-udp
      port: 875
      protocol: UDP
    - name: rpcbind
      port: 111
    - name: rpcbind-udp
      port: 111
      protocol: UDP
    - name: statd
      port: 662
    - name: statd-udp
      port: 662
      protocol: UDP
  selector:
    app: nfs-provisioner
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-provisioner
spec:
  selector:
    matchLabels:
      app: nfs-provisioner
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-provisioner
    spec:
      serviceAccount: nfs-provisioner
      containers:
        - name: nfs-provisioner
          image: k8s.gcr.io/sig-storage/nfs-provisioner:v3.0.0
          ports:
            - name: nfs
              containerPort: 2049
            - name: nfs-udp
              containerPort: 2049
              protocol: UDP
            - name: nlockmgr
              containerPort: 32803
            - name: nlockmgr-udp
              containerPort: 32803
              protocol: UDP
            - name: mountd
              containerPort: 20048
            - name: mountd-udp
              containerPort: 20048
              protocol: UDP
            - name: rquotad
              containerPort: 875
            - name: rquotad-udp
              containerPort: 875
              protocol: UDP
            - name: rpcbind
              containerPort: 111
            - name: rpcbind-udp
              containerPort: 111
              protocol: UDP
            - name: statd
              containerPort: 662
            - name: statd-udp
              containerPort: 662
              protocol: UDP
          securityContext:
            capabilities:
              add:
                - DAC_READ_SEARCH
                - SYS_RESOURCE
          args:
            - "-provisioner=example.com/nfs"
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: SERVICE_NAME
              value: nfs-provisioner
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          imagePullPolicy: "IfNotPresent"
          volumeMounts:
            - name: export-volume
              mountPath: /export
      volumes:
        - name: export-volume
          persistentVolumeClaim:
            claimName: nesc-nfs-root
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get"]
  - apiGroups: ["extensions"]
    resources: ["podsecuritypolicies"]
    resourceNames: ["nfs-provisioner"]
    verbs: ["use"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    # replace with namespace where provisioner is deployed
    namespace: nesctest
roleRef:
  kind: ClusterRole
  name: nfs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    # replace with namespace where provisioner is deployed
    namespace: nesctest
roleRef:
  kind: Role
  name: leader-locking-nfs-provisioner
  apiGroup: rbac.authorization.k8s.io

Environment

smira commented 1 year ago

Can you please provide a step-by-step example that reproduces this?

VMAthreyas commented 1 year ago
  1. Create a sufficiently large PVC from the existing storage class to use as the root disk for the NFS server

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: nesc-nfs-root
    spec:
      storageClassName: ceph-csi
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
  2. Create the NFS server using the nfs-ganesha deployment file

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: nfs-provisioner
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: nfs-provisioner
      labels:
        app: nfs-provisioner
    spec:
      ports:
        - name: nfs
          port: 2049
        - name: nfs-udp
          port: 2049
          protocol: UDP
        - name: nlockmgr
          port: 32803
        - name: nlockmgr-udp
          port: 32803
          protocol: UDP
        - name: mountd
          port: 20048
        - name: mountd-udp
          port: 20048
          protocol: UDP
        - name: rquotad
          port: 875
        - name: rquotad-udp
          port: 875
          protocol: UDP
        - name: rpcbind
          port: 111
        - name: rpcbind-udp
          port: 111
          protocol: UDP
        - name: statd
          port: 662
        - name: statd-udp
          port: 662
          protocol: UDP
      selector:
        app: nfs-provisioner
    ---
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: nfs-provisioner
    spec:
      selector:
        matchLabels:
          app: nfs-provisioner
      replicas: 1
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            app: nfs-provisioner
        spec:
          serviceAccount: nfs-provisioner
          containers:
            - name: nfs-provisioner
              image: k8s.gcr.io/sig-storage/nfs-provisioner:v3.0.0
              ports:
                - name: nfs
                  containerPort: 2049
                - name: nfs-udp
                  containerPort: 2049
                  protocol: UDP
                - name: nlockmgr
                  containerPort: 32803
                - name: nlockmgr-udp
                  containerPort: 32803
                  protocol: UDP
                - name: mountd
                  containerPort: 20048
                - name: mountd-udp
                  containerPort: 20048
                  protocol: UDP
                - name: rquotad
                  containerPort: 875
                - name: rquotad-udp
                  containerPort: 875
                  protocol: UDP
                - name: rpcbind
                  containerPort: 111
                - name: rpcbind-udp
                  containerPort: 111
                  protocol: UDP
                - name: statd
                  containerPort: 662
                - name: statd-udp
                  containerPort: 662
                  protocol: UDP
              securityContext:
                capabilities:
                  add:
                    - DAC_READ_SEARCH
                    - SYS_RESOURCE
              args:
                - "-provisioner=example.com/nfs"
              env:
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
                - name: SERVICE_NAME
                  value: nfs-provisioner
                - name: POD_NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
              imagePullPolicy: "IfNotPresent"
              volumeMounts:
                - name: export-volume
                  mountPath: /export
          volumes:
            - name: export-volume
              persistentVolumeClaim:
                claimName: nesc-nfs-root
  3. Add the necessary roles

    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: nfs-provisioner-runner
    rules:
      - apiGroups: [""]
        resources: ["persistentvolumes"]
        verbs: ["get", "list", "watch", "create", "delete"]
      - apiGroups: [""]
        resources: ["persistentvolumeclaims"]
        verbs: ["get", "list", "watch", "update"]
      - apiGroups: ["storage.k8s.io"]
        resources: ["storageclasses"]
        verbs: ["get", "list", "watch"]
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["create", "update", "patch"]
      - apiGroups: [""]
        resources: ["services", "endpoints"]
        verbs: ["get"]
      - apiGroups: ["extensions"]
        resources: ["podsecuritypolicies"]
        resourceNames: ["nfs-provisioner"]
        verbs: ["use"]
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: run-nfs-provisioner
    subjects:
      - kind: ServiceAccount
        name: nfs-provisioner
        # replace with namespace where provisioner is deployed
        namespace: default
    roleRef:
      kind: ClusterRole
      name: nfs-provisioner-runner
      apiGroup: rbac.authorization.k8s.io
    ---
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: leader-locking-nfs-provisioner
    rules:
      - apiGroups: [""]
        resources: ["endpoints"]
        verbs: ["get", "list", "watch", "create", "update", "patch"]
    ---
    kind: RoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: leader-locking-nfs-provisioner
    subjects:
      - kind: ServiceAccount
        name: nfs-provisioner
        # replace with namespace where provisioner is deployed
        namespace: default
    roleRef:
      kind: Role
      name: leader-locking-nfs-provisioner
      apiGroup: rbac.authorization.k8s.io
  4. Create a StorageClass that uses this NFS server

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: example-nfs
    provisioner: example.com/nfs
    mountOptions:
      - vers=4.1
  5. Create a PVC that uses this StorageClass

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: nfs-test1
    spec:
      storageClassName: example-nfs
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 10G
github-actions[bot] commented 3 months ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 3 months ago

This issue was closed because it has been stalled for 7 days with no activity.