NetApp / trident

Storage orchestrator for containers
Apache License 2.0

Issue creating a volume and mounting to a pod in GKE with Trident #874

Closed andreasvandaalen closed 7 months ago

andreasvandaalen commented 11 months ago

Describe the bug: Following a fresh Trident installation and backend creation according to the NetApp Trident backend configuration documentation, we ran into an issue with volume creation and mounting on the NetApp backend (CVO).

Although the backend is created successfully, the expected volume is never created on it. Instead, a "magic" volume appears mounted in the pod for the PVC, and it is of type "tmpfs" rather than the anticipated shared volume from the NetApp backend.

The Trident operator, controller, and node pods do not bind to the "ontap-nas" storage class and do not create a volume on the NetApp backend when the PV or PVC is created. Although the PV, PVC, and pod are all created successfully, the NFS-shared NetApp volume is not visible in the pod.

Trident-controller pod logs show errors and warnings potentially related to this issue.
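One way to confirm whether Trident itself ever provisioned anything (a sketch; it assumes the trident namespace used in the apply command below and that the tridentctl CLI is available):

kubectl get tridentvolumes -n trident    # Trident's own record of provisioned volumes
tridentctl -n trident get volume         # the same information via the Trident CLI
tridentctl -n trident get backend        # the backend and its state (expected to be online)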

Environment Provide accurate information about the environment to help us reproduce the issue.

To Reproduce: kubectl apply -f merged_manifests.yml -n trident

apiVersion: v1
kind: Secret
metadata:
  name: backend-tbc-ontap-nas-secret
type: Opaque
stringData:
  username: <redacted>
  password: <redacted>
---
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-tbc-ontap-nas
spec:
  version: 1
  backendName: ontap-nas-backend
  storageDriverName: ontap-nas
  managementLIF: <redacted>
  dataLIF: <redacted>
  svm: <svm>
  credentials:
    name: backend-tbc-ontap-nas-secret
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontapnasudp
provisioner: csi.trident.netapp.io
mountOptions: ["rwx", "nfsvers=3", "proto=udp"]
parameters:
  backendType: "ontap-nas"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-storage
  labels:
    type: local
spec:
  storageClassName: ontapnasudp
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/trident-test"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ontapnasudp
---
kind: Pod
apiVersion: v1
metadata:
  name: pv-pod
spec:
  volumes:
    - name: pv-storage
      persistentVolumeClaim:
       claimName: pvc-storage
  containers:
    - name: pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/tmp/trident-test"
          name: pv-storage

Expected behavior

Upon successful creation of the Trident backend, it is expected that a volume would be created on the NetApp backend (CVO) that corresponds to any PV or PVC created in the Kubernetes cluster. The Trident operator, controller, and node pods should bind to the "ontap-nas" storage class and initiate the volume creation on the backend.

Once the PV, PVC, and pod are created, the NFS shared NetApp volume should be mounted in the pod and be visible when inspecting the pod's volume details. Thus, the expected behavior is a seamless creation and mounting of NetApp volumes in the Kubernetes pods through the Trident operator.

 ~/repo/ kubectl get tridentbackendconfigs -n trident-helm
NAME                    BACKEND NAME        BACKEND UUID                           PHASE   STATUS
backend-tbc-ontap-nas   ontap-nas-backend   54168d75-18e7-46e9-8b1a-d50a467c6aab   Bound   Success
 ~/repo/ kubectl get tridentbackendconfigs -n trident-helm -o yaml
apiVersion: v1
items:
- apiVersion: trident.netapp.io/v1
  kind: TridentBackendConfig
  metadata:
    creationTimestamp: "2023-11-30T11:06:38Z"
    finalizers:
    - trident.netapp.io
    generation: 1
    name: backend-tbc-ontap-nas
    namespace: trident-helm
    resourceVersion: "548627922"
    uid: d32f26e6-b942-4243-a9d3-007b0786c10d
  spec:
    backendName: ontap-nas-backend
    credentials:
      name: backend-tbc-ontap-nas-secret
    dataLIF: <redacted> 
    managementLIF: <redacted>
    storageDriverName: ontap-nas
    svm: <redacted>
    version: 1
  status:
    backendInfo:
      backendName: ontap-nas-backend
      backendUUID: 54168d75-18e7-46e9-8b1a-d50a467c6aab
    deletionPolicy: delete
    lastOperationStatus: Success
    message: Backend 'ontap-nas-backend' created
    phase: Bound
kind: List
metadata:
  resourceVersion: ""

Storage Class:

 ~/repo/ kubectl get sc ontapnasudp -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"ontapnasudp"},"mountOptions":["rwx","nfsvers=3","proto=udp"],"parameters":{"backendType":"ontap-nas"},"provisioner":"csi.trident.netapp.io","volumeBindingMode":"Immediate"}
  creationTimestamp: "2023-11-30T12:32:00Z"
  name: ontapnasudp
  resourceVersion: "548675493"
  uid: 29728bf2-dd12-45ac-9111-abe15a6c25f2
mountOptions:
- rwx
- nfsvers=3
- proto=udp
parameters:
  backendType: ontap-nas
provisioner: csi.trident.netapp.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

Persistent Volume (PV)

~/repo/ kubectl get pv -n trident -o yaml pv-storage
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"labels":{"type":"local"},"name":"pv-storage"},"spec":{"accessModes":["ReadWriteMany"],"capacity":{"storage":"10Gi"},"hostPath":{"path":"/tmp/trident-test"},"storageClassName":"ontapnasudp"}}
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2023-11-30T12:33:09Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    type: local
  name: pv-storage
  resourceVersion: "548676147"
  uid: 7519b4d8-8f29-4a09-bad8-412082192e20
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvc-storage
    namespace: trident
    resourceVersion: "548676145"
    uid: d2e363e4-79f4-4f79-bdb2-c4374e37e2ad
  hostPath:
    path: /tmp/trident-test
    type: ""
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ontapnasudp
  volumeMode: Filesystem
status:
  phase: Bound

Persistent Volume Claim (PVC)

 ~/repo/ kubectl get pvc -n trident -o yaml pvc-storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pvc-storage","namespace":"trident"},"spec":{"accessModes":["ReadWriteMany"],"resources":{"requests":{"storage":"10Gi"}},"storageClassName":"ontapnasudp"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2023-11-30T12:33:09Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc-storage
  namespace: trident
  resourceVersion: "548676149"
  uid: d2e363e4-79f4-4f79-bdb2-c4374e37e2ad
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ontapnasudp
  volumeMode: Filesystem
  volumeName: pv-storage
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  phase: Bound

The mount in pv-pod (the pod):

root@pv-pod:/# df -h | grep trid
tmpfs           3.9G  4.0K  3.9G   1% /tmp/trident-test

Additional context: there are errors and some additional information in the trident-controller pod logs:

csi-attacher I1205 10:46:30.285060 1 connection.go:201] GRPC error: <nil>
csi-attacher I1205 10:47:30.287315 1 connection.go:201] GRPC error: <nil>
csi-attacher I1205 10:48:30.294277 1 connection.go:201] GRPC error: <nil>
trident-main time="2023-12-05T10:48:31Z" level=error msg="Trident-ACP version is empty." error="<nil>" logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"

csi-attacher W1205 10:48:30.294433 1 csi_handler.go:173] Failed to repair volume handle for driver pd.csi.storage.gke.io: node handle has wrong number of elements; got 1, wanted 6 or more
csi-attacher I1205 10:48:30.294442 1 csi_handler.go:740] Found NodeID <redacted> in CSINode <redacted>
csi-attacher W1205 10:48:30.294461 1 csi_handler.go:173] Failed to repair volume handle for driver pd.csi.storage.gke.io: node handle has wrong number of elements; got 1, wanted 6 or more
trident-main time="2023-12-05T10:48:31Z" level=debug msg="REST API call received." Duration="10.192µs" Method=GET RequestURL=/trident/v1/version Route=GetVersion logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="trident_rest=logger"
trident-main time="2023-12-05T10:48:31Z" level=debug msg="Getting Trident-ACP version." logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=warning msg="ACP is not enabled." logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=error msg="Trident-ACP version is empty." error="<nil>" logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=debug msg="REST API call complete." Duration="978.427µs" Method=GET RequestURL=/trident/v1/version Route=GetVersion StatusCode=200 logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="trident_rest=logger"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=7a27627a-c480-446c-a35a-9addc41b1692 requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=d33b8f54-6142-4e3d-bc0b-f9458a7c37ec requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=warning msg="K8S helper has no record of the updated storage class; instead it will try to create it." logLayer=csi_frontend name=ontapnasudp parameters="map[backendType:ontap-nas]" provisioner=csi.trident.netapp.io requestID=c2f7f1ba-2333-4d73-b142-ac072a9ef5fd requestSource=Kubernetes workflow="storage_class=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=gke-e-infra-gke-e-infra-gke-b083af15-zn8s requestID=86a4ae35-eeb7-4340-a415-929d93b662cf requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=gke-e-infra-gke-e-infra-gke-88607822-luiw requestID=8ea3a1cc-0807-4ae3-893e-ddb8f4ed4766 requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=warning msg="K8S helper could not add a storage class: object is being deleted: tridentstorageclasses.trident.netapp.io \"ontapnasudp\" already exists" logLayer=csi_frontend name=ontapnasudp parameters="map[backendType:ontap-nas]" provisioner=csi.trident.netapp.io requestID=c2f7f1ba-2333-4d73-b142-ac072a9ef5fd requestSource=Kubernetes workflow="storage_class=update"
trident-main time="2023-12-05T10:48:49Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=e3db19b3-4afd-441d-85dd-37022ed9831b requestSource=Kubernetes workflow="node=update"

wonderland commented 11 months ago

Just a few observations that might be helpful:

I suggest you create a new namespace (you would usually not provision anything but Trident itself into the Trident namespace). Then apply a proper YAML manifest, e.g. without any manually created PV, into that namespace and check the result (as it stands, the manually created hostPath PV binds to the PVC, so Trident is never asked to provision a volume); see the sketch below.
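For illustration, a minimal manifest along those lines might look like the following (a sketch only; the trident-test namespace and object names are hypothetical, and it reuses the ontapnasudp storage class from above):

apiVersion: v1
kind: Namespace
metadata:
  name: trident-test
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test
  namespace: trident-test
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ontapnasudp   # Trident should dynamically provision a PV for this claim
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  namespace: trident-test
spec:
  volumes:
    - name: test-storage
      persistentVolumeClaim:
        claimName: pvc-test
  containers:
    - name: web
      image: nginx
      volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: test-storage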

andreasvandaalen commented 11 months ago

@wonderland Thank you for the observations and the hints. As it's getting late here, we'll revisit the situation based on that tomorrow 👍 Your input is much appreciated!

andreasvandaalen commented 11 months ago

It appears that we are only able to use TCP for communication between our test GKE cluster and the CVO. As a result I have two questions; can I add them here, or shall I create new issues?

We see that the GKE cluster nodes are NFS "ready". However, manually mounting an NFS share from e.g. an Ubuntu pod requires nfs-common to be installed and the nfsbind and rpc-statd services to be running.

root@ubuntu-v5:/# showmount -e [redacted]
Export list for [redacted]:
/                                                 (everyone)
/trident_pvc_2604074c_8aaa_4571_b225_81245cd221d0 (everyone)
/trident_pvc_2fa53c4e_c2ee_4c21_9601_81ab14a0922b (everyone)
etc...

Manual mounting works with TCP, so I've created a storage class for TCP:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: andreas-tcp
provisioner: csi.trident.netapp.io
mountOptions: ["rwx", "nfsvers=3", "proto=tcp"]
parameters:
  backendType: "ontap-nas"

and the PVC shows up:

 kubectl get pvc -n andreas

NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-andreas   Bound    pvc-2fa53c4e-c2ee-4c21-9601-81ab14a0922b   1Gi        RWO            andreas-tcp    85s
[redacted]::> vol show -fields create-time -volume *922b

vserver            volume                                           create-time
------------------ ------------------------------------------------ ------------------------
[redacted] trident_pvc_2fa53c4e_c2ee_4c21_9601_81ab14a0922b Fri Dec 08 10:48:07 2023

However, when trying to use the PVC at pod creation, it ends up with exit status 32.

For example:

Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               33m                   default-scheduler        Successfully assigned andreas/ubuntu-v5 to gke-e-infra-gke-e-infra-gke-88607822-luiw
  Normal   SuccessfulAttachVolume  33m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-2fa53c4e-c2ee-4c21-9601-81ab14a0922b"
  Warning  FailedMount             17m (x2 over 28m)     kubelet                  Unable to attach or mount volumes: unmounted volumes=[pvc-andreas], unattached volumes=[kube-api-access-ns54g pvc-andreas]: timed out waiting for the condition
  Warning  FailedMount             10m (x8 over 30m)     kubelet                  Unable to attach or mount volumes: unmounted volumes=[pvc-andreas], unattached volumes=[pvc-andreas kube-api-access-ns54g]: timed out waiting for the condition
  Warning  FailedMount             2m18s (x23 over 32m)  kubelet                  MountVolume.SetUp failed for volume "pvc-2fa53c4e-c2ee-4c21-9601-81ab14a0922b" : rpc error: code = Internal desc = error mounting NFS volume 
[redacted]:/trident_pvc_2fa53c4e_c2ee_4c21_9601_81ab14a0922b on mountpoint /var/lib/kubelet/pods/2c31700c-1bdc-49a5-a135-bff148115654/volumes/kubernetes.io~csi/pvc-2fa53c4e-c2ee-4c21-9601-81ab14a0922b/mount: exit status 32

I didn't expect that I should have to add NFS services to the pod; is that a misunderstanding? And do you have other hints on what to look for with the "exit status 32"? The export rules are open, and I don't see more details that could hint at the cause of the inability to mount the PVC.

wonderland commented 11 months ago

NFS always runs on TCP (UDP with NFS was dropped decades ago), so there is no need to specify it explicitly. Also, I've never seen the "rwx" NFS mount option, are you sure that is valid?

All storage access will always be at the node level. The mount is done by the worker node, then passed on to the container as a bind mount. Therefore no need to install any NFS packages inside the pod (though technically you could do that and access NFS in this way - but this is definitely not the K8s model for storage access!).
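As a rough illustration (assuming shell access to the worker node; the path pattern is taken from the FailedMount event above), the NFS mount would be expected to show up on the node itself, with the container only seeing a bind mount of that directory:

# run on the worker node, not inside the pod
mount | grep trident_pvc
# expected output, roughly:
# [redacted]:/trident_pvc_2fa53c4e_c2ee_4c21_9601_81ab14a0922b on
#   /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/<pv-name>/mount type nfs (...)
# the kubelet then bind-mounts that directory to the container's mountPath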

Unfortunately "exit status 32" is a pretty generic NFS error code. Usually something in the networking/connectivity area but hard to tell more from that code alone. You could SSH into the worker node and try to manually mount with verbose flags, that should give you more details. The output from the pod events gives you the full mount path (e.g. [redacted]:/trident_pvc_2fa53c4e_c2ee_4c21_9601_81ab14a0922b) so something like

mount -vvv -o vers=3 [redacted]:/trident_pvc_2fa53c4e_c2ee_4c21_9601_81ab14a0922b /mnt/test
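If the manual mount also fails, a quick connectivity check from the worker node toward the NFS ports can help narrow things down (a sketch; it assumes nc is available on the node image, the standard NFSv3 ports, and ONTAP's mountd port):

nc -vz [redacted] 111    # rpcbind/portmapper
nc -vz [redacted] 2049   # nfsd
nc -vz [redacted] 635    # mountd (ONTAP)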

andreasvandaalen commented 11 months ago

NFS always runs on TCP (UDP with NFS was dropped decades ago), so there is no need to specify it explicitly.

We started a couple of months ago with the documentation for 19 and later with 21; I expect that we initially adopted proto=udp from the examples there. But that was easily adjusted. (There's an example I found through Google that is from the documentation used earlier, but it reflects why we used it: https://github.com/NetApp/trident/blob/master/trident-installer/sample-input/storage-class-samples/storage-class-ontapnas-k8s1.8-mountoptions.yaml)

Also, I've never seen the "rwx" NFS mount option, are you sure that is valid?

Those options are documented e.g. here: https://github.com/NetAppDocs/trident/blob/main/trident-use/ontap-nas.adoc. And although the pod could only initialize and not actually start: if you configure "rwo" and start a second pod (which in our case would also only initialize because of the NFS issue), it actually complains that the PVC is already in use by another pod. I would say that works well; with RWO the volume is only provisioned to one pod.

All storage access will always be at the node level. The mount is done by the worker node, then passed on to the container as a bind mount. Therefore no need to install any NFS packages inside the pod (though technically you could do that and access NFS in this way - but this is definitely not the K8s model for storage access!).

This clarification is very helpful (at least to me) for getting a better picture of how this should fundamentally work. What we did see is that Trident says the nodes pass the NFS check. But while verifying the GCP/GKE cluster details, we see that next to the default "Container-Optimized OS (COS)" that we use, there are other node image options, e.g. with support for NFS 😨

(screenshot: GKE node image options)

And on the cluster where we did a test with Filestore, NFS CSI drivers were added to the COS nodes.

Unfortunately "exit status 32" is a pretty generic NFS error code. Usually something in the networking/connectivity area but hard to tell more from that code alone. You could SSH into the worker node and try to manually mount with verbose flags, that should give you more details. The output from the pod events gives you the full mount path (e.g. [redacted]:/trident_pvc_2fa53c4e_c2ee_4c21_9601_81ab14a0922b) so something like

Unfortunately we don't have the option to SSH into the nodes because of COS. Because of this we found the above: the Ubuntu node image supplies NFS support and probably also lets us log in to the nodes. We wanted to do that today, but that went differently; we will try to do this soon though.

Thanks for the quick response, and thanks for the valuable hints and thoughts @wonderland!

andreasvandaalen commented 11 months ago

We've removed the "rwx":

mountOptions: ["rwx", "nfsvers=3", "proto=tcp"]

to

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: andreas-tcp
provisioner: csi.trident.netapp.io
mountOptions: ["nfsvers=3", "proto=tcp"]
parameters:
  backendType: "ontap-nas"

For some reason we (specifically I) got confused by how we read and (mis)interpreted the rwx.
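For reference, a sketch of the distinction (illustrative only, not from the thread): ReadWriteOnce/ReadWriteMany are Kubernetes access modes and belong in the PVC spec, while StorageClass mountOptions are passed verbatim to the NFS mount, which is why "rwx" is not a valid mount option:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-andreas
spec:
  accessModes:
    - ReadWriteMany        # "RWX" is expressed here, not as a mount option
  resources:
    requests:
      storage: 1Gi
  storageClassName: andreas-tcp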

Now it proceeds with building the pod and mounting the PVC, and once on the shell we can see:

root@pv-pod:/usr/share/nginx/html# mount | grep nfs
[redacted]:/trident_pvc_9089df00_f6cd_470b_b1c1_5f37da50c6a0 on /usr/share/nginx/html type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=[redacted],mountvers=3,mountport=635,mountproto=tcp,local_lock=none,addr=[redacted])

or

root@pv-pod:/usr/share/nginx/html# df  -h /usr/share/nginx/html
Filesystem                                                      Size  Used Avail Use% Mounted on
[redacted]:/trident_pvc_9089df00_f6cd_470b_b1c1_5f37da50c6a0  1.0G  320K  1.0G   1% /usr/share/nginx/html

Thank you once again; we're happy to have the share connected 💯