kubevirt / hostpath-provisioner-operator

Apache License 2.0

mounter fails with NFS backed storage for --storagePoolPath #377

Closed slalomnut closed 11 months ago

slalomnut commented 12 months ago

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened: hpp-pool pod deployment stuck in CrashLoopBackOff

The mounter panics at `pathInfo, err := os.Stat(hostMountPath)` when it attempts to stat the NFS mount path.

```
[root@api-int ~]# oc debug hpp-pool-66a3ae7d-7b586fb698-znstp
Starting pod/hpp-pool-66a3ae7d-7b586fb698-znstp-debug, command was: /usr/bin/mounter --storagePoolPath /source --mountPath /var/hpvolumes/csi --hostPath /host
Pod IP: 10.128.1.19
If you don't see a command prompt, try pressing enter.
sh-5.1# /usr/bin/mounter --storagePoolPath /source --mountPath /var/hpvolumes/csi --hostPath /host
{"level":"info","ts":1695162977.3244886,"logger":"mounter","msg":"Go Version: go1.19.10"}
{"level":"info","ts":1695162977.3245575,"logger":"mounter","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1695162977.370991,"logger":"mounter","msg":"Found mount info","source path on host":"hostname.domain.net:/mnt/ovirt/openshift/nfs/vols/pvc-68b78950-9bbc-4d55-96c7-b27c5c66bbfb"}
{"level":"info","ts":1695162977.371048,"logger":"mounter","msg":"Target path","path":"/var/hpvolumes/csi"}
{"level":"info","ts":1695162977.3710966,"logger":"mounter","msg":"host path","path":"/host"}

panic: stat hostname.domain.net:/mnt/ovirt/openshift/nfs/vols/pvc-68b78950-9bbc-4d55-96c7-b27c5c66bbfb: no such file or directory
```

```
sh-5.1# stat /source
  File: /source
  Size: 3          Blocks: 1          IO Block: 131072  directory
Device: 400046h/4194374d  Inode: 34  Links: 2
Access: (0777/drwxrwxrwx)  Uid: (0/root)  Gid: (0/root)
Access: 2023-09-19 19:04:47.270279314 +0000
Modify: 2023-09-19 21:49:05.032855479 +0000
Change: 2023-09-19 21:49:05.032855479 +0000
 Birth: -
```

```
sh-5.1# mountpoint /source/
/source/ is a mountpoint
```
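What the mounter actually resolves can be checked the same way from inside the pod by querying the mount table (a sketch using standard util-linux `findmnt`; the `/source` path is from this report):

```shell
# Show the source and filesystem type of the mount at /source.
# For NFS this prints a "server:/export" source, which is exactly
# the string the mounter later passes to os.Stat.
findmnt --target /source --output SOURCE,FSTYPE --noheadings
```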

Confirmed RW access to mount point.

```
sh-5.1# echo "write-test" > /source/test
sh-5.1# cat /source/test
write-test
```

```
[root@api-int ~]# oc describe pod hpp-pool-66a3ae7d-7b586fb698-znstp
Name:             hpp-pool-66a3ae7d-7b586fb698-znstp
Namespace:        openshift-cnv
Priority:         0
Service Account:  hostpath-provisioner-admin-csi
Node:             api-int.os-prd.domain.net.0.168.192.in-addr.arpa/192.168.0.26
Start Time:       Tue, 19 Sep 2023 19:04:55 +0000
Labels:           hpp-pool=local-hpp
                  pod-template-hash=7b586fb698
Annotations:      k8s.ovn.org/pod-networks: {"default":{"ip_addresses":["10.128.0.160/23"],"mac_address":"0a:58:0a:80:00:a0","gateway_ips":["10.128.0.1"],"ip_address":"10.128.0.160/2...
                  k8s.v1.cni.cncf.io/network-status: [{ "name": "ovn-kubernetes", "interface": "eth0", "ips": [ "10.128.0.160" ], "mac": "0a:58:0a:80:00:a0", "default": true, "dns": {} }]
                  openshift.io/scc: hostpath-provisioner-csi
Status:           Running
IP:               10.128.0.160
IPs:
  IP:  10.128.0.160
Controlled By:  ReplicaSet/hpp-pool-66a3ae7d-7b586fb698
Containers:
  mounter:
    Container ID:  cri-o://62a93cb7d3465ec9322b40fa6cd028e12f4a36978a3af686c35e99c8d24381cc
    Image:         registry.redhat.io/container-native-virtualization/hostpath-provisioner-operator-rhel9@sha256:e5fa0aa2d6a48dd2b5e14b9d3741c144b371845c3dbee0dd3a440a1d5fa6d777
    Image ID:      registry.redhat.io/container-native-virtualization/hostpath-provisioner-operator-rhel9@sha256:e5fa0aa2d6a48dd2b5e14b9d3741c144b371845c3dbee0dd3a440a1d5fa6d777
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/mounter
      --storagePoolPath
      /source
      --mountPath
      /var/hpvolumes/csi
      --hostPath
      /host
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Tue, 19 Sep 2023 23:05:52 +0000
      Finished:     Tue, 19 Sep 2023 23:05:52 +0000
    Ready:          False
    Restart Count:  52
    Requests:
      cpu:     10m
      memory:  100Mi
    Environment:  <none>
    Mounts:
      /host from host-root (rw)
      /source from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tvh7z (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  hpp-pool-66a3ae7d
    ReadOnly:   false
  host-root:
    Type:          HostPath (bare host directory volume)
    Path:          /
    HostPathType:  Directory
  kube-api-access-tvh7z:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Normal   Pulled   25m (x49 over 4h5m)    kubelet  Container image "registry.redhat.io/container-native-virtualization/hostpath-provisioner-operator-rhel9@sha256:e5fa0aa2d6a48dd2b5e14b9d3741c144b371845c3dbee0dd3a440a1d5fa6d777" already present on machine
  Warning  BackOff  21s (x1118 over 4h5m)  kubelet  Back-off restarting failed container mounter in pod hpp-pool-66a3ae7d-7b586fb698-znstp_openshift-cnv(f18608e2-05c3-4c7b-80b7-08a41bb10e65)
```

```
[root@api-int customizations]# oc get pvc,pv
NAME                                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/hpp-pool-66a3ae7d   Bound    pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5   50Gi       RWO            freenas-nfs-csi   21s

NAME                                                        CAPACITY      ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                            STORAGECLASS      REASON   AGE
persistentvolume/pvc-08563a4f-d4a7-4896-a1d8-3c49abc9e85d   34087042032   RWO            Retain           Bound    openshift-virtualization-os-images/rhel9-a1947a1edca5            freenas-nfs-csi            16h
persistentvolume/pvc-110b1cb9-5ca3-499a-8021-7453659eb8fa   34087042032   RWO            Retain           Bound    openshift-virtualization-os-images/fedora-f7cc15256f08           freenas-nfs-csi            16h
persistentvolume/pvc-31147cae-f042-4ca0-bdc5-0b57291ddc58   34087042032   RWO            Retain           Bound    openshift-virtualization-os-images/centos7-680e9b4e0fba          freenas-nfs-csi            16h
persistentvolume/pvc-33c3c34c-80ff-483d-9f18-814bde11c732   34087042032   RWO            Retain           Bound    openshift-virtualization-os-images/rhel8-2cde3f47f8c7            freenas-nfs-csi            16h
persistentvolume/pvc-6c20fbbd-8ef0-44a8-aaba-0a389d7ad376   34087042032   RWO            Retain           Bound    openshift-virtualization-os-images/centos-stream9-7ff5f92120f1   freenas-nfs-csi            16h
persistentvolume/pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5   50Gi          RWO            Retain           Bound    openshift-cnv/hpp-pool-66a3ae7d                                  freenas-nfs-csi            14s
persistentvolume/pvc-da3e0962-9f66-47af-9bcb-3d26c52d698b   34087042032   RWO            Retain           Bound    openshift-virtualization-os-images/centos-stream8-0c4085ea2026   freenas-nfs-csi            16h
```

```
[root@api-int customizations]# oc describe persistentvolumeclaim/hpp-pool-66a3ae7d
Name:          hpp-pool-66a3ae7d
Namespace:     openshift-cnv
StorageClass:  freenas-nfs-csi
Status:        Bound
Volume:        pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5
Labels:        app.kubernetes.io/component=storage
               app.kubernetes.io/managed-by=hostpath-provisioner-operator
               app.kubernetes.io/part-of=hyperconverged-cluster
               app.kubernetes.io/version=4.13.4
               k8s-app=hostpath-provisioner
               kubevirt.io.hostpath-provisioner/storagePool=local-hpp
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: org.democratic-csi.nfs
               volume.kubernetes.io/storage-provisioner: org.democratic-csi.nfs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      50Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       hpp-pool-66a3ae7d-7b586fb698-bst9n
Events:
  Type    Reason                 Age   From                                                                                                           Message
  ----    ------                 ----  ----                                                                                                           -------
  Normal  ExternalProvisioning   67s   persistentvolume-controller                                                                                    waiting for a volume to be created, either by external provisioner "org.democratic-csi.nfs" or manually created by system administrator
  Normal  Provisioning           67s   org.democratic-csi.nfs_zfs-nfs-democratic-csi-controller-7fd4cbc97-ds4f2_9b2e6afa-733d-445e-a71d-0dd005eb0cca  External provisioner is provisioning volume for claim "openshift-cnv/hpp-pool-66a3ae7d"
  Normal  ProvisioningSucceeded  60s   org.democratic-csi.nfs_zfs-nfs-democratic-csi-controller-7fd4cbc97-ds4f2_9b2e6afa-733d-445e-a71d-0dd005eb0cca  Successfully provisioned volume pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5
```

```
[root@api-int customizations]# oc describe persistentvolume/pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5
Name:              pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: org.democratic-csi.nfs
                   volume.kubernetes.io/provisioner-deletion-secret-name: provisioner-secret-freenas-nfs-csi-zfs-nfs-democratic-csi
                   volume.kubernetes.io/provisioner-deletion-secret-namespace: democratic-csi
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      freenas-nfs-csi
Status:            Bound
Claim:             openshift-cnv/hpp-pool-66a3ae7d
Reclaim Policy:    Retain
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          50Gi
Node Affinity:     <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            org.democratic-csi.nfs
    FSType:            nfs
    VolumeHandle:      pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5
    ReadOnly:          false
    VolumeAttributes:  node_attach_driver=nfs
                       provisioner_driver=freenas-nfs
                       server=freenas.cjgolden.net
                       share=/mnt/ovirt/openshift/nfs/vols/pvc-c94c1b59-8b05-4c18-bdcd-dbe3e85e06a5
                       storage.kubernetes.io/csiProvisionerIdentity=1695302245662-2893-org.democratic-csi.nfs
Events:            <none>
```

What you expected to happen: `/usr/bin/mounter --storagePoolPath /source --mountPath /var/hpvolumes/csi --hostPath /host` to succeed and the hpp-pool pod to start successfully.

How to reproduce it (as minimally and precisely as possible): Deploy using NFS-backed storage for --storagePoolPath.
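For reference, a storage pool backed by a PVC template pointing at an NFS storage class looks roughly like this (a sketch assembled from the names in this report; field values are illustrative, not a verbatim copy of the affected cluster's CR):

```yaml
apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
  name: hostpath-provisioner
spec:
  imagePullPolicy: IfNotPresent
  storagePools:
  - name: local-hpp              # pool name seen in the pod labels above
    path: /var/hpvolumes/csi     # becomes --mountPath in the mounter
    pvcTemplate:                 # PVC backing the pool; NFS-backed here
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
      storageClassName: freenas-nfs-csi
  workload:
    nodeSelector:
      kubernetes.io/os: linux
```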

Anything else we need to know?: Other pods (e.g. in openshift-virtualization-os-images) successfully create PVCs and PVs and start without issue.

Environment:

akalenyu commented 11 months ago

Thanks for reporting this! We may need to do something similar to https://github.com/kubevirt/kubevirt/pull/9591