rancher / rke

Rancher Kubernetes Engine (RKE), an extremely simple, lightning fast Kubernetes distribution that runs entirely within containers.
Apache License 2.0

Stateful services remain in a Terminating state in the event of a machine outage #3360

Closed · houzhx759 closed this issue 10 months ago

houzhx759 commented 1 year ago

RKE version: 1.4.4, Kubernetes version 1.25.6

Docker version: 19.03.12-3

Operating system and kernel: CentOS 7.6, Linux 5.8.7-1.el7.elrepo.x86_64 (mockbuild@Build64R7)

Type/provider of hosts: Huawei Cloud

After a node goes down, a stateful pod always stays in the Terminating state. The pod is not restarted on another node; it remains stuck in Terminating and has to be forcibly deleted before it restarts. What is the reason for this? Has anyone experienced this problem? Thank you.
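(For reference, the forced deletion mentioned above is normally done with kubectl; the pod name and namespace below are placeholders:)

# Remove the stuck pod object from the API without waiting for the unreachable kubelet to confirm
kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force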

The following is the kubelet log from the node:
"operationExecutor.UnmountVolume started for volume \"pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499\" (UniqueName: \"kubernetes.io/nfs/c831bbc8-2103-4259-af5b-ac98992ad58b-pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499\") pod \"c831bbc8-2103-4259-af5b-ac98992ad58b\" (UID: \"c831bbc8-2103-4259-af5b-ac98992ad58b\") "
I0911 09:52:16.042550    2243 reconciler.go:211] "operationExecutor.UnmountVolume started for volume \"pvc-7f3211a4-38e6-4ed4-a3e7-ad1f2e4be796\" (UniqueName: \"kubernetes.io/nfs/3d9a34cc-446a-48e8-8cfe-43775def4e4d-pvc-7f3211a4-38e6-4ed4-a3e7-ad1f2e4be796\") pod \"3d9a34cc-446a-48e8-8cfe-43775def4e4d\" (UID: \"3d9a34cc-446a-48e8-8cfe-43775def4e4d\") "
E0911 09:52:16.043903    2243 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/nfs/3d9a34cc-446a-48e8-8cfe-43775def4e4d-pvc-7f3211a4-38e6-4ed4-a3e7-ad1f2e4be796 podName:3d9a34cc-446a-48e8-8cfe-43775def4e4d nodeName:}" failed. No retries permitted until 2023-09-11 09:54:18.043881997 +0000 UTC m=+5134.086634119 (durationBeforeRetry 2m2s). Error: UnmountVolume.TearDown failed for volume "pvc-7f3211a4-38e6-4ed4-a3e7-ad1f2e4be796" (UniqueName: "kubernetes.io/nfs/3d9a34cc-446a-48e8-8cfe-43775def4e4d-pvc-7f3211a4-38e6-4ed4-a3e7-ad1f2e4be796") pod "3d9a34cc-446a-48e8-8cfe-43775def4e4d" (UID: "3d9a34cc-446a-48e8-8cfe-43775def4e4d") : unmount failed: exit status 32
Unmounting arguments: /var/lib/kubelet/pods/3d9a34cc-446a-48e8-8cfe-43775def4e4d/volumes/kubernetes.io~nfs/pvc-7f3211a4-38e6-4ed4-a3e7-ad1f2e4be796
Output: umount: /var/lib/kubelet/pods/3d9a34cc-446a-48e8-8cfe-43775def4e4d/volumes/kubernetes.io~nfs/pvc-7f3211a4-38e6-4ed4-a3e7-ad1f2e4be796: not mounted.
E0911 09:52:16.044059    2243 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/nfs/c831bbc8-2103-4259-af5b-ac98992ad58b-pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499 podName:c831bbc8-2103-4259-af5b-ac98992ad58b nodeName:}" failed. No retries permitted until 2023-09-11 09:54:18.044042866 +0000 UTC m=+5134.086794997 (durationBeforeRetry 2m2s). Error: UnmountVolume.TearDown failed for volume "pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499" (UniqueName: "kubernetes.io/nfs/c831bbc8-2103-4259-af5b-ac98992ad58b-pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499") pod "c831bbc8-2103-4259-af5b-ac98992ad58b" (UID: "c831bbc8-2103-4259-af5b-ac98992ad58b") : unmount failed: exit status 32
Unmounting arguments: /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499
Output: umount: /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499: not mounted.
superseb commented 1 year ago

Please supply more information on the pod configuration, and what kind of storage/volumes are being created, so we can identify a root cause.
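(For anyone hitting the same issue, the kind of detail being asked for here can usually be gathered with commands like the following; the pod name and namespace are placeholders:)

# Pod spec, conditions, volumes, and recent events
kubectl describe pod <pod-name> -n <namespace>
# Claims and their backing volumes, plus the storage class in use
kubectl get pvc,pv -n <namespace> -o wide
kubectl get storageclass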

houzhx759 commented 1 year ago

Hello, let me describe the current problem. A node runs pods from two StatefulSets; after the node goes down, one of the pods comes back up, but the other stays stuck in Terminating. Here are the details of the pods; nfs-provisioner is used for the underlying storage.

Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-minio-0
    ReadOnly:   false
  kube-api-access-v8l6l:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:

Name:                      minio-1
Namespace:                 devops
Priority:                  0
Service Account:           default
Node:                      kube-node-08/10.40.43.141
Start Time:                Mon, 14 Aug 2023 16:00:34 +0800
Labels:                    app=minio
                           controller-revision-hash=minio-79d988c599
                           statefulset.kubernetes.io/pod-name=minio-1
Annotations:               cni.projectcalico.org/containerID: 01b4769effe191e752d0b5ec829c0f388e63d2ae3fafb04b79cab64855777f4d
                           cni.projectcalico.org/podIP:
                           cni.projectcalico.org/podIPs:
                           kubectl.kubernetes.io/restartedAt: 2023-08-14T16:00:09+08:00
Status:                    Terminating (lasts 13d)
Termination Grace Period:  30s
IP:
IPs:
Controlled By:             StatefulSet/minio
Containers:
  minio:
    Container ID:   docker://0e3c7492916c814fbdb300215785b27aed8c03e35ff3c90a12862ba1008c2212
    Image:          /system/minio:RELEASE.2020-11-06T23-17-07Z
    Image ID:       docker-pullable://system/minio@sha256:a1dc27cbac312868a03c7ffbf35b886f3c24f552d69f1036ea1b80f1153ad9b1
    Port:           9000/TCP
    Host Port:      0/TCP
    Args:           server http://minio-{0...3}.minio.devops.svc.cluster.local/data
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 12 Sep 2023 14:28:47 +0800
      Finished:     Tue, 12 Sep 2023 14:38:22 +0800
    Ready:          False
    Restart Count:  1
    Environment:
      MINIO_ACCESS_KEY:            admin123
      MINIO_SECRET_KEY:            U*fVXIu8V9RAfP4M
      MINIO_PROMETHEUS_AUTH_TYPE:  public
    Mounts:
      /data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tsq4q (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-minio-1
    ReadOnly:   false
  kube-api-access-tsq4q:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:

Name:             minio-2
Namespace:        devops
Priority:         0
Service Account:  default
Node:             kube-node-04/10.40.21.93
Start Time:       Mon, 14 Aug 2023 16:00:23 +0800
Labels:           app=minio
                  controller-revision-hash=minio-79d988c599
                  statefulset.kubernetes.io/pod-name=minio-2
Annotations:      cni.projectcalico.org/containerID: 5c3d7918c8f708d1ab3d29a2ed33ce1df7365f013bab94071f98e6e574ca3623
                  cni.projectcalico.org/podIP: 10.42.8.6/32
                  cni.projectcalico.org/podIPs: 10.42.8.6/32
                  kubectl.kubernetes.io/restartedAt: 2023-08-14T16:00:09+08:00
Status:           Running
IP:               10.42.8.6
IPs:
  IP:  10.42.8.6
Controlled By:    StatefulSet/minio
Containers:
  minio:
    Container ID:   docker://172a03893edef65044997200bac033715ef2e905482feca6a8a4a435b043c1af
    Image:          /system/minio:RELEASE.2020-11-06T23-17-07Z
    Image ID:       docker-pullable:///minio@sha256:a1dc27cbac312868a03c7ffbf35b886f3c24f552d69f1036ea1b80f1153ad9b1
    Port:           9000/TCP
    Host Port:      0/TCP
    Args:           server http://minio-{0...3}.minio.devops.svc.cluster.local/data
    State:          Running
      Started:      Mon, 14 Aug 2023 16:00:28 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      MINIO_ACCESS_KEY:            admin123
      MINIO_SECRET_KEY:            U*fVXIu8V9RAfP4M
      MINIO_PROMETHEUS_AUTH_TYPE:  public
    Mounts:
      /data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8wzwf (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-minio-2
    ReadOnly:   false
  kube-api-access-8wzwf:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:

Name:             minio-3
Namespace:        devops
Priority:         0
Service Account:  default
Node:             kube-node-06/10.40.165.124
Start Time:       Mon, 14 Aug 2023 16:00:13 +0800
Labels:           app=minio
                  controller-revision-hash=minio-79d988c599
                  statefulset.kubernetes.io/pod-name=minio-3
Annotations:      cni.projectcalico.org/containerID: 9ff74b1d85e3abc1f06e56b128ee4873b77d094a775a8ad24c42f4691accec00
                  cni.projectcalico.org/podIP: 10.42.6.7/32
                  cni.projectcalico.org/podIPs: 10.42.6.7/32
                  kubectl.kubernetes.io/restartedAt: 2023-08-14T16:00:09+08:00
Status:           Running
IP:               10.42.6.7
IPs:
  IP:  10.42.6.7
Controlled By:    StatefulSet/minio
Containers:
  minio:
    Container ID:   docker://77b691d838e3e3f59b69584a28d21627871f3c78bbbba8bea08aedc06ff39697
    Image:          /system/minio:RELEASE.2020-11-06T23-17-07Z
    Image ID:       docker-pullable:///minio@sha256:a1dc27cbac312868a03c7ffbf35b886f3c24f552d69f1036ea1b80f1153ad9b1
    Port:           9000/TCP
    Host Port:      0/TCP
    Args:           server http://minio-{0...3}.minio.devops.svc.cluster.local/data
    State:          Running
      Started:      Mon, 14 Aug 2023 16:00:18 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      MINIO_ACCESS_KEY:            admin123
      MINIO_SECRET_KEY:            U*fVXIu8V9RAfP4M
      MINIO_PROMETHEUS_AUTH_TYPE:  public
    Mounts:
      /data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dmb54 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-minio-3
    ReadOnly:   false
  kube-api-access-dmb54:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:


github-actions[bot] commented 10 months ago

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

stefanlasiewski commented 7 months ago

I know this is an old issue, but I wanted to share a workaround for posterity.

Basically what happened is that /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499 was once an NFS mount, but is no longer an NFS mount. Perhaps Docker crashed, or the node was rebooted. That means that /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499 is an empty, orphaned directory at this point.

k8s still thinks the volume is an NFS mount and tries to unmount it. The kubelet runs umount against /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499, which fails with exit status 32 because nothing is actually mounted there (the "not mounted" error in the log above). So k8s ends up in an endless retry loop trying to clean it up.
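A quick way to confirm this on the affected node (the path is the one from the kubelet log above):

# mountpoint reports whether anything is mounted at the path; findmnt prints nothing if it is not a mount
mountpoint /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499
findmnt /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499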

To fix (both steps are combined in the shell sketch after this list):

  1. Look at /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499 to ensure that it is an empty directory
  2. Manually run rmdir /var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499.
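
A minimal sketch of that workaround, run on the affected node and assuming the same pod UID and PVC name as in the kubelet log above (adjust the path for your own pod):

ORPHAN=/var/lib/kubelet/pods/c831bbc8-2103-4259-af5b-ac98992ad58b/volumes/kubernetes.io~nfs/pvc-82ba9c77-1851-48ec-bc4a-90ecc4b22499

# 1. Make sure nothing is mounted there and the directory is empty
mountpoint "$ORPHAN"      # should report "is not a mountpoint"
ls -A "$ORPHAN"           # should print nothing

# 2. Remove the orphaned directory; rmdir only removes empty directories, so no data can be lost
rmdir "$ORPHAN"

Once the orphaned directory is gone, the kubelet should stop retrying the failed unmount and the pod can finish terminating.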