Open · jmrr opened this issue 4 months ago
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
What happened:

We're using the Bitnami PostgreSQL Helm chart (15.1.4) to run Postgres on a microk8s v1.29 cluster. I wanted to leverage this CSI driver for the database storage, using an iSCSI LUN and target that I created on a QNAP NAS connected over a 10GbE network.

To connect to the LUN, I created a PV + PVC as in the examples and passed the PVC as the `primary.persistence.existingClaim` value when deploying the Helm chart (sketched below). This was working like a charm; at last we could move away from risky in-node storage or slower NFS. However, I replaced the pods of the chart's StatefulSet to increase their resources, and somehow `csi-iscsi-node` didn't mount the target at the right location in the pod's volume.

The outcome (and how we realised): the new location of the volume,

`/var/snap/microk8s/common/var/lib/kubelet/pods/b88fdaea-a22e-42ac-90ae-d71f927dc300/volumes/kubernetes.io~csi/postgresql/mount`

wasn't actually a mount of the storage on the NAS, but the node's root filesystem itself! A parallel data-ingestion operation consumed the node's storage, degrading the node and, to some extent, the whole cluster, as many key workloads got evicted with `DiskPressure` and a taint was added to the node.
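For context, a minimal sketch of how the claim was wired into the chart; `primary.persistence.existingClaim` is the Bitnami chart value mentioned above, and the PVC name `postgresql-data` is a hypothetical placeholder, not the one from our deployment:

```yaml
# values.yaml for the Bitnami PostgreSQL chart (sketch)
primary:
  persistence:
    enabled: true
    existingClaim: postgresql-data   # placeholder: PVC bound to the iSCSI-backed PV
```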
Logs that we encountered:

The `Detected OS without systemd` message is equally puzzling, as we're using Ubuntu 22.04 :thinking: ...

What you expected to happen:
Say the original pod volume location was:

`/var/snap/microk8s/common/var/lib/kubelet/pods/9cd76fee-cd41-4869-90d2-d46ffedddf68/volumes/kubernetes.io~csi/postgresql/mount`

-> this was actually the mount point of the filesystem backed by the iSCSI target. And the new pod volume location was:

`/var/snap/microk8s/common/var/lib/kubelet/pods/b88fdaea-a22e-42ac-90ae-d71f927dc300/volumes/kubernetes.io~csi/postgresql/mount`

I would expect the iSCSI CSI node driver to unmount the target from the first location and re-mount it at the second location, corresponding to the replacement pod, with no data loss.
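For anyone hitting the same symptom, a quick way to check whether such a kubelet path is a real mount or just a directory on the node's root filesystem (a sketch using standard util-linux tools on the node; substitute the actual pod UID):

```bash
# Returns non-zero if the path is NOT a mount point (i.e. writes land on the
# node's root filesystem instead of the iSCSI-backed volume).
mountpoint /var/snap/microk8s/common/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/postgresql/mount

# Shows which filesystem backs the path; a healthy iSCSI mount should report
# the SCSI block device (e.g. /dev/sdb), not the root filesystem.
findmnt -T /var/snap/microk8s/common/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/postgresql/mount
```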
How to reproduce it:

`PersistentVolume` manifest:
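A minimal sketch of a `PersistentVolume` of this shape, following the csi-driver-iscsi examples, plus the matching PVC referenced by `existingClaim`; the portal address, IQN, LUN, and names are placeholders, not the values from our NAS:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgresql
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: iscsi.csi.k8s.io
    volumeHandle: postgresql-data-id          # unique ID for this volume
    fsType: ext4
    volumeAttributes:
      targetPortal: "192.168.1.10:3260"       # placeholder: QNAP portal address
      iqn: "iqn.2004-04.com.qnap:nas:target"  # placeholder IQN
      lun: "1"
      iscsiInterface: "default"
      discoveryCHAPAuth: "false"
      sessionCHAPAuth: "false"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""          # static binding, no provisioner
  volumeName: postgresql
  resources:
    requests:
      storage: 100Gi
```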
Anything else we need to know?:

There are also `warning: Unmount skipped because path does not exist` messages in the node logs.
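A sketch of how to pull these messages, assuming the driver's node plugin runs as the `csi-iscsi-node` DaemonSet in `kube-system` (names may differ per deployment):

```bash
# Search the node plugin's logs for the skipped-unmount warning.
kubectl logs -n kube-system daemonset/csi-iscsi-node --all-containers \
  | grep "Unmount skipped because path does not exist"
```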
Environment:

- CSI Driver version: 554efb1
- Kubernetes version (use `kubectl version`): v1.29.4
- OS (e.g. from /etc/os-release): Ubuntu 22.04.3 LTS
- Kernel (e.g. `uname -a`): 5.15.0-105-generic
- Install tools: open-iscsi
- Others: microk8s v1.29