Open · teknologista opened this issue 3 years ago
By the way, I have hardened RKE2 clusters available to test a potential fix or to help debug the issue.
Couldn't reproduce with this:
```yaml
allowVolumeExpansion: false # not yet supported
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: "scw-bssd-enc"
provisioner: csi.scaleway.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  encrypted: "true"
  csi.storage.k8s.io/node-stage-secret-name: "enc-secret"
  csi.storage.k8s.io/node-stage-secret-namespace: "default"
---
apiVersion: v1
kind: Secret
metadata:
  name: enc-secret
  namespace: default
type: Opaque
data:
  encryptionPassphrase: bXlhd2Vzb21lcGFzc3BocmFzZQ==
---
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  selector:
    matchLabels:
      role: mongo
      environment: test
  serviceName: "mongo"
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: mongo
          command:
            - mongod
            - "--replSet"
            - rs0
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-persistent-storage
              mountPath: /data/db
        - name: mongo-sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: "role=mongo,environment=test"
  volumeClaimTemplates:
    - metadata:
        name: mongo-persistent-storage
      spec:
        storageClassName: "scw-bssd-enc"
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi
```
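For reference, the `encryptionPassphrase` value in the Secret above is just the base64 encoding of the plaintext passphrase (decoding it gives `myawesomepassphrase`). It can be generated like this:

```shell
# Encode the LUKS passphrase for the Secret's data field.
# -n is important: without it a trailing newline ends up inside
# the encoded value and becomes part of the passphrase.
echo -n 'myawesomepassphrase' | base64
```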
Used this script:
```bash
#!/bin/bash
while true; do
  while kubectl get pods --no-headers | grep -v Running ; do
    sleep 2
  done
  kubectl delete pods mongo-$(($RANDOM % 3))
done
```
I let it run for some time, with no issue. Tested on Kapsule with k8s 1.20.11.
Could you get the output of `cryptsetup status /dev/mapper/scw-luks-<id>` when it's stuck?
Hi Patrik,
Thanks for looking at this.
It may then be related to the fact that the Kubernetes cluster is RKE2 Government with hardened Pod Security Policy being enforced.
I will try again tomorrow and let you know the outcome.
Hi @Sh4d1,
We are stuck with this issue again today while doing a rolling upgrade of a Kubernetes cluster between two 1.20 patch versions.
This is what happened:
```
MountVolume.MountDevice failed for volume "pvc-7da69745-cd8b-4e4e-b236-ebcb6c76c328" : rpc error: code = Internal desc = failed to format and mount device from ("/dev/mapper/scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-7da69745-cd8b-4e4e-b236-ebcb6c76c328/globalmount") with fstype ("ext4") and options ([]): exit status 1
```
As per your request, this is the output:

```
~ sudo cryptsetup status /dev/mapper/scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3
/dev/mapper/scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3 is active.
  type:         n/a
  cipher:       aes-xts-plain64
  keysize:      256 bits
  key location: keyring
  device:       (null)
  sector size:  512
  offset:       32768 sectors
  size:         188710912 sectors
  mode:         read/write
```
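The `type: n/a` and `device: (null)` lines suggest the device-mapper entry is still registered but has lost its LUKS type and backing device. A rough, hypothetical check for that stale state, parsing captured `cryptsetup status` output (the field spacing is an assumption based on the paste above; on a node the variable would be filled from the real command):

```shell
# Sketch: flag a mapping that cryptsetup reports as "active" but whose
# type is n/a and whose backing device is null, i.e. the state pasted
# above. The sample text below stands in for real cryptsetup output.
status='/dev/mapper/scw-luks-example is active.
  type:    n/a
  cipher:  aes-xts-plain64
  device:  (null)'

if echo "$status" | grep -q 'type: *n/a' \
   && echo "$status" | grep -q 'device: *(null)'; then
  echo "stale LUKS mapping: backing device is gone"
fi
```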
On the Scaleway web console, however, I can see the volume attached to the right node.
On the node itself, I logged in and tried a full cycle of:
- `cryptsetup luksClose` on the device mapper created by the CSI driver
- `cryptsetup luksOpen` on the device `/dev/sda`
- `fsck -fy /dev/mapper/the_mapper-device`

fsck fixed a few minor errors; nothing major, but it did modify the filesystem, and the whole cycle succeeded.
I then ran `cryptsetup luksClose` again, and the volume was automatically mounted by the CSI driver without any further action on my part.
My guess is that when the pod using a volume is killed suddenly, the ext4 filesystem can be left in an unclean state after disconnection, and then needs an fsck before the CSI driver will mount it again.
I don't know if this helps, and I may be wrong, but this is the result of my investigation.
Anyway, is there anything we can do about this? It defeats the self-healing behaviour of a Kubernetes cluster (maybe run an automated `fsck -fy` prior to mounting in the pod)... :-(
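For what it's worth, the manual recovery cycle I ran could be sketched as a script. This is untested and purely illustrative: the volume ID is taken from the log above as an example, the backing device (`/dev/sda` here) varies per node, and `DRY_RUN=1` (the default) only prints the commands instead of running them:

```shell
#!/bin/bash
# Hypothetical sketch of the manual recovery cycle described above.
# VOLUME_ID and BACKING_DEV are assumptions -- adjust per node/volume.
set -euo pipefail

VOLUME_ID="${VOLUME_ID:-cd5543ac-4300-4bb7-882a-6f19ca0149c3}"
BACKING_DEV="${BACKING_DEV:-/dev/sda}"
MAPPER="scw-luks-${VOLUME_ID}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# 1. Close the stale mapping left behind by the CSI driver.
run cryptsetup luksClose "$MAPPER"
# 2. Re-open the backing device (prompts for the passphrase).
run cryptsetup luksOpen "$BACKING_DEV" "$MAPPER"
# 3. Repair the filesystem so the CSI driver can mount it again.
run fsck -fy "/dev/mapper/${MAPPER}"
# 4. Close again and let the CSI driver re-attach on its own.
run cryptsetup luksClose "$MAPPER"
```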
Many thanks.
**Describe the bug**
Volumes get into an unmountable state after trying to restart a pod that uses an encrypted PV.
**To Reproduce**
1. Set up the Scaleway CSI driver and create an encrypted StorageClass as outlined in the docs.
2. Deploy a StatefulSet, such as a 3-replica MongoDB replica set.
3. Wait for the workload to come up; PVs are provisioned and everything is fine.
4. Kill one pod and wait for it to be recreated by Kubernetes.
5. Just after the scheduler places the pod on a node, it errors because it cannot mount the previously created, still-existing PV. See the errors from the kube logs below.
**Expected behavior**
The PV should be attached to the new node where the pod is scheduled, and the pod should start.
**Additional context**
Errors shown:
```
Warning FailedMount MountVolume.MountDevice failed for volume "pvc-3030ae10-3579-494a-a215-0017aea58332" : rpc error: code = Internal desc = error encrypting/opening volume with ID aeffa5d1-d5c3-406c-a728-d5d2c856aed9: luksStatus returned ok, but device scw-luks-aeffa5d1-d5c3-406c-a728-d5d2c856aed9 is not active
```
and
```
MountVolume.WaitForAttach failed for volume "pvc-83cf34a9-d36d-46e5-bbf2-199c426f518c" : volume fr-par-2/cbe3eca8-f623-4bbe-bc76-450eceb391b2 has GET error for volume attachment csi-879b1d2e5fa7ca784f356b823505c5506b57891aa56966b59c8ebfdae3497320: volumeattachments.storage.k8s.io "csi-879b1d2e5fa7ca784f356b823505c5506b57891aa56966b59c8ebfdae3497320" is forbidden: User "system:node:node-5" cannot get resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: no relationship found between node 'node-5' and this object
```
Again, that only seems to happen for encrypted PVs.