hetznercloud / csi-driver

Kubernetes Container Storage Interface driver for Hetzner Cloud Volumes

MountVolume.SetUp failed for encrypted XFS volume #648

Closed: bheisig closed this issue 1 month ago

bheisig commented 2 months ago

TL;DR

Since csi-driver version 2.8.0, Kubernetes has been unable to mount an encrypted PVC with file system type XFS.

Expected behavior

The hcloud volume is attached to the server, encrypted, and mounted with file system type XFS.

Observed behavior

kubectl -n <redacted> get events -w:

0s          Warning   FailedMount              <redacted>/<redacted>       MountVolume.SetUp failed for volume "pvc-<redacted>" : rpc error: code = Internal desc = failed to publish volume: mount failed: exit status 255...

Minimal working example

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hcloud-encrypted-volume-xfs
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: csi.hetzner.cloud
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  csi.storage.k8s.io/node-publish-secret-name: hcloud-csi-encryption
  csi.storage.k8s.io/node-publish-secret-namespace: kube-system
  fsType: xfs
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: <redacted>
  name: <redacted>
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: '2Gi'
  storageClassName: hcloud-encrypted-volume-xfs
  volumeMode: Filesystem
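
For reference, the StorageClass above expects a node-publish secret named hcloud-csi-encryption in kube-system. A minimal sketch of creating it (the key name encryption-passphrase is taken from the NodePublishVolume requests logged further down; replace the placeholder with your own passphrase):

$ kubectl -n kube-system create secret generic hcloud-csi-encryption \
    --from-literal=encryption-passphrase='<redacted>'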

Log output

Journal on k8s worker node:

Jul 02 13:24:05 k8s-worker-<redacted> kubelet[1419]: I0702 13:24:05.902796    1419 operation_generator.go:622] "MountVolume.WaitForAttach entering for volume \"pvc-<redacted>\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^100958515\") pod \"<redacted>\" (UID: \"e7d54231-b891-48d2-9bcb-2e1ce5a2765a\") DevicePath \"csi-752d36c10fcfac0f0e2fcfdac3b878d79d6db9182184b62b912bdb05288e209e\"" pod="<redacted>/<redacted>"
Jul 02 13:24:05 k8s-worker-<redacted> kubelet[1419]: I0702 13:24:05.913312    1419 operation_generator.go:632] "MountVolume.WaitForAttach succeeded for volume \"pvc-<redacted>\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^100958515\") pod \"<redacted>\" (UID: \"e7d54231-b891-48d2-9bcb-2e1ce5a2765a\") DevicePath \"csi-752d36c10fcfac0f0e2fcfdac3b878d79d6db9182184b62b912bdb05288e209e\"" pod="<redacted>/<redacted>"
Jul 02 13:24:06 k8s-worker-<redacted> kubelet[1419]: E0702 13:24:06.105057    1419 file_linux.go:61] "Unable to read config path" err="path does not exist, ignoring" path="/etc/kubernetes/manifests"
Jul 02 13:24:06 k8s-worker-<redacted> kubelet[1419]: E0702 13:24:06.110430    1419 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^100958515 podName: nodeName:}" failed. No retries permitted until 2024-07-02 13:26:08.110395848 +0000 UTC m=+1656234.478285384 (durationBeforeRetry 2m2s). Error: MountVolume.SetUp failed for volume "pvc-<redacted>" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^100958515") pod "<redacted>" (UID: "e7d54231-b891-48d2-9bcb-2e1ce5a2765a") : rpc error: code = Internal desc = failed to publish volume: mount failed: exit status 255

kubectl -n kube-system logs -f -l app.kubernetes.io/instance=hcloud-csi showed nothing during this time.

Additional information

kubectl version:

Client Version: v1.29.6
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.6

k8s worker node:

lsb_release -a:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:        22.04
Codename:       jammy

uname -a:

Linux k8s-worker-<redacted> 5.15.0-112-generic #122-Ubuntu SMP Thu May 23 07:48:21 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
jooola commented 1 month ago

I could not reproduce your issue; could you maybe provide us with the debug logs?

See https://github.com/hetznercloud/csi-driver/blob/main/docs/debug-logs.md
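
For reference, a minimal sketch of enabling debug logs directly on the deployed node workload (the LOG_LEVEL environment variable is the one visible in the pod description later in this thread; with a Helm install, setting it through the chart values is preferable so it survives upgrades, and the linked document is the authoritative guide):

$ kubectl -n kube-system set env daemonset/hcloud-csi-node -c hcloud-csi-driver LOG_LEVEL=debug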

bheisig commented 1 month ago

While

$ kubectl -n kube-system logs -f deploy/hcloud-csi-controller hcloud-csi-driver

outputs nothing in debug mode,

$ kubectl -n kube-system logs -f daemonsets/hcloud-csi-node hcloud-csi-driver

repeats these lines:

level=debug ts=2024-07-24T14:23:37.575476001Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodeGetCapabilities req=
level=debug ts=2024-07-24T14:23:37.575542186Z component=grpc-server msg="finished handling request" method=/csi.v1.Node/NodeGetCapabilities err=null
level=debug ts=2024-07-24T14:23:37.591087021Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodeGetCapabilities req=
level=debug ts=2024-07-24T14:23:37.591156703Z component=grpc-server msg="finished handling request" method=/csi.v1.Node/NodeGetCapabilities err=null
level=debug ts=2024-07-24T14:23:37.592175166Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodeGetCapabilities req=
level=debug ts=2024-07-24T14:23:37.592239558Z component=grpc-server msg="finished handling request" method=/csi.v1.Node/NodeGetCapabilities err=null
level=debug ts=2024-07-24T14:23:37.593027176Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodePublishVolume req="volume_id:\"101066085\" publish_context:<key:\"devicePath\" value:\"/dev/disk/by-id/scsi-0HC_Volume_101066085\" > target_path:\"/var/lib/kubelet/pods/860d1611-4be6-4fee-a7e2-960f337ddb75/volumes/kubernetes.io~csi/pvc-ef6bff65-56ea-4e6d-9e61-3050355f628b/mount\" volume_capability:<mount:<fs_type:\"xfs\" > access_mode:<mode:SINGLE_NODE_WRITER > > secrets:<key:\"encryption-passphrase\" value:\"<redacted>\" > volume_context:<key:\"csi.storage.k8s.io/ephemeral\" value:\"false\" > volume_context:<key:\"csi.storage.k8s.io/pod.name\" value:\"events-5767b68dc4-7nz5c\" > volume_context:<key:\"csi.storage.k8s.io/pod.namespace\" value:\"<redacted>\" > volume_context:<key:\"csi.storage.k8s.io/pod.uid\" value:\"860d1611-4be6-4fee-a7e2-960f337ddb75\" > volume_context:<key:\"csi.storage.k8s.io/serviceAccount.name\" value:\"default\" > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1721830718569-8081-csi.hetzner.cloud\" > "
level=info ts=2024-07-24T14:23:37.664616774Z component=linux-mount-service msg="publishing volume" target-path=/var/lib/kubelet/pods/860d1611-4be6-4fee-a7e2-960f337ddb75/volumes/kubernetes.io~csi/pvc-ef6bff65-56ea-4e6d-9e61-3050355f628b/mount device-path=/dev/mapper/scsi-0HC_Volume_101066085 fs-type=xfs block-volume=false readonly=false mount-options= encrypted=true
E0724 14:23:37.667750       1 mount_linux.go:236] Mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t xfs /dev/mapper/scsi-0HC_Volume_101066085 /var/lib/kubelet/pods/860d1611-4be6-4fee-a7e2-960f337ddb75/volumes/kubernetes.io~csi/pvc-ef6bff65-56ea-4e6d-9e61-3050355f628b/mount
Output: mount: mounting /dev/mapper/scsi-0HC_Volume_101066085 on /var/lib/kubelet/pods/860d1611-4be6-4fee-a7e2-960f337ddb75/volumes/kubernetes.io~csi/pvc-ef6bff65-56ea-4e6d-9e61-3050355f628b/mount failed: Invalid argument

level=error ts=2024-07-24T14:23:37.667817821Z component=grpc-server msg="handler failed" err="rpc error: code = Internal desc = failed to publish volume: mount failed: exit status 255\nMounting command: mount\nMounting arguments: -t xfs /dev/mapper/scsi-0HC_Volume_101066085 /var/lib/kubelet/pods/860d1611-4be6-4fee-a7e2-960f337ddb75/volumes/kubernetes.io~csi/pvc-ef6bff65-56ea-4e6d-9e61-3050355f628b/mount\nOutput: mount: mounting /dev/mapper/scsi-0HC_Volume_101066085 on /var/lib/kubelet/pods/860d1611-4be6-4fee-a7e2-960f337ddb75/volumes/kubernetes.io~csi/pvc-ef6bff65-56ea-4e6d-9e61-3050355f628b/mount failed: Invalid argument\n"
jooola commented 1 month ago

We still cannot reproduce the error, so we need you to run a few commands to gather some information.

In the hcloud-csi-node-* pod (on the relevant node) that is trying to mount the XFS filesystem, could you please shell into the hcloud-csi-driver container and run the following commands:

Please make sure to update the paths

cat /etc/os-release
cat /etc/mtab

apk add lsblk
lsblk --fs

cryptsetup luksDump /dev/disk/by-id/scsi-0HC_Volume_101069871
cryptsetup -v status scsi-0HC_Volume_101069871
umount /dev/mapper/scsi-0HC_Volume_101069871
mount -v -t xfs /dev/mapper/scsi-0HC_Volume_101069871 /var/lib/kubelet/pods/860d1611-4be6-4fee-a7e2-960f337ddb75/volumes/kubernetes.io~csi/pvc-ef6bff65-56ea-4e6d-9e61-3050355f628b/mount
bheisig commented 1 month ago

I tested it again with a fresh deployment. The result is reproducible:

# kubectl -n kube-system logs -f hcloud-csi-node-s8j8g hcloud-csi-driver
level=debug ts=2024-07-25T09:36:11.386346816Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodeGetCapabilities req=
level=debug ts=2024-07-25T09:36:11.38642836Z component=grpc-server msg="finished handling request" method=/csi.v1.Node/NodeGetCapabilities err=null
level=debug ts=2024-07-25T09:36:11.399080253Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodeGetCapabilities req=
level=debug ts=2024-07-25T09:36:11.399165244Z component=grpc-server msg="finished handling request" method=/csi.v1.Node/NodeGetCapabilities err=null
level=debug ts=2024-07-25T09:36:11.400243245Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodeGetCapabilities req=
level=debug ts=2024-07-25T09:36:11.400359835Z component=grpc-server msg="finished handling request" method=/csi.v1.Node/NodeGetCapabilities err=null
level=debug ts=2024-07-25T09:36:11.401307591Z component=grpc-server msg="handling request" method=/csi.v1.Node/NodePublishVolume req="volume_id:\"101070019\" publish_context:<key:\"devicePath\" value:\"/dev/disk/by-id/scsi-0HC_Volume_101070019\" > target_path:\"/var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount\" volume_capability:<mount:<fs_type:\"xfs\" > access_mode:<mode:SINGLE_NODE_WRITER > > secrets:<key:\"encryption-passphrase\" value:\"<redacted>\" > volume_context:<key:\"csi.storage.k8s.io/ephemeral\" value:\"false\" > volume_context:<key:\"csi.storage.k8s.io/pod.name\" value:\"events-5767b68dc4-h6rdl\" > volume_context:<key:\"csi.storage.k8s.io/pod.namespace\" value:\"<redacted>\" > volume_context:<key:\"csi.storage.k8s.io/pod.uid\" value:\"89a645f0-d795-4675-a5b1-7af373588fa5\" > volume_context:<key:\"csi.storage.k8s.io/serviceAccount.name\" value:\"default\" > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1721899788712-8081-csi.hetzner.cloud\" > "
level=info ts=2024-07-25T09:36:11.521576407Z component=linux-mount-service msg="publishing volume" target-path=/var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount device-path=/dev/mapper/scsi-0HC_Volume_101070019 fs-type=xfs block-volume=false readonly=false mount-options= encrypted=true
E0725 09:36:11.586868       1 mount_linux.go:236] Mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t xfs /dev/mapper/scsi-0HC_Volume_101070019 /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount
Output: mount: mounting /dev/mapper/scsi-0HC_Volume_101070019 on /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount failed: Invalid argument

level=error ts=2024-07-25T09:36:11.586943335Z component=grpc-server msg="handler failed" err="rpc error: code = Internal desc = failed to publish volume: mount failed: exit status 255\nMounting command: mount\nMounting arguments: -t xfs /dev/mapper/scsi-0HC_Volume_101070019 /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount\nOutput: mount: mounting /dev/mapper/scsi-0HC_Volume_101070019 on /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount failed: Invalid argument\n"

As you asked, I opened a shell:

$ kubectl -n kube-system exec -it hcloud-csi-node-s8j8g -c hcloud-csi-driver -- ash

# cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.20.1
PRETTY_NAME="Alpine Linux v3.20"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

Neither the pod nor the PVC is mentioned in /etc/mtab. I removed all unrelated pods and PVCs from the output:

# cat /etc/mtab
overlay / overlay rw,relatime,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/5819/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/5818/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/5817/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/573/fs,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/7772/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/7772/work 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs ro,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,nosuid,relatime,size=16024084k,nr_inodes=4006021,mode=755,inode64 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime,pagesize=2M 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
/dev/sda1 /run/csi ext4 rw,relatime 0 0
/dev/sda1 /etc/hosts ext4 rw,relatime 0 0
/dev/sda1 /dev/termination-log ext4 rw,relatime 0 0
/dev/sda1 /etc/hostname ext4 rw,relatime 0 0
/dev/sda1 /etc/resolv.conf ext4 rw,relatime 0 0
shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k,inode64 0 0
/dev/sda1 /var/lib/kubelet ext4 rw,relatime 0 0
tmpfs /run/secrets/kubernetes.io/serviceaccount tmpfs ro,relatime,size=163840k,inode64 0 0

I couldn't do these steps because the cluster is behind a proxy:

apk add lsblk
lsblk --fs
# cryptsetup luksDump /dev/disk/by-id/scsi-0HC_Volume_101070019
LUKS header information for /dev/disk/by-id/scsi-0HC_Volume_101070019

Version:        1
Cipher name:    aes
Cipher mode:    xts-plain64
Hash spec:      sha256
Payload offset: 4096
MK bits:        512
MK digest:      5e c6 d0 61 1a 25 f1 c3 09 f3 44 57 34 81 07 bb f6 29 77 0c 
MK salt:        8e ff 99 c9 4b 90 f3 bf b2 14 8f ee 7e 70 25 58 
                7d 42 b7 64 1d c7 8a e4 1f 4b 82 86 12 f1 37 b2 
MK iterations:  85892
UUID:           f03e2070-e239-40aa-bc3f-1e7d31cc5c83

Key Slot 0: ENABLED
        Iterations:             1466538
        Salt:                   34 a5 69 1a 38 d8 4c 18 74 14 39 0a d1 57 72 51 
                                28 08 d7 2f c5 38 d5 84 05 6d 72 0c 11 f9 fa 2c 
        Key material offset:    8
        AF stripes:             4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
# cryptsetup -v status scsi-0HC_Volume_101070019
/dev/mapper/scsi-0HC_Volume_101070019 is active.
  type:    LUKS1
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/sdo
  sector size:  512
  offset:  4096 sectors
  size:    20967424 sectors
  mode:    read/write
  flags:   discards 
Command successful.
# umount /dev/mapper/scsi-0HC_Volume_101070019
umount: can't unmount /dev/mapper/scsi-0HC_Volume_101070019: Invalid argument

Finally, I tried to mount it manually:

# mount -v -t xfs /dev/mapper/scsi-0HC_Volume_101070019 /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/mount
mount: mounting /dev/mapper/scsi-0HC_Volume_101070019 on /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/mount failed: No such file or directory

# mount -v -t xfs /dev/mapper/scsi-0HC_Volume_101070019 /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io\~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount
mount: mounting /dev/mapper/scsi-0HC_Volume_101070019 on /var/lib/kubelet/pods/89a645f0-d795-4675-a5b1-7af373588fa5/volumes/kubernetes.io~csi/pvc-5accc317-0126-40fc-bbb4-607de98b1e0a/mount failed: No such file or directory

# mount -v -t xfs /dev/mapper/scsi-0HC_Volume_101070019 /tmp/
mount: mounting /dev/mapper/scsi-0HC_Volume_101070019 on /tmp/ failed: Invalid argument
jooola commented 1 month ago

Sadly, the mount command does not give us enough details; could you please share the kernel logs after a mount failure? Same as above, after shelling into the hcloud-csi-node container, you could run:

dmesg

The kernel logs from the node might be useful, if the logs from within the hcloud-csi-node container are not helpful enough.
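
For reference, a sketch of pulling just the XFS-related kernel messages on the worker node itself (assuming shell access to the node):

$ dmesg -T | grep -i xfs | tail -n 20
$ journalctl -k --since "1 hour ago" | grep -i xfs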

bheisig commented 1 month ago

These lines are repeated dozens of times in dmesg:

[627973.852544] XFS (dm-12): Superblock has unknown incompatible features (0x20) enabled.
[627973.852549] XFS (dm-12): Filesystem cannot be safely mounted by this kernel.
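
For reference, incompatible feature bit 0x20 corresponds to the XFS "large extent counters" (nrext64) feature, which the 5.15 kernel used here does not support. A sketch for inspecting the superblock feature bits directly, assuming a sufficiently recent xfs_db from xfsprogs is available in the container:

$ xfs_db -r -c 'sb 0' -c 'print features_incompat' /dev/mapper/scsi-0HC_Volume_101070019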
jooola commented 1 month ago

This seems related to https://gitlab.alpinelinux.org/alpine/aports/-/issues/15744; could you maybe try the suggestions from that ticket to see if they solve the problem for you?

Also, could you provide us with information about the XFS partition, e.g. using xfs_info:

/ # xfs_info /dev/mapper/scsi-0HC_Volume_101083717 
meta-data=/dev/mapper/scsi-0HC_Volume_101083717 isize=512    agcount=4, agsize=655232 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=2620928, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Maybe running xfs_repair in dry-run mode could also provide us some information:

/ # xfs_repair -n /dev/mapper/scsi-0HC_Volume_101083717
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

Sorry for all this back and forth, I sadly still cannot reproduce this. Thanks a lot for continuing to provide us debugging information!

jooola commented 1 month ago

I suspect that the node's kernel must be updated to fix this, either by upgrading the kernel itself or by upgrading the worker to e.g. Ubuntu 24.04.

On our dev/test nodes, we currently have: Linux hcloud-csi-node-hjzfd 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 Linux

bheisig commented 1 month ago

After a recent upgrade we use the same kernel version on our worker nodes (still on Ubuntu 22.04):

# uname -a
Linux hcloud-csi-node-hk8cf 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 Linux

I've tried to format the filesystem manually but failed every time:

# mkfs.xfs -d size=1g -n ftype=1 -i rnext64=0 /dev/mapper/scsi-0HC_Volume_101070019
unknown option -i (null)

# mkfs.xfs -d size=1g -n ftype=1 -m rnext64=0 /dev/mapper/scsi-0HC_Volume_101070019
unknown option -m (null)

A bit unexpected: xfs_info returned some info:

# xfs_info /dev/mapper/scsi-0HC_Volume_101070019
meta-data=/dev/mapper/scsi-0HC_Volume_101070019 isize=512    agcount=4, agsize=655232 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=2620928, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=25650, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

But xfs_repair failed:

# xfs_repair -n /dev/mapper/scsi-0HC_Volume_101070019
Phase 1 - find and verify superblock...
Superblock has unknown compat/rocompat/incompat features (0x0/0x0/0x20).
Using a more recent xfs_repair is recommended.
Found unsupported filesystem features.  Exiting now.

> Sorry for all this back and forth, I sadly still cannot reproduce this. Thanks a lot for continuing to provide us debugging information!

No worries. Keep up with your good work :)

jooola commented 1 month ago

Did you run the commands above using the new csi-driver containers? If not, could you try that?

Between csi-driver 2.7.1 and 2.8.0, the alpine image was bumped from alpine 3.15 to alpine 3.20 https://github.com/hetznercloud/csi-driver/commit/1e18db30dcc4395847208ebcaba8232c9aba0cc2

I noticed some differences between our two filesystems (note that the xfs_info output above was produced with csi-driver 2.7.1); see the attached diff screenshot (< ours | > yours). [image]

Did you by any chance format the volume using custom flags?

The same command using the csi-driver 2.8.0:

/ # xfs_info /dev/mapper/scsi-0HC_Volume_101083717 
meta-data=/dev/mapper/scsi-0HC_Volume_101083717 isize=512    agcount=4, agsize=655232 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0 nrext64=0
data     =                       bsize=4096   blocks=2620928, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Your mkfs.xfs commands might have failed because those flags were probably not available in csi-driver 2.7.1 (Alpine 3.15).
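
For reference, a quick way to confirm which xfsprogs release a given container ships, since Alpine 3.15 and 3.20 bundle different versions (a sketch; run inside the hcloud-csi-driver container):

$ mkfs.xfs -V
$ xfs_repair -V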

bheisig commented 1 month ago

The container definitely runs v2.8.0:

$ kubectl -n kube-system describe pod hcloud-csi-node-45fzw 
Name:             hcloud-csi-node-45fzw
Namespace:        kube-system
Priority:         0
Service Account:  default
Node:             k8s-worker-3a771cd32e16a6cf/10.96.1.11
Start Time:       Tue, 30 Jul 2024 11:15:53 +0000
Labels:           app.kubernetes.io/component=node
                  app.kubernetes.io/instance=hcloud-csi
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=hcloud-csi
                  controller-revision-hash=796c69c9cd
                  helm.sh/chart=hcloud-csi-2.8.0
                  pod-template-generation=14
Annotations:      <none>
Status:           Running
IP:               10.98.5.57
IPs:
  IP:           10.98.5.57
Controlled By:  DaemonSet/hcloud-csi-node
Containers:
  csi-node-driver-registrar:
    Container ID:  containerd://148dbe7b15fc540cb1d9094737c6aa879f76211a7875046e82c9b0a3f0983a2f
    Image:         registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.7.0
    Image ID:      registry.k8s.io/sig-storage/csi-node-driver-registrar@sha256:4a4cae5118c4404e35d66059346b7fa0835d7e6319ff45ed73f4bba335cf5183
    Port:          <none>
    Host Port:     <none>
    Args:
      --kubelet-registration-path=/var/lib/kubelet/plugins/csi.hetzner.cloud/socket
    State:          Running
      Started:      Tue, 30 Jul 2024 11:15:54 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     50m
      memory:  40Mi
    Requests:
      cpu:        10m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /registration from registration-dir (rw)
      /run/csi from plugin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4fj5b (ro)
  liveness-probe:
    Container ID:   containerd://0b4e880ef5821fccc5bcbfc27de10501788693d913e6d1b4a6b94573c9627050
    Image:          registry.k8s.io/sig-storage/livenessprobe:v2.9.0
    Image ID:       registry.k8s.io/sig-storage/livenessprobe@sha256:2b10b24dafdc3ba94a03fc94d9df9941ca9d6a9207b927f5dfd21d59fbe05ba0
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 30 Jul 2024 11:15:54 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     50m
      memory:  40Mi
    Requests:
      cpu:        10m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /run/csi from plugin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4fj5b (ro)
  hcloud-csi-driver:
    Container ID:  containerd://1839db61d35613e4c9793aeef23f1841824de8083b4ce39103ad8905c17d1b72
    Image:         docker.io/hetznercloud/hcloud-csi-driver:v2.8.0
    Image ID:      docker.io/hetznercloud/hcloud-csi-driver@sha256:ca7745faf7ef9c478204382ef98cb1339176a64592331de1668c297a920564c7
    Ports:         9189/TCP, 9808/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /bin/hcloud-csi-driver-node
    State:          Running
      Started:      Tue, 30 Jul 2024 11:15:54 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  80Mi
    Requests:
      cpu:     10m
      memory:  40Mi
    Liveness:  http-get http://:healthz/healthz delay=10s timeout=3s period=2s #success=1 #failure=5
    Environment:
      CSI_ENDPOINT:      unix:///run/csi/socket
      METRICS_ENDPOINT:  0.0.0.0:9189
      ENABLE_METRICS:    true
      LOG_LEVEL:         debug
      HCLOUD_DEBUG:      true
    Mounts:
      /dev from device-dir (rw)
      /run/csi from plugin-dir (rw)
      /var/lib/kubelet from kubelet-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4fj5b (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       True 
  ContainersReady             True 
  PodScheduled                True 
Volumes:
  kubelet-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet
    HostPathType:  Directory
  plugin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins/csi.hetzner.cloud/
    HostPathType:  DirectoryOrCreate
  registration-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins_registry/
    HostPathType:  Directory
  device-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:  Directory
  kube-api-access-4fj5b:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 :NoExecute op=Exists
                             :NoSchedule op=Exists
                             CriticalAddonsOnly op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  46m   default-scheduler  Successfully assigned kube-system/hcloud-csi-node-45fzw to k8s-worker-3a771cd32e16a6cf
  Normal  Pulled     46m   kubelet            Container image "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.7.0" already present on machine
  Normal  Created    46m   kubelet            Created container csi-node-driver-registrar
  Normal  Started    46m   kubelet            Started container csi-node-driver-registrar
  Normal  Pulled     46m   kubelet            Container image "registry.k8s.io/sig-storage/livenessprobe:v2.9.0" already present on machine
  Normal  Created    46m   kubelet            Created container liveness-probe
  Normal  Started    46m   kubelet            Started container liveness-probe
  Normal  Pulled     46m   kubelet            Container image "docker.io/hetznercloud/hcloud-csi-driver:v2.8.0" already present on machine
  Normal  Created    46m   kubelet            Created container hcloud-csi-driver
  Normal  Started    46m   kubelet            Started container hcloud-csi-driver
# xfs_info /dev/mapper/scsi-0HC_Volume_101070019 
meta-data=/dev/mapper/scsi-0HC_Volume_101070019 isize=512    agcount=4, agsize=655232 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
data     =                       bsize=4096   blocks=2620928, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=25650, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Custom flags are not used.

patfeiler commented 1 month ago

Hey,

I am a coworker of @bheisig. We did a full cleanup of all affected PVCs and redeployed csi-driver v2.8.0. Then I tried a deployment with encrypted XFS and got the same behavior as described above.

Then I formatted the volume directly via the hcloud-csi-driver container.

kubectl exec -it -n kube-system hcloud-csi-node-v25r4 -c hcloud-csi-driver ash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # mkfs.xfs -d size=1g -n ftype=1 -i nrext64=0 /dev/mapper/scsi-0HC_Volume_101086364 -f
meta-data=/dev/mapper/scsi-0HC_Volume_101086364 isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=22482, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
/ # 

After that, the volume could be mounted and was fully functional.

Logs from the broken state:

level=error ts=2024-07-30T12:54:07.312859845Z component=grpc-server msg="handler failed" err="rpc error: code = Internal desc = failed to publish volume: mount failed: exit status 255\nMounting command: mount\nMounting arguments: -t xfs /dev/mapper/scsi-0HC_Volume_101086364 /var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount\nOutput: mount: mounting /dev/mapper/scsi-0HC_Volume_101086364 on /var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount failed: Invalid argument\n"
level=info ts=2024-07-30T12:55:11.411331077Z component=linux-mount-service msg="publishing volume" target-path=/var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount device-path=/dev/mapper/scsi-0HC_Volume_101086364 fs-type=xfs block-volume=false readonly=false mount-options= encrypted=true
level=error ts=2024-07-30T12:55:11.414147489Z component=grpc-server msg="handler failed" err="rpc error: code = Internal desc = failed to publish volume: mount failed: exit status 255\nMounting command: mount\nMounting arguments: -t xfs /dev/mapper/scsi-0HC_Volume_101086364 /var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount\nOutput: mount: mounting /dev/mapper/scsi-0HC_Volume_101086364 on /var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount failed: Invalid argument\n"
E0730 12:55:11.414095       1 mount_linux.go:236] Mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t xfs /dev/mapper/scsi-0HC_Volume_101086364 /var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount
Output: mount: mounting /dev/mapper/scsi-0HC_Volume_101086364 on /var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount failed: Invalid argument

Logs after the fix:

level=info ts=2024-07-30T12:57:13.496861452Z component=linux-mount-service msg="publishing volume" target-path=/var/lib/kubelet/pods/494bd080-4f4a-4efb-a237-10e68a972799/volumes/kubernetes.io~csi/pvc-dc2ab109-b9a2-4e35-ad60-77df1914f811/mount device-path=/dev/mapper/scsi-0HC_Volume_101086364 fs-type=xfs block-volume=false readonly=false mount-options= encrypted=true
jooola commented 1 month ago

We finally managed to reproduce the bug. Setting nrext64=0 does indeed solve the problem.

A mismatch between the XFS features supported by the csi-driver container (Alpine) and those supported by the node kernel is the source of this bug: the csi-driver formats the file system with the nrext64=1 flag, but the mount fails because the node kernel does not understand that feature.

We will release a minimal bug fix that forces the use of nrext64=0.

We are also considering supporting only a minimum kernel version (e.g. >=5.15) by default, for example by using mkfs.xfs -f -c options=/usr/share/xfsprogs/mkfs/lts_5.15.conf /dev/mapper/scsi-0HC_Volume_101135663

$ cat /usr/share/xfsprogs/mkfs/lts_5.15.conf
# V5 features that were the mkfs defaults when the upstream Linux 5.15 LTS
# kernel was released at the end of 2021.

[metadata]
bigtime=1
crc=1
finobt=1
inobtcount=1
reflink=1
rmapbt=0

[inode]
sparse=1
nrext64=0
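
For reference, a sketch of formatting against that 5.15 LTS profile and verifying that the resulting superblock no longer enables the nrext64 feature (device path reused from the command above; note that this reformats the volume and destroys its contents):

$ mkfs.xfs -f -c options=/usr/share/xfsprogs/mkfs/lts_5.15.conf /dev/mapper/scsi-0HC_Volume_101135663
$ xfs_info /dev/mapper/scsi-0HC_Volume_101135663 | grep -o 'nrext64=[01]'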