kubernetes-sigs / aws-ebs-csi-driver

CSI driver for Amazon EBS https://aws.amazon.com/ebs/
Apache License 2.0

Failing to mount XFS volumes #2110

Closed · mpb10 closed this issue 2 months ago

mpb10 commented 3 months ago

/kind bug

We're running into an issue with the aws-ebs-csi-driver on Ubuntu 20.04 worker nodes: we believe the driver is formatting XFS volumes incorrectly, so they can't be mounted into the pod's containers.

What happened? We get the following pod error when trying to create and mount an XFS volume in a pod running on an Ubuntu 20.04 worker node:

Events:

  Type     Reason                  Age               From                     Message
  ----     ------                  ----              ----                     -------
  Normal   Scheduled               15s               default-scheduler        Successfully assigned default/test-78 to hostname.us-gov-west-1.compute.internal
  Normal   SuccessfulAttachVolume  13s               attachdetach-controller  AttachVolume.Attach succeeded for volume "pv-shoot--test-matt--tm-05-redacted-bc87-4da1-901c-redacted"
  Warning  FailedMount             4s (x5 over 12s)  kubelet                  MountVolume.MountDevice failed for volume "pv-shoot--test-matt--tm-05-redacted-bc87-4da1-901c-redacted" : rpc error: code = Internal desc = could not format "/dev/nvme1n1" and mount it at "/var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/dbca68fc6e1a04b138beae1f4c5b1657e93fb466483bacd4c910869931d34105/globalmount": mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o nouuid,defaults /dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/dbca68fc6e1a04b138beae1f4c5b1657e93fb466483bacd4c910869931d34105/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/dbca68fc6e1a04b138beae1f4c5b1657e93fb466483bacd4c910869931d34105/globalmount: wrong fs type, bad option, bad superblock on /dev/nvme1n1, missing codepage or helper program, or other error.
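
The generic exit status 32 from mount says little on its own; the kernel's side of the failure can be inspected on the worker node, for example:

# What does a low-level probe say the device contains?
blkid -p /dev/nvme1n1

# The kernel logs the real reason for the XFS mount failure
dmesg | grep XFS | tail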

What you expected to happen? We expect the XFS volume to be mounted automatically by the CSI driver without any errors.

How to reproduce it (as minimally and precisely as possible)? We use the following manifest to reproduce the error:

# This manifest creates a StorageClass, a PersistentVolumeClaim, and a Pod that uses the claim to mount a volume.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: test-78
  labels:
    test: test-78
    testpod: test
provisioner: ebs.csi.aws.com
parameters:
  csi.storage.k8s.io/fstype: xfs
  encrypted: 'true'
  type: gp2
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-78
  labels:
    test: test-78
    testpod: test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: test-78
  volumeMode: Filesystem
---
kind: Pod
apiVersion: v1
metadata:
  name: test-78
  labels:
    test: test-78
    testpod: test
spec:
  containers:
    - name: my-frontend
      image: busybox
      volumeMounts:
      - mountPath: "/data"
        name: test-volume-01
      command: [ "sleep", "1000000" ]
  volumes:
    - name: test-volume-01
      persistentVolumeClaim:
        claimName: test-78

Our worker nodes are Ubuntu 20.04 AWS EC2 instances running in FIPS mode. The issue does not occur on other OSes, such as SUSE chost images. We also tried disabling FIPS mode, but that didn't make a difference:

root@hostname:/# uname -a
Linux hostname.us-gov-west-1.compute.internal 5.4.0-1128-aws-fips #138+fips1-Ubuntu SMP Sat Jun 29 00:01:34 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

We've ensured that our xfsprogs package, which supplies mkfs.xfs, is up to date:

root@ip-10-250-13-69:/# apt-get upgrade xfsprogs
Reading package lists... Done
Building dependency tree
Reading state information... Done
xfsprogs is already the newest version (5.3.0-1ubuntu2).
root@ip-10-250-13-69:/# mkfs.xfs -V
mkfs.xfs version 5.3.0

We're using the following Kubernetes version:

k version
Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.3

We've replicated this on aws-ebs-csi-driver versions v1.29.0 and v1.33.0; changing the driver version doesn't seem to make a difference. We also don't see any mention of this issue in the changelog.

Anything else we need to know?: This error does not occur if the filesystem type is ext4 or if we change the worker node OS to something other than Ubuntu 20.04.

We believe the CSI driver is formatting the XFS volumes incorrectly: if we SSH into the worker node and manually re-format the XFS volume that is failing to mount, the aws-ebs-csi-driver can then mount it without any issues.

Additionally, the xfs_info values for the XFS volume differ slightly between the driver's format and a manual re-format, even though both commands use mkfs.xfs's default parameters.

This is the xfs_info of the XFS volume after it is formatted by the aws-ebs-csi-driver:

root@hostname:/# xfs_info /dev/nvme1n1
meta-data=/dev/nvme1n1           isize=512    agcount=16, agsize=163840 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=1      swidth=1 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

And this is the xfs_info of the same XFS volume after it is re-formatted manually and can be mounted without issue. Notice that the only value that changed is blocks under the log section:

root@hostname:/# mkfs.xfs -f /dev/nvme1n1
meta-data=/dev/nvme1n1           isize=512    agcount=16, agsize=163840 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=1      swidth=1 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

As a temporary workaround, we've created a DaemonSet that watches the worker nodes for newly created XFS volumes, tests whether they mount successfully, and re-formats them automatically if they don't. While this works, it's risky and not production-worthy.
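
Roughly, it has the shape of the sketch below (names, the device glob, and the polling loop are simplified placeholders; a real version must make sure it only ever reformats freshly provisioned, empty volumes, since mkfs.xfs -f destroys data):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: xfs-remediator
spec:
  selector:
    matchLabels:
      app: xfs-remediator
  template:
    metadata:
      labels:
        app: xfs-remediator
    spec:
      containers:
        - name: remediator
          image: ubuntu:20.04
          securityContext:
            privileged: true   # needed to see host block devices and run mount/mkfs
          command:
            - /bin/sh
            - -c
            - |
              apt-get update && apt-get install -y xfsprogs util-linux
              mkdir -p /tmp/probe
              while true; do
                for dev in /dev/nvme[1-9]n1; do
                  [ -b "$dev" ] || continue
                  # Only consider devices already formatted as XFS
                  [ "$(blkid -p -s TYPE -o value "$dev")" = "xfs" ] || continue
                  # A read-write test mount fails on volumes formatted with
                  # features the v5.4 kernel doesn't support
                  if mount -t xfs "$dev" /tmp/probe 2>/dev/null; then
                    umount /tmp/probe
                  else
                    # Reformat with the host-compatible xfsprogs (5.3 on 20.04)
                    mkfs.xfs -f "$dev"
                  fi
                done
                sleep 10
              done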

Thanks!

torredil commented 3 months ago

/assign

mpb10 commented 3 months ago

By the way, these are the logs from the CSI driver when a new XFS volume fails to be mounted:

I0812 20:16:00.182375       1 driver.go:69] "Driver Information" Driver="ebs.csi.aws.com" Version="v1.33.0"
I0812 20:16:00.182570       1 driver.go:138] "Listening for connections" address="/csi/csi.sock"
I0812 20:17:21.329399       1 mount_linux.go:634] Attempting to determine if disk "/dev/nvme1n1" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/nvme1n1])
I0812 20:17:21.345634       1 mount_linux.go:637] Output: ""
I0812 20:17:21.345656       1 mount_linux.go:572] Disk "/dev/nvme1n1" appears to be unformatted, attempting to format as type: "xfs" with options: [-f /dev/nvme1n1]
I0812 20:17:21.690774       1 mount_linux.go:583] Disk successfully formatted (mkfs): xfs - /dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount
I0812 20:17:21.690808       1 mount_linux.go:601] Attempting to mount disk /dev/nvme1n1 in xfs format at /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount
I0812 20:17:21.690917       1 mount_linux.go:249] Detected OS without systemd
I0812 20:17:21.690927       1 mount_linux.go:224] Mounting cmd (mount) with arguments (-t xfs -o nouuid,defaults /dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount)
E0812 20:17:21.698938       1 mount_linux.go:236] Mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o nouuid,defaults /dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount: wrong fs type, bad option, bad superblock on /dev/nvme1n1, missing codepage or helper program, or other error.

E0812 20:17:21.699007       1 driver.go:108] "GRPC error" err=<
    rpc error: code = Internal desc = could not format "/dev/nvme1n1" and mount it at "/var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount": mount failed: exit status 32
    Mounting command: mount
    Mounting arguments: -t xfs -o nouuid,defaults /dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount
    Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/044d53b8de23a158c12416d02e5f15f2e4e960decdf224387c8ab41902205426/globalmount: wrong fs type, bad option, bad superblock on /dev/nvme1n1, missing codepage or helper program, or other error.
 >

Manually re-formatting the XFS volume on the worker node using mkfs.xfs -f /dev/nvme1n1 will allow the CSI driver to automatically mount the volume after a second or two.

torredil commented 3 months ago

Thanks for the very detailed bug report @mpb10.

This issue is caused by a compatibility mismatch between the version of xfsprogs used by the driver and the kernel version on the worker nodes: the driver uses xfsprogs v5.18, which formats XFS volumes with features that require kernel v5.18 or higher. However, as noted above, the custom Ubuntu 20.04 worker nodes are running an older kernel (v5.4), which does not support the newer XFS features.

Relevant dmesg output:

[ 7383.213514] XFS (nvme1n1): Superblock has unknown read-only compatible features (0x8) enabled.
[ 7383.214947] XFS (nvme1n1): Attempted to mount read-only compatible filesystem read-write.
[ 7383.214948] XFS (nvme1n1): Filesystem can only be safely mounted read only.
[ 7383.214959] XFS (nvme1n1): SB validate failed with error -22.

> Manually re-formatting the XFS volume on the worker node using mkfs.xfs -f /dev/nvme1n1 will allow the CSI driver to automatically mount the volume after a second or two.

The fact that manually reformatting the volume using the host's older xfsprogs version (v5.3.0) resolves the issue further confirms that the problem lies in the kernel's inability to mount volumes formatted by the newer xfsprogs version used by the driver.
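
To double-check the mismatch on an affected node, you can compare the kernel version against the feature bits mkfs wrote into the superblock (a quick sketch; the field name assumes a v5-superblock-aware xfs_db):

# Node kernel (v5.4 here, older than what the on-disk format expects)
uname -r

# Read-only compatible feature bits in the superblock; the unknown 0x8 bit
# from dmesg corresponds to the inode btree counters (inobtcount) feature,
# which the v5.4 kernel does not recognize
xfs_db -r -c 'sb 0' -c 'print features_ro_compat' /dev/nvme1n1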


Ideally, the best solution here would be upgrading the kernel or using an AMI that includes a more recent kernel version : )

I understand this may be challenging or not feasible. In that case, viable workarounds include formatting the volumes with the older xfsprogs version available on the host before the driver mounts them (as you are currently doing), or using statically provisioned volumes that are pre-formatted.
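
For the statically provisioned route, a pre-formatted volume can be wired up with a PersistentVolume along these lines (a sketch; the volume ID and names are placeholders, and the PVC must reference it via volumeName with an empty storageClassName):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: preformatted-xfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  csi:
    driver: ebs.csi.aws.com
    # EBS volume pre-formatted on a host whose xfsprogs matches the kernel
    volumeHandle: vol-0123456789abcdef0
    fsType: xfs
  # A real PV would also pin topology via nodeAffinity to the volume's AZ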

I'll discuss this pain point with the team during our next sync-up and follow up here with the long term view for this class of issue.

mpb10 commented 3 months ago

Thank you for the fast response to this!

To add some context, the Ubuntu 20.04 kernel we're using is Canonical's FIPS-enabled kernel, which only goes up to version 5.4 at the moment. We have a requirement to use FIPS-enabled kernels, so unfortunately we're unable to upgrade to kernel v5.18 to solve this issue. Also, Canonical doesn't have exact dates for when a FIPS-enabled kernel will be officially available for Ubuntu 22.04, so we will likely have to stay on 5.4 for a while.

Also, having to manually format our volumes during the provisioning process is rather inefficient and breaks our automation workflows, so this solution isn't ideal either.

I think having the ability to build our own version of the CSI driver image with an older version of xfsprogs would be great and would solve our issue, although I understand that isn't recommended by you guys. Any other way to opt out of the updated xfsprogs version would solve our issue too.
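
For illustration, one (unsupported, untested) way to do that would be repackaging the driver binary onto a base image that ships the older xfsprogs; the upstream image tag and binary path here are assumptions:

# Hypothetical sketch: run the driver on an Ubuntu 20.04 userland whose
# xfsprogs (5.3) matches what the v5.4 host kernel understands
FROM public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver:v1.33.0 AS upstream

FROM ubuntu:20.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends xfsprogs e2fsprogs util-linux ca-certificates && \
    rm -rf /var/lib/apt/lists/*
COPY --from=upstream /bin/aws-ebs-csi-driver /bin/aws-ebs-csi-driver
ENTRYPOINT ["/bin/aws-ebs-csi-driver"]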

Thanks!

torredil commented 3 months ago

Thanks for that feedback @mpb10, it's very helpful.

The team is looking to implement a new optional parameter on the node plugin to let users disable some of the newer XFS formatting features. This should solve the compatibility issues you're seeing with the older kernel.

To be clear, this will be an opt-in feature. We're doing it this way to preserve existing behavior, and more importantly, disabling the newer XFS features may result in other compatibility issues down the road.

Relevant WIP PR: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/2121 - feel free to leave further feedback/questions either here or directly on the PR.

mpb10 commented 3 months ago

This is great! This solution will work perfectly for us, and we eagerly await it.

Thank you very much @torredil and team!

chethan-das commented 3 months ago

Hi @torredil, is there an ETA for releasing the opt-in feature proposed for this issue? Thank you, Chethan Das

ConnorJC3 commented 3 months ago

@chethan-das We're actively working on this feature and we hope to release it in the near future but won't have a firm ETA until it's fully ready and tested. I (or somebody else from the team) will update this issue when we have a firm ETA or other information available.

AndrewSirenko commented 2 months ago

/close

This should be fixed in aws-ebs-csi-driver v1.35.0, and it has been tested by a user on nodes with Linux kernel versions ≤ 5.4. Thank you for raising this issue!

Please set the `node.legacyXFS` Helm chart parameter to true to format XFS volumes with `bigtime=0,inobtcount=0,reflink=0`, so that they can be mounted on nodes with Linux kernel ≤ 5.4. Warning: volumes formatted with this option may experience issues after 2038 and will be unable to use some XFS features (for example, reflinks).
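
With the Helm chart's documented install flow, that looks something like:

helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm repo update
helm upgrade --install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver \
  --namespace kube-system \
  --set node.legacyXFS=true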

See our driver options documentation or PR #2121 for more details.

aghassemlouei commented 1 month ago

Just wanted to drop in and say thank you to @AndrewSirenko and @ConnorJC3 as this really helped us!