kubernetes-sigs / aws-ebs-csi-driver

CSI driver for Amazon EBS https://aws.amazon.com/ebs/
Apache License 2.0

Now-standard AWS symlink from /dev/xvdaa to /dev/nvme1n1 breaks the driver for minikube #2156

Open stevemadere opened 1 month ago

stevemadere commented 1 month ago

/kind bug

**What happened?**

EBS volumes are created and attached without problems, but they are then unavailable to Kubernetes running under minikube.

AWS EC2 instances these days often have no /dev/xvdaa device for attaching EBS volumes. Instead, the volume is attached as an NVMe device such as /dev/nvme1n1, and a symlink is then created: /dev/xvdaa -> /dev/nvme1n1
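To confirm where a device name really points, a small check like the one below can be run on the EC2 host. This is a minimal sketch: `resolve_device` is a made-up helper name, not part of the driver, and it relies only on `readlink -f`.

```bash
# Print the real path behind a device name, following any symlinks.
# resolve_device is a hypothetical helper used for illustration only.
resolve_device() {
  local dev="$1"
  if [ ! -e "$dev" ]; then
    # Covers both a missing name and a dangling symlink.
    echo "missing: $dev" >&2
    return 1
  fi
  readlink -f "$dev"
}

# Example: resolve_device /dev/xvdaa
```

With the symlink described above in place, `resolve_device /dev/xvdaa` should print `/dev/nvme1n1`.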

When the device is exposed this way, MountVolume.MountDevice fails:

Events:
  Type     Reason                  Age                   From                     Message
  Normal   Scheduled               14m                   default-scheduler        Successfully assigned default/postgres-deployment-7b6556544c-8ztdk to minikube
  Normal   SuccessfulAttachVolume  14m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-378c1c8f-500c-4bee-a16d-2ab9457a5511"
  Warning  FailedMount             2m31s (x14 over 14m)  kubelet                  MountVolume.MountDevice failed for volume "pvc-378c1c8f-500c-4bee-a16d-2ab9457a5511" : rpc error: code = Internal desc = Failed to find device path /dev/xvdaa. no device path for device "/dev/xvdaa" volume "vol-0f1bd87aca480b924" found
  Warning  FailedMount             4s (x8 over 68s)      kubelet                  MountVolume.MountDevice failed for volume "pvc-378c1c8f-500c-4bee-a16d-2ab9457a5511" : rpc error: code = Internal desc = Failed to find device path /dev/xvdaa. no device path for device "/dev/xvdaa" volume "vol-0f1bd87aca480b924" found

While investigating, I found that the minikube container itself does not seem to be aware of /dev/xvdaa:

[ec2-user@ip-172-31-31-77 ~]$ minikube ssh
docker@minikube:~$ ls -l /dev/xv* /dev/nvm*
ls: cannot access '/dev/xv*': No such file or directory
crw------- 1 root root 250, 0 Sep 20 20:30 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Sep 20 20:30 /dev/nvme0n1
brw-rw---- 1 root disk 259, 1 Sep 20 20:30 /dev/nvme0n1p1
brw-rw---- 1 root disk 259, 2 Sep 20 20:30 /dev/nvme0n1p128
crw------- 1 root root 250, 1 Sep 20 20:30 /dev/nvme1
brw-rw---- 1 root disk 259, 3 Sep 20 20:30 /dev/nvme1n1
docker@minikube:~$

But the volume is definitely attached and AWS claims it's attached to /dev/xvdaa:

[ec2-user@ip-172-31-31-77 ~]$ aws ec2 describe-volumes --volume-id vol-0f1bd87aca480b924 --region us-west-2
{
    "Volumes": [
        {
            "AvailabilityZone": "us-west-2b",
            "Attachments": [
                {
                    "AttachTime": "2024-09-20T20:17:13.000Z",
                    "InstanceId": "i-0a6641c5d48bab3e8",
                    "VolumeId": "vol-0f1bd87aca480b924",
                    "State": "attached",
                    "DeleteOnTermination": false,
                    "Device": "/dev/xvdaa"
                }
            ],
            "Tags": [
                {
                    "Value": "pvc-378c1c8f-500c-4bee-a16d-2ab9457a5511",
                    "Key": "kubernetes.io/created-for/pv/name"
                },
                {
                    "Value": "pvc-378c1c8f-500c-4bee-a16d-2ab9457a5511",
                    "Key": "CSIVolumeName"
                },
                {
                    "Value": "default",
                    "Key": "kubernetes.io/created-for/pvc/namespace"
                },
                {
                    "Value": "true",
                    "Key": "ebs.csi.aws.com/cluster"
                },
                {
                    "Value": "pg-data-pvc",
                    "Key": "kubernetes.io/created-for/pvc/name"
                }
            ],
            "Encrypted": false,
            "VolumeType": "gp3",
            "VolumeId": "vol-0f1bd87aca480b924",
            "State": "in-use",
            "Iops": 3000,
            "SnapshotId": "",
            "CreateTime": "2024-09-20T20:17:10.322Z",
            "MultiAttachEnabled": false,
            "Size": 8
        }
    ]
}
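The `Device` value in the attachment is the name requested at attach time, which is not necessarily the name the kernel exposes. On NVMe-backed instances, udev typically also creates a link under `/dev/disk/by-id` whose name embeds the volume ID with the hyphen removed. The sketch below builds that path. `ebs_by_id_path` is a hypothetical helper name, and the link only exists where udev has populated `/dev/disk/by-id`, which may not be the case inside the minikube container.

```bash
# Build the expected /dev/disk/by-id path for an EBS volume on an NVMe instance.
# ebs_by_id_path is a hypothetical helper name used for illustration only.
ebs_by_id_path() {
  local serial
  # The NVMe serial is the volume ID without the hyphen:
  # vol-0f1bd87aca480b924 -> vol0f1bd87aca480b924
  serial=$(printf '%s' "$1" | tr -d '-')
  printf '/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_%s\n' "$serial"
}

# Example: ls -l "$(ebs_by_id_path vol-0f1bd87aca480b924)"
```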

**What you expected to happen?**

After the volume is attached, the device can be mounted even if the attachment Device name is a symlink.
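One way a lookup could tolerate this is to try the attachment name first and fall back to the NVMe by-id link, resolving symlinks in either case. The following is only a sketch of that idea in shell, not the driver's implementation. `find_device_path` and its `DEV_ROOT` argument are assumptions made for illustration, and the fallback assumes the by-id link exists in the environment where it runs.

```bash
# Hypothetical lookup: try the attachment name first, then the NVMe by-id link.
# Usage: find_device_path DEVICE_NAME VOLUME_ID [DEV_ROOT]
# DEV_ROOT defaults to /dev; it is a parameter only so the sketch can be
# tried against a scratch directory.
find_device_path() {
  local name="$1" vol="$2" root="${3:-/dev}"
  local serial by_id candidate
  serial=$(printf '%s' "$vol" | tr -d '-')
  by_id="$root/disk/by-id/nvme-Amazon_Elastic_Block_Store_$serial"
  for candidate in "$root/${name#/dev/}" "$by_id"; do
    if [ -e "$candidate" ]; then
      # Follow symlinks so the caller gets the real device node.
      readlink -f "$candidate"
      return 0
    fi
  done
  echo "no device found for $name ($vol)" >&2
  return 1
}

# Example: find_device_path /dev/xvdaa vol-0f1bd87aca480b924
```

Checking the requested name first keeps the behavior unchanged where that name exists, and the by-id fallback only matters when it does not.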

**How to reproduce it (as minimally and precisely as possible)?**
```bash
helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver --namespace kube-system
```

pvc.yaml:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: ebs-sc
```

**Anything else we need to know?**

**Environment**

torredil commented 3 weeks ago

Hey @stevemadere, thanks for reporting this : )

We will treat this as a feature request. Currently, the driver is not officially supported in minikube environments, and our team has not tested it there. It seems that there is a big opportunity to improve the resiliency of FindDevicePath here.

/kind feature