kubernetes-sigs / aws-ebs-csi-driver

CSI driver for Amazon EBS https://aws.amazon.com/ebs/
Apache License 2.0

Driver volume type selection inconsistency #2092

Open maximveksler opened 1 month ago

maximveksler commented 1 month ago

/kind bug

What happened? Using a StorageClass with parameters.type: gp3 together with the annotation ebs.csi.aws.com/volumeType: io2 produces different results depending on whether the ebs.csi.aws.com/throughput annotation is present.

What you expected to happen?

The volume should always be created with type io2, or specifying the volume type through an annotation should at least behave consistently.

How to reproduce it (as minimally and precisely as possible)?

This produces a volume of type io2:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: t1
  namespace: default
  annotations:
    ebs.csi.aws.com/iops: '9005'
    ebs.csi.aws.com/volumeType: io2
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 505Gi
  storageClassName: ebs
  volumeMode: Filesystem

---

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs
  annotations:
    storageclass.kubernetes.io/is-default-class: 'true'
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate

This produces a volume of type gp3:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: t1
  namespace: default
  annotations:
    ebs.csi.aws.com/iops: '9005'
    ebs.csi.aws.com/volumeType: io2
    ebs.csi.aws.com/throughput: "805"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 505Gi
  storageClassName: ebs
  volumeMode: Filesystem

---

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs
  annotations:
    storageclass.kubernetes.io/is-default-class: 'true'
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate

Anything else we need to know?:

Environment

maximveksler commented 1 month ago

A possible workaround is to set throughput to the string "0", i.e.:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: t1
  namespace: default
  annotations:
    ebs.csi.aws.com/iops: '9005'
    ebs.csi.aws.com/volumeType: io2
    ebs.csi.aws.com/throughput: '0'
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 505Gi
  storageClassName: ebs
  volumeMode: Filesystem

torredil commented 1 month ago

Hey there @maximveksler, thanks for the report. This behavior is actually intended and not a bug.

The inconsistency you're seeing is due to the throughput parameter being incompatible with io2 volumes. This is a limitation imposed by the EC2 API, not the driver itself:

   E0722 14:58:42.764024       1 driver.go:107] "GRPC error" err="rpc error: code = InvalidArgument desc = Could not modify volume (invalid argument) \"vol-0f6a723553b9b4e15\": invalid argument: operation error EC2: ModifyVolume, https response error StatusCode: 400, RequestID: 14f611a3-03bd-4d49-a9b6-907d2dfa3c0e, api error InvalidParameterCombination: The throughput parameter is not supported for io2 volumes."

The workaround of setting ebs.csi.aws.com/throughput: '0' works by effectively removing the throughput specification. For consistent results, it is recommended to only specify the throughput annotation for volume types that support it, such as gp3.
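As an illustration of that recommendation, here is a sketch of a PVC that keeps the throughput annotation but pairs it with a volume type that accepts it. It reuses the names and values from the report above; the only change is volumeType: gp3, and the specific iops/throughput numbers are assumed to fall within the gp3 limits rather than taken from the driver docs.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: t1
  namespace: default
  annotations:
    # throughput is only specified alongside a type that supports it (gp3)
    ebs.csi.aws.com/volumeType: gp3
    ebs.csi.aws.com/iops: '9005'
    ebs.csi.aws.com/throughput: '805'
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 505Gi
  storageClassName: ebs
  volumeMode: Filesystem
```

For an io2 volume, the equivalent is the first manifest in the report: the same annotations with the throughput line left out.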

maximveksler commented 1 month ago

Hi, yes, I'm aware it's at the API level.

Shouldn't the driver mask out this behaviour? The user intent is likely to have io2 provisioned.

The other option (less preferred) is to produce an error instead of attempting to apply the change.

The current result is highly error-prone IMO, specifically because no exception is raised.

torredil commented 1 month ago

CSI drivers do not have access to PVC annotations during the CreateVolume RPC, which is why creating the intended volume is a two-step process when using volume-modifier-for-k8s:

  1. The volume is initially created using the volume type defined in the StorageClass.
  2. The volume-modifier sidecar attempts to modify the volume in order to reconcile the intended state, based on the PVC annotations. This is when we report an error, but as you point out, the volume is still created.

The good news is that this limitation is addressed by a new Kubernetes feature, VolumeAttributesClass (VAC), which is currently alpha. With VAC, we have access to the specified mutable parameters at CreateVolume time, and they override StorageClass parameters. Thus incompatible configurations like the one described in this thread will cause volume creation to fail, instead of leaving behind a volume that doesn't meet the intended requirements.
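For reference, a rough sketch of what the VAC (VolumeAttributesClass) approach could look like for the case in this thread. Treat the details as assumptions: the class name io2-class is made up for illustration, the apiVersion depends on the cluster version (v1alpha1 while the feature is alpha), the VolumeAttributesClass feature gate has to be enabled, and the parameter keys are assumed to mirror the StorageClass ones.

```yaml
# Hypothetical class name; apiVersion assumes the alpha API is served.
apiVersion: storage.k8s.io/v1alpha1
kind: VolumeAttributesClass
metadata:
  name: io2-class
driverName: ebs.csi.aws.com
parameters:
  type: io2
  iops: '9005'

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: t1
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 505Gi
  storageClassName: ebs
  # References the class above instead of using ebs.csi.aws.com/* annotations
  volumeAttributesClassName: io2-class
  volumeMode: Filesystem
```

With this shape, adding a throughput parameter to an io2 class would be expected to fail at CreateVolume rather than silently leaving a gp3 volume behind, per the explanation above.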