kubernetes-sigs / gcp-compute-persistent-disk-csi-driver

The Google Compute Engine Persistent Disk (GCE PD) Container Storage Interface (CSI) Storage Plugin.
Apache License 2.0

hyperdisk-balanced topology issues #1820

Open jsafrane opened 2 weeks ago

jsafrane commented 2 weeks ago

hyperdisk-balanced disks are not usable on most (?) VMs. Similarly, regular persistent disks are not usable on N4/C4 VMs. This makes scheduling Pods that use hyperdisk-balanced PVs challenging on clusters with mixed VM types, say N2 and N4.

Are there any guidelines on how to configure the CSI driver and StorageClasses so that PVCs scheduled to N4 VMs use hyperdisk-balanced disks and PVCs scheduled to N2 VMs use standard PDs?

Right now I can imagine putting all N4 machines into a single availability zone and making sure that there is no N2 VM there. I can then create two dedicated StorageClasses:

  1. hyperdisk: with allowedTopologies targeting the availability zone with N4 machines + type: hyperdisk-balanced.
  2. disk: with allowedTopologies targeting all other AZs with type: pd-standard.

The scheduler is then able to choose the right nodes for Pods that use PVs provisioned from these StorageClasses. But it's quite cumbersome to set up.
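
For illustration, the two StorageClasses described above might look roughly like the sketch below. The zone names (`us-central1-a` for the N4-only zone, `us-central1-b` and `us-central1-c` for the rest) are hypothetical placeholders, and `volumeBindingMode: WaitForFirstConsumer` is an assumption added here so that provisioning waits for Pod scheduling; neither is stated in the issue.

```yaml
# Sketch only: zone values are hypothetical and must match the real cluster layout.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hyperdisk
provisioner: pd.csi.storage.gke.io
parameters:
  type: hyperdisk-balanced
# Assumption: delay provisioning until a Pod is scheduled.
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.gke.io/zone
    values:
    - us-central1-a   # assumed: the only zone containing N4 nodes, with no N2 nodes
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: disk
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-standard
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.gke.io/zone
    values:
    - us-central1-b   # assumed: zones containing only non-N4 nodes
    - us-central1-c
```

A PVC would then pick its disk type, and indirectly its zone, through `storageClassName: hyperdisk` or `storageClassName: disk`. This only works as long as the zone-to-machine-family split is maintained by hand, which is the cumbersome part.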

It feels like there should be two separate CSI drivers, with separate topologies and attach limits.

mattcary commented 2 weeks ago

We don't have a great solution for this. We're working on some ideas. The attach limit is a problem for sure. Using separate CSI drivers would fix it, but it starts getting silly in terms of node resource consumption, especially given that we need to reserve space for mount-time operations like fsck and mkfs that can consume a lot of memory for large volumes.

msau42 commented 2 weeks ago

The problem is worse. Each hyperdisk type has different supported machine types and volume limits. So you would essentially need one CSI driver per disk type.

One idea we did discuss in the past was to have the ability for a CSI driver to be registered with multiple names. It would require all the sidecars to be able to handle processing requests from multiple CSI drivers. It would also require the user to explicitly use a different driver name in the StorageClass, which could also complicate things if we wanted to support transparently changing disk types.