kubernetes-sigs / vsphere-csi-driver

vSphere storage Container Storage Interface (CSI) plugin
https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/index.html
Apache License 2.0

Cross NS quotas #2828

Closed omniproc closed 3 weeks ago

omniproc commented 5 months ago

/kind feature

What happened:

When a ResourceQuota / LimitRange is created, it is enforced per namespace.

---
apiVersion: v1
kind: LimitRange              # Limit PVC min/max
metadata:
  name: storagelimits
spec:
  limits:
  - type: PersistentVolumeClaim
    max:
      storage: 50Gi
    min:
      storage: 1Gi
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storagequota
spec:
  hard:
    persistentvolumeclaims: "50"  # We want to stay way below the current max for vSphere CSI (which is 59 per node VM)
    requests.storage: "500Gi"     # Limit the total storage requested by all PVCs in this namespace

Both are namespaced resources, and they apply to every StorageClass used in the given namespace.

What you expected to happen: We'd like to be able to enforce limits per StorageClass to make sure the backing vSphere datastores are not over-provisioned. Or, to put it differently: we'd like a way to make sure the effect on the datastore is the same as if the ResourceQuota were calculated across namespaces.
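For completeness: ResourceQuota does already support StorageClass-scoped keys, but those are still evaluated per namespace and therefore don't cap the aggregate usage of the backing datastore across namespaces. A minimal sketch, assuming the rwo-dynamic-block-volume class from the reproduction steps below and a placeholder namespace ns-a:

---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: per-class-storagequota
  namespace: ns-a   # placeholder; the quota is still only enforced inside this one namespace
spec:
  hard:
    # StorageClass-scoped quota keys exist, but they are namespaced as well
    rwo-dynamic-block-volume.storageclass.storage.k8s.io/requests.storage: "500Gi"
    rwo-dynamic-block-volume.storageclass.storage.k8s.io/persistentvolumeclaims: "50"

Two namespaces each carrying such a quota can still, together, request 1000Gi from the same datastore.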

How to reproduce it (as minimally and precisely as possible):

  1. Create a new StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rwo-dynamic-block-volume
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "foo"
reclaimPolicy: Delete
allowVolumeExpansion: false
volumeBindingMode: WaitForFirstConsumer
  2. Apply the ResourceQuota / LimitRange policies shown above in both NS A and NS B
  3. Create a 499Gi PVC in NS A (see the PVC sketch after this list)
  4. Create a 499Gi PVC in NS B
  5. The vSphere datastore now has 998Gi of storage allocated, even though each namespace stays within its own quota. We currently have no way to ensure the total never grows beyond 500Gi.
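A minimal sketch of the claim used in steps 3 and 4, assuming placeholder namespaces ns-a / ns-b (note that the 50Gi LimitRange max shown above would reject a 499Gi claim, so for this reproduction the LimitRange either needs a higher max or has to be left out):

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: big-claim
  namespace: ns-a        # repeat with namespace: ns-b for the second claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rwo-dynamic-block-volume
  resources:
    requests:
      storage: 499Gi     # fits under each namespace's 500Gi requests.storage quota

Each claim passes its own namespace's quota check, yet once both volumes are provisioned the shared datastore carries 998Gi of allocations.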

Anything else we need to know?:

This is probably a feature vSphere admins are very interested in, as it enables control of datastore usage, which is currently not possible. Since this is a vSphere CSI specific request, I guess it would make most sense to implement it as a parameter of the csi.vsphere.vmware.com StorageClass object (see the sketch below). The admin configuring the StorageClass would be responsible for ensuring that the sum of this parameter across all StorageClasses pointing to the same vSphere datastore does not exceed the datastore's capacity. I do not believe there should be more logic than this, since any additional check you might implement would break as soon as the same backing datastore is used by another K8s cluster. The scope of this really is:
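To illustrate the proposal, a hypothetical sketch of such a StorageClass parameter. The datastoreCapacityLimit name below is made up for illustration only; it is not implemented by the csi.vsphere.vmware.com driver today:

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rwo-dynamic-block-volume
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "foo"
  # Hypothetical parameter (not implemented): cap on the total capacity this
  # StorageClass may provision on its backing datastore, across all namespaces.
  datastoreCapacityLimit: "500Gi"
reclaimPolicy: Delete
allowVolumeExpansion: false
volumeBindingMode: WaitForFirstConsumer

With something like this, the driver would reject new CreateVolume requests once the capacity already provisioned through the class reaches the configured limit, regardless of which namespace the PVC lives in.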

Environment:

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 3 weeks ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 3 weeks ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/2828#issuecomment-2293386212):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.