projectcapsule / capsule

Multi-tenancy and policy-based framework for Kubernetes.
https://capsule.clastix.io
Apache License 2.0
1.56k stars 153 forks source link

PVC Creation fails for default storage classes #979

Open abhinandanbaheti opened 7 months ago

abhinandanbaheti commented 7 months ago

Bug description

PVC Creation fails for default storage classes

How to reproduce

Steps to reproduce the behavior:

  1. Create a Capsule Tenant Object, with the storage options in the spec. So that tenant can only list storage classes which has label "capsule.clastix.io/tenant-usable" ` storageOptions: matchExpressions:
    • key: capsule.clastix.io/tenant-usable operator: Exists `
  2. Add the label "capsule.clastix.io/tenant-usable" to all storage classes in the cluster, including the default storage class
  3. Create a StatefulSet and define volumeClaimtemplates. But do not put any storage class name in the spec. Let the default storage class be automatically injected to PVC in annotation (volume.beta.kubernetes.io/storage-class) by kubernetes.
  4. Describe the stateful set. PVC creation fails with the error Warning FailedCreate 2m29s (x177 over 6h57m) statefulset-controller create Pod test-20 in StatefulSet test failed error: failed to create PVC file-test-20: admission webhook "pvc.capsule.clastix.io" denied the request: A valid Storage Class must be used: matching the label selector defined in the Tenant

Expected behavior

PVC creation should be successful, because if we don't specify storage class name in volumeClaimtemplates, kubernetes picks up the default storageclass and set it in annotation (volume.beta.kubernetes.io/storage-class) in PVC, but since capsule checks that pvc must have the storageClassName in the spec it fails. We should also add a check for the annotation (volume.beta.kubernetes.io/storage-class) with the valid storage class name, and if its present then allow the request to create pvc. Sample code https://github.com/projectcapsule/capsule/blob/main/pkg/webhook/pvc/validating.go#L48-L56

oliverbaehler commented 7 months ago

@abhinandanbaheti Thanks for reporting this. I will try to reproduce and deliver a fix.

oliverbaehler commented 7 months ago

Which Kubernetes Version are you running? I can't replicate it with > 1.25. According to the docs:

In the past, the annotation volume.beta.kubernetes.io/storage-class was used instead of the storageClassName attribute. This annotation is still working; however, it will become fully deprecated in a future Kubernetes release.

In my tests, the storageClass attribute was set directly with the cluster default storageClass

logikone commented 7 months ago

Hey @oliverbaehler thanks for the response. We're running 1.24 currently

oliverbaehler commented 7 months ago

@logikone A specific distribution or something like that? I am unable to confirm that issue with KinD 1.24.0

logikone commented 7 months ago

Hey, it took me a while but i've finally been able to repo this issue. for some of our pvcs we have the volume.beta.kubernetes.io/storage-class annotation set, which is the old way of specifying the storage class. In this instance since the storage class is defined spec.storageClassName doesn't get populated with the default and then we get this error in response: https://github.com/projectcapsule/capsule/blob/main/pkg/webhook/pvc/validating.go#L53

I think we need to look in both places for the storage class until support for the annotation is fully removed, which I suspect might be a bit as per this comment: https://github.com/kubernetes/kubernetes/pull/51440#issuecomment-396640304 which is already over 5 years old 😢

logikone commented 7 months ago

i'm not able to repro this in testing using the e2e test suite in this repo. still digging to try and figure out whats going on

oliverbaehler commented 7 months ago

Are you using a specific distro or do you have any custom FeatureGates enabled?

logikone commented 7 months ago

its just the standard k8s dist, installed by kops. for whatever reason when the annotation is present on a pvc the request gets denied by the pvc.capsule.clastix.io webhook, but when its absent it works. but then on other clusters is works even with the annotation so i'm not yet convinced that capsule is causing the issue. still trying to figure out what the difference is between the clusters. in both cases though there is no default storage class set on the tenant so the mutating webhook shouldn't be changing anything afaict

prometherion commented 5 months ago

I think we need to look in both places for the storage class until support for the annotation is fully removed, which I suspect might be a bit as per this comment: kubernetes/kubernetes#51440 (comment) which is already over 5 years old

@oliverbaehler we're doing something similar to Ingress objects, where the annotation is still used for previous Kubernetes versions. don't you think we should add this also for PVCs?