csi-addons / kubernetes-csi-addons

CSI-Addons implementation and APIs for Kubernetes
Apache License 2.0
17 stars 33 forks source link

reclaimspace: support for storageclass annotation #585

Closed iPraveenParihar closed 2 weeks ago

iPraveenParihar commented 1 month ago

This commit adds support for auto ReclaimSpace by StorageClass annotations. For newly created PVCs ReclaimSpaceCronjob will be created for PVCs associated with a StorageClass that has the ReclaimSpace annotation.

iPraveenParihar commented 1 month ago

Initial testing

StorageClass annotating and PVC creation.

$ k annotate sc rbd-reclaim "reclaimspace.csiaddons.openshift.io/schedule=@daily"
storageclass.storage.k8s.io/rbd-reclaim annotated
$ k create -f pvc.yaml
persistentvolumeclaim/reclaim-pvc created
$ k get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
reclaim-pvc   Bound    pvc-82312fbe-0af8-460e-a34f-4c2af7da733b   1Gi        RWO            rbd-reclaim    1s

logs from csi-addons-controller-manager

2024-06-06T10:23:34.405Z    INFO    PVC is not in bound state   {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "97bd1dc8-afe6-42ef-b289-37a2842c35f0", "PVCPhase": "Pending"}
2024-06-06T10:23:34.411Z    INFO    PVC is not in bound state   {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "438492af-eaea-4fa1-be9f-92675b52cd5c", "PVCPhase": "Pending"}
2024-06-06T10:23:34.422Z    INFO    PVC is not in bound state   {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "3ca0ab9e-0ea6-4468-84f0-cba3e945c426", "PVCPhase": "Pending"}
2024-06-06T10:23:34.443Z    INFO    PVC is not in bound state   {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "1fa95398-e1e4-40a3-998a-b5d9f3ae2413", "PVCPhase": "Pending"}
2024-06-06T10:23:34.484Z    INFO    PVC is not in bound state   {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "4a80f9d8-3cc0-497e-9e4c-23a44befdf74", "PVCPhase": "Pending"}
2024-06-06T10:23:34.566Z    INFO    Adding annotation   {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "4d432688-e112-4043-8c93-ef400308bff5", "Schedule": "@daily", "ReclaimSpaceCronJobName": "reclaim-pvc-1717669414", "Annotation": "{\"metadata\":{\"annotations\":{\"reclaimspace.csiaddons.openshift.io/cronjob\":\"reclaim-pvc-1717669414\",\"reclaimspace.csiaddons.openshift.io/schedule\":\"@daily\"}}}"}
2024-06-06T10:23:34.586Z    INFO    KubeAPIWarningLogger    unknown field "spec.jobTemplate.metadata.creationTimestamp"
2024-06-06T10:23:34.586Z    INFO    Successfully created reclaimSpaceCronJob    {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "4d432688-e112-4043-8c93-ef400308bff5", "Schedule": "@daily", "ReclaimSpaceCronJobName": "reclaim-pvc-1717669414"}
2024-06-06T10:23:34.595Z    INFO    No upcoming scheduled times, requeue with delay till next run   {"controller": "reclaimspacecronjob", "controllerGroup": "csiaddons.openshift.io", "controllerKind": "ReclaimSpaceCronJob", "ReclaimSpaceCronJob": {"name":"reclaim-pvc-1717669414","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc-1717669414", "reconcileID": "549d3127-b482-4df7-9c8e-99de192cf580", "now": "2024-06-06T10:23:34.595Z", "nextRun": "2024-06-07T00:00:00.000Z"}
2024-06-06T10:23:34.596Z    ERROR   Failed to update reclaimSpaceCronJob    {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "e9e2be5a-d82c-401b-ba7b-88c1353d7473", "ReclaimSpaceCronJobName": "reclaim-pvc-1717669414", "Schedule": "@daily", "error": "Operation cannot be fulfilled on reclaimspacecronjobs.csiaddons.openshift.io \"reclaim-pvc-1717669414\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/csi-addons/kubernetes-csi-addons/controllers/csiaddons.(*PersistentVolumeClaimReconciler).Reconcile
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/controllers/csiaddons/persistentvolumeclaim_controller.go:170
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222
2024-06-06T10:23:34.596Z    ERROR   Reconciler error    {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "e9e2be5a-d82c-401b-ba7b-88c1353d7473", "error": "Operation cannot be fulfilled on reclaimspacecronjobs.csiaddons.openshift.io \"reclaim-pvc-1717669414\": the object has been modified; please apply your changes to the latest version and try again"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/workspace/go/src/github.com/csi-addons/kubernetes-csi-addons/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222
2024-06-06T10:23:34.601Z    INFO    No upcoming scheduled times, requeue with delay till next run   {"controller": "reclaimspacecronjob", "controllerGroup": "csiaddons.openshift.io", "controllerKind": "ReclaimSpaceCronJob", "ReclaimSpaceCronJob": {"name":"reclaim-pvc-1717669414","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc-1717669414", "reconcileID": "c1fcce40-9b27-48c5-9968-890f8f3f4553", "now": "2024-06-06T10:23:34.601Z", "nextRun": "2024-06-07T00:00:00.000Z"}
2024-06-06T10:23:34.607Z    INFO    Successfully updated reclaimSpaceCronJob    {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "PersistentVolumeClaim": {"name":"reclaim-pvc","namespace":"reclaim-ns"}, "namespace": "reclaim-ns", "name": "reclaim-pvc", "reconcileID": "9eb81097-2ba2-4d10-8804-0b96ed41b9a9", "ReclaimSpaceCronJobName": "reclaim-pvc-1717669414", "Schedule": "@daily"}

PVC

$ k get pvc reclaim-pvc -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
pv.kubernetes.io/bind-completed: "yes"
pv.kubernetes.io/bound-by-controller: "yes"
reclaimspace.csiaddons.openshift.io/cronjob: reclaim-pvc-1717669414
reclaimspace.csiaddons.openshift.io/schedule: '@daily'
volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
volume.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
creationTimestamp: "2024-06-06T10:23:34Z"
finalizers:
- kubernetes.io/pvc-protection
name: reclaim-pvc
namespace: reclaim-ns
resourceVersion: "4813090"
uid: 82312fbe-0af8-460e-a34f-4c2af7da733b
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: rbd-reclaim
volumeMode: Filesystem
volumeName: pvc-82312fbe-0af8-460e-a34f-4c2af7da733b
status:
accessModes:
- ReadWriteOnce
capacity:
storage: 1Gi
phase: Bound

ReclaimSpaceCronJob

 $ k get reclaimspacecronjobs.csiaddons.openshift.io
NAME                     SCHEDULE   SUSPEND   ACTIVE   LASTSCHEDULE   AGE
reclaim-pvc-1717669414   @daily                                       7m22s
nixpanic commented 3 weeks ago

I think this looks good.

One thing I would like to know, is why the decision was made to have the annotation on the StorageClass a higher priority than the annotation on the Namespace.

When I think about this, the highest priority should be at the last object in a chain, and Namespace would come just before a PersistentVolumeClaim, after the global StorageClass. I can imagine that some apps (grouped by Namespace) require a different than the default schedule. By giving the StorageClass a higher priority, it is not so easy to configure an alternative schedule for just a single application (that may have multiple PersistentVolumeClaims).

iPraveenParihar commented 3 weeks ago

I think this looks good.

One thing I would like to know, is why the decision was made to have the annotation on the StorageClass a higher priority than the annotation on the Namespace.

When I think about this, the highest priority should be at the last object in a chain, and Namespace would come just before a PersistentVolumeClaim, after the global StorageClass. I can imagine that some apps (grouped by Namespace) require a different than the default schedule. By giving the StorageClass a higher priority, it is not so easy to configure an alternative schedule for just a single application (that may have multiple PersistentVolumeClaims).

By defining annotations at the StorageClass level, consistency is maintained across all PVCs associated with this StorageClass. While Namespace annotations will be good for workloads grouped by namespace as you mentioned. But again, this will apply to all PVCs within the namespace which I guess is doesn't provide fine-grained control over PVCs defined by SC?

nixpanic commented 3 weeks ago

By defining annotations at the StorageClass level, consistency is maintained across all PVCs associated with this StorageClass. While Namespace annotations will be good for workloads grouped by namespace as you mentioned. But again, this will apply to all PVCs within the namespace which I guess is doesn't provide fine-grained control over PVCs defined by SC?

The question is what users expect. A default schedule that is inherited by the StorageClass is more generic than an annotation on the Namespace, and an annotation on a PVC is less generic yet again. Some applications may require a different schedule/interval than the default, and being able to set that per Namespace seems useful to me.

To phrase it the other way around, setting an annotation on the Namespace is useless when it is already set on the StorageClass. Users could still modify it by setting an annotation per PVC, but that is less practical than setting it on a Namespace.

iPraveenParihar commented 3 weeks ago

The question is what users expect. A default schedule that is inherited by the StorageClass is more generic than an annotation on the Namespace, and an annotation on a PVC is less generic yet again. Some applications may require a different schedule/interval than the default, and being able to set that per Namespace seems useful to me.

To phrase it the other way around, setting an annotation on the Namespace is useless when it is already set on the StorageClass. Users could still modify it by setting an annotation per PVC, but that is less practical than setting it on a Namespace.

Thanks, I understood. If StorageClass has priority over Namespace annotation and if schedule is already set on StorageClass, then setting it again on the Namespace might seem redundant. And having StorageClass annotation the lowest priority users can set a default schedule at the StorageClass level and override at Namespace / PVC level.

@Madhu-1, if you agree the same then I'll update the PR accordingly.

Madhu-1 commented 3 weeks ago

IMO SC -> NS -> PVC takes the precedence order, where SC is the highest one where user sets it and PVC is the lowest one where user wants to override it. @nixpanic @iPraveenParihar WDYT? I think we can have more keys which can add on SC/Namespace level which will allow for overriding by next one. but again both directions will have its flaws.

nixpanic commented 3 weeks ago

I would see it like this:

  1. use the schedule on the PVC, if missing, then
  2. use the schedule from the Namespace, if missing, then
  3. use the schedule from the StorageClass, if missing, then
  4. don't create a ReclaimSpaceCronJob
Madhu-1 commented 3 weeks ago

I would see it like this:

  1. use the schedule on the PVC, if missing, then
  2. use the schedule from the Namespace, if missing, then
  3. use the schedule from the StorageClass, if missing, then
  4. don't create a ReclaimSpaceCronJob

@iPraveenParihar please go ahead with this one, @Rakshith-R any comments from you one this one?

Rakshith-R commented 3 weeks ago

I would see it like this:

  1. use the schedule on the PVC, if missing, then
  2. use the schedule from the Namespace, if missing, then
  3. use the schedule from the StorageClass, if missing, then
  4. don't create a ReclaimSpaceCronJob

@iPraveenParihar please go ahead with this one, @Rakshith-R any comments from you one this one?

Yes, this sounds good to me too.

Madhu-1 commented 2 weeks ago

@nixpanic PTAL, changes LGTM, if you are fine with the changes i will remove requested changes