Local-pvc-releaser is a Kubernetes controller that streamlines the handling of Persistent Volume Claims (PVCs) when the cloud provider unexpectedly terminates a node. In such cases, Local-pvc-releaser deletes the relevant PVCs, provided they are bound to a Persistent Volume (PV) that represents local storage on the faulty node.
The Local-pvc-releaser controller automates the recovery of pods whose PVCs are bound to a PV that represents a local storage drive on a faulty node.
Previously, recovering such pods required manual action: they moved to the "Pending" state and waited for the faulty node to recover, which never happens once the node has been terminated.
Local-pvc-releaser acts by deleting those PVCs so that the pods get new ones instead. For the common autoscalers, a new PVC signals demand for a new node (when the cluster has no available resources). Once the relevant resources are allocated, the Kubernetes scheduler schedules the pod and completes the recovery process.
Note:
The Local PVC Releaser relies on the well-known Kubernetes annotation volume.kubernetes.io/selected-node to link Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) with a terminated node.
Consequently, PVs created by static storage provisioners, such as the local-static-provisioner, will not be managed: the binding between PV and PVC is not performed by the Kubernetes control plane, so this well-known annotation is not attached.
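The matching rule described in this note can be illustrated with a minimal Go sketch. It is not the controller's actual code: the `Claim` type and the `claimsOnNode` function are hypothetical, simplified stand-ins for the real Kubernetes objects, and the only name taken from this document is the `volume.kubernetes.io/selected-node` key, which upstream Kubernetes stores in the PVC's annotations.

```go
package main

import "fmt"

// selectedNodeKey is the well-known key mentioned in the note above.
const selectedNodeKey = "volume.kubernetes.io/selected-node"

// Claim is a hypothetical, simplified stand-in for a PVC object.
type Claim struct {
	Name        string
	Annotations map[string]string
}

// claimsOnNode returns the claims whose selected-node value names the
// given node. Claims without the key (for example, ones bound to
// statically provisioned PVs) never match.
func claimsOnNode(claims []Claim, node string) []Claim {
	var matched []Claim
	for _, c := range claims {
		// Reading from a nil map is safe in Go; ok is simply false.
		if v, ok := c.Annotations[selectedNodeKey]; ok && v == node {
			matched = append(matched, c)
		}
	}
	return matched
}

func main() {
	claims := []Claim{
		{Name: "data-0", Annotations: map[string]string{selectedNodeKey: "node-a"}},
		{Name: "data-1", Annotations: map[string]string{selectedNodeKey: "node-b"}},
		{Name: "static-0"}, // no selected-node key, so it is ignored
	}
	for _, c := range claimsOnNode(claims, "node-a") {
		fmt.Println(c.Name) // prints data-0
	}
}
```

In this sketch, `static-0` is skipped for the same reason the note gives: without the key there is nothing that ties the claim to the terminated node.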
The Local-pvc-releaser controller listens to events emitted by the Kubernetes Node Controller, which runs as part of the cluster control plane.
The Kubernetes Node Controller generates a "RemovingNode" event whenever a node object is removed. This usually happens when you scale down your cluster or when one of the master/worker nodes is terminated unexpectedly.
Local-pvc-releaser watches those events and reconciles the state of the PVCs that are bound to PV objects backed by local storage on the faulty node.
Once the relevant PVCs are reconciled (deleted), the pod can get a new PVC object and recover, as long as available or new resources exist for it to be scheduled on.
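The flow above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the controller's implementation: `NodeEvent`, `BoundClaim`, and `claimsToRelease` are hypothetical names, and the real controller works with Kubernetes API objects rather than these simplified structs. Only the "RemovingNode" reason comes from this document.

```go
package main

import "fmt"

// NodeEvent is a hypothetical, simplified view of a Kubernetes event.
type NodeEvent struct {
	Reason   string // for example "RemovingNode"
	NodeName string // the node the event refers to
}

// BoundClaim is a hypothetical, simplified PVC together with the facts
// the decision needs about the PV it is bound to.
type BoundClaim struct {
	Name      string
	Bound     bool   // whether the claim is bound to a PV
	LocalNode string // node hosting the PV's local storage; empty if the PV is not local
}

// claimsToRelease returns the names of the claims that should be deleted
// in response to the event. Events with any other reason are ignored.
func claimsToRelease(ev NodeEvent, claims []BoundClaim) []string {
	if ev.Reason != "RemovingNode" || ev.NodeName == "" {
		return nil
	}
	var names []string
	for _, c := range claims {
		if c.Bound && c.LocalNode == ev.NodeName {
			names = append(names, c.Name)
		}
	}
	return names
}

func main() {
	claims := []BoundClaim{
		{Name: "data-0", Bound: true, LocalNode: "node-a"},
		{Name: "data-1", Bound: true, LocalNode: "node-b"},
		{Name: "cache-0", Bound: true}, // bound to a non-local PV
	}
	ev := NodeEvent{Reason: "RemovingNode", NodeName: "node-a"}
	fmt.Println(claimsToRelease(ev, claims)) // prints [data-0]
}
```

Only claims bound to local storage on the removed node are selected. Everything else is left untouched, so unrelated workloads are not affected by a node removal.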
To deploy this controller, you’ll need a Kubernetes cluster to run against. You can use KIND to get a local cluster for testing, or run against a remote cluster (using the current context in kubeconfig).
Deploy the controller with Helm:
$ helm repo add local-pvc-releaser https://AppsFlyer.github.io/local-pvc-releaser
$ helm install -n <namespace> <release-name> local-pvc-releaser/local-pvc-releaser
For more information, please refer here.
To uninstall/delete the local-pvc-releaser deployment:
$ helm uninstall -n <namespace> <release-name>
The Local-pvc-releaser controller publishes the base metrics provided by KubeBuilder, plus a custom metric that counts successful PVC deletions. The metrics are exposed through a Prometheus exporter. For more information, please refer here.
deleted_pvc
Labels: namespace, controller_name, dryrun
Description: The number of PVC objects successfully deleted by the controller
We appreciate and welcome any initiative for improvement. Before raising a PR, kindly make sure that your code passes all the required CI stages.
Deploy the controller to the cluster:
make deploy
Or, optionally, deploy the controller with a different image tag:
make deploy IMG=<some-registry>/local-pvc-releaser:tag
Undeploy the controller from the cluster:
make undeploy