loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0

PVs using storage class do not inherit reclaimPolicy #2098

Closed mimartin12 closed 2 months ago

mimartin12 commented 2 months ago

What happened?

I deployed a storage class with `reclaimPolicy: Retain`, but the PVs created inside the vcluster using this storage class have their reclaim policy set to `Delete`.

What did you expect to happen?

PVs created inside the vcluster inherit the host's storage class reclaimPolicy.

How can we reproduce it (as minimally and precisely as possible)?

  1. Deploy a custom storage class with `reclaimPolicy: Retain` on the host (a sketch of such a manifest is shown after this list).
  2. Create a PV in the vcluster using the new storage class.
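
For illustration, a minimal host-side StorageClass with `reclaimPolicy: Retain` might look like the sketch below. The class name and the Azure Files CSI provisioner are assumptions based on the `azurefile-ee` class that appears in the output later in this thread.

```yaml
# Hypothetical host StorageClass; the name and provisioner are assumptions.
# The field relevant to this issue is reclaimPolicy.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-ee
provisioner: file.csi.azure.com
reclaimPolicy: Retain        # expected to carry over to PVs used in the vcluster
volumeBindingMode: Immediate
allowVolumeExpansion: true
```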

Anything else we need to know?

vcluster storage classes:

(screenshot)

vcluster PVs:

(screenshot)

Host cluster Kubernetes version

```console
$ kubectl version
Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.7
```

vcluster version

```console
$ vcluster --version
vcluster version 0.20.0
```

VCluster Config

```yaml
controlPlane:
  distro:
    k3s:
      enabled: true
sync:
  fromHost:
    storageClasses:
      enabled: true
```
mimartin12 commented 2 months ago

Looking at the PVs in the vcluster namespace on the host, I can see that it's creating new PVs instead of reusing them. The first one that was created:

```console
pvc-c9cfe637-bd7e-4c22-bcfc-b13577fb1b84   1Gi        RWX            Retain           Released   vcluster/mssql-backups  azurefile-ee   <unset>                          31m
```

Then it was released, and a new one was created:

```console
pvc-267684a2-070c-4e10-98f8-f2bc50b2be29   1Gi        RWX            Retain           Bound      vcluster/mssql-backups   azurefile-ee   <unset>                          3m52s
```
mimartin12 commented 2 months ago

This seems nearly identical to https://github.com/loft-sh/vcluster/issues/600.

rmweir commented 2 months ago

Hello @mimartin12, thank you for reaching out!

Since the vCluster cannot create volumes and cannot access the PersistentVolumes (PVs) on the real cluster, it creates fake persistent volumes that are not exact replicas of the originals but placeholders. This way the PersistentVolumeClaim (PVC) in the vCluster has a PV to be bound to. Here is a look at a PV in a real vCluster of mine. You will see there are labels indicating that this is a fake, placeholder PV. The driver is also set to "fake".

```yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      kubernetes.io/createdby: fake-pv-provisioner
      pv.kubernetes.io/bound-by-controller: "true"
      pv.kubernetes.io/provisioned-by: fake-pv-provisioner
    creationTimestamp: "2024-09-13T22:28:42Z"
    finalizers:
    - kubernetes.io/pv-protection
    labels:
      vcluster.loft.sh/fake-pv: "true"
    name: pvc-0e3fdc86-e3c7-4722-b949-dabe2220cb6e
    resourceVersion: "526"
    uid: 8cd9ad6e-d9d1-4a95-82d2-dc692b3808ba
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 500Mi
    claimRef:
      apiVersion: v1
      kind: PersistentVolumeClaim
      name: pvc-test
      namespace: default
      resourceVersion: "522"
      uid: b6a61265-25c4-494a-ace6-4b1154239935
    flexVolume:
      driver: fake
    persistentVolumeReclaimPolicy: Delete
    storageClassName: standard
    volumeMode: Filesystem
  status:
    lastPhaseTransitionTime: "2024-09-13T22:28:42Z"
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
```

You can have the PVs be exact replicas by enabling PV syncing. We recommend this for certain CSI drivers and for use with tools like Velero. You can read about our PV syncing and the pseudo-syncing mechanism in the syncing persistent volumes documentation. Here is the vCluster code driving this mechanism. If you have any input on how we can clarify the documentation, we would love to hear that feedback.
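
For example, building on the config shared above, enabling real PV syncing should look roughly like the following sketch (this assumes the `sync.toHost.persistentVolumes.enabled` key from the v0.20 config schema; please double-check the exact key against the documentation linked above):

```yaml
# Sketch of vcluster.yaml with PV syncing to the host enabled.
# The sync.toHost.persistentVolumes key is assumed from the v0.20 schema.
controlPlane:
  distro:
    k3s:
      enabled: true
sync:
  fromHost:
    storageClasses:
      enabled: true
  toHost:
    persistentVolumes:
      enabled: true
```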

Now, onto your second issue. Notice that the first PV is in a Released state but still has a claim attached. In Kubernetes, PVs with a Retain reclaim policy require admin intervention before they can be moved back into the Available state. The intention is to give an admin an opportunity to do something with the data in the PV before anything else takes place. The section "Why change reclaim policy of a PersistentVolume" of the Change the Reclaim Policy of a PersistentVolume docs describes the use case as follows:

With the "Retain" policy, if a user deletes a PersistentVolumeClaim, the corresponding PersistentVolume will not be deleted. Instead, it is moved to the Released phase, where all of its data can be manually recovered. The above, along with a work around is described in this Kubernetes issue. You can delete the claimRef field from the PV and it will move to the Available state. At this point it can be bound again.

The issue you mentioned involving vCluster's interaction with the Retain reclaim policy sounds similar, but it involved a bug: PersistentVolumes with a Retain reclaim policy were simply being deleted and not retained at all. That was in a much earlier iteration of vCluster and has since been resolved.