vmware-tanzu / velero-plugin-for-vsphere

Plugin to support Velero on vSphere

Velero restores volume as thick-provisioned, although originally it was thin-provisioned. #447

Open son-la opened 2 years ago

son-la commented 2 years ago

Describe the bug

Velero restores volume as thick-provisioned, although originally it was thin-provisioned.

Not sure if it is really a bug but I couldn't find any document explaining this.

To Reproduce

The backup scope is one 15GB thin-provisioned PVC with very little content in it (<100MB). The corresponding VMDK file for that PVC is a reasonable size.

When the snapshot is uploaded to S3, the uploaded object is the full size of the PVC (15GB). This is already explained here: https://github.com/vmware-tanzu/velero-plugin-for-vsphere/issues/296

Next I delete the PVC and do velero restore.

I notice a new thick-provisioned volume is attached to the virtual machine.

Expected behavior

The restored volume is thin-provisioned like it was originally for the PVC.

Troubleshooting Information

Velero 1.8
Velero vSphere plugin 1.3.1
Velero AWS plugin 1.3.0
vSphere 6.7u3
vSphere CSI 2.3.1
Kubernetes cluster flavor (Vanilla/ Rancher RKE) v1.20.11


Anything else you would like to add:

Slack discussion: https://kubernetes.slack.com/archives/C6VCGP4MT/p1645015634605989

xing-yang commented 2 years ago

At restore time, dynamic provisioning is used to create a new PVC. The destination datastore will be chosen based on the associated StorageClass (storage policy). Which StorageClass is used for the original PVC? Is it still there at restore time? If so, can you check if it is associated with the destination datastore?

son-la commented 2 years ago

Thank you for the answer.

In the storage class configuration, I actually left the storage policy name empty. I thought it wouldn't have any impact, since every new PVC created with that storage class has been thin-provisioned so far. Is this the root cause here? I haven't yet found out how to create a vSphere storage policy that forces volumes to be thin-provisioned.

A datastore URL is set in the storage class configuration, and the restored PVC uses the same storage class. The only thing I did was delete the PVC and then run velero restore.

Here is the original disk before deletion and backup: [image]

And here is the disk for the restored PVC: [image]

I notice that both storage policies are "Datastore Default" but their "Type" values are different. The restored one has become thick-provisioned.

lintongj commented 2 years ago

@son-la I believe it is an NFS datastore, right? If so, it is a known issue in VDDK. Check out this article: https://kb.vmware.com/s/article/2137818.

FYI, in the restore path, velero-plugin-for-vsphere dynamically provisions a PVC for the backup and then uses VDDK to overwrite its underlying volume with the backup data.

son-la commented 2 years ago

Thank you @lintongj, I checked the article. I'm actually running VMware ESXi, 6.7.0, 15160138 and the datastore is VMFS 5, so this is most likely not the case you referred to above.

lintongj commented 2 years ago

I am not aware of the same issue on VMFS datastores. BTW, velero-plugin-for-vsphere requires VC and ESX 6.7U3 or above (link). That might not be the direct cause of this issue, but it would be great to try it with a setup that meets the prerequisite.

son-la commented 2 years ago

The version in the vCenter UI is a bit misleading. According to this page, the build number should be newer than 6.7u3: https://kb.vmware.com/s/article/2143832 [image]

lintongj commented 2 years ago

Ok. That's fine. I am checking with the VDDK team about what you observed on VMFS 5. I will keep you updated when I get any feedback.

lintongj commented 2 years ago

@son-la I confirmed with the VDDK team internally that the issue you have is not the same as the known VDDK issue on NFS datastores.

However, it is expected under the current design of velero-plugin-for-vsphere. As I mentioned earlier, on the restore path the whole dynamically provisioned volume is overwritten with backup data (on the backup path, a full backup is always taken). That changes the disk provisioning type from thin to thick.

This is how it works right now, and the behavior will stay until we add incremental backup support. Thanks.
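
For readers following along, the explanation above can be sketched with a toy model. This is only an illustration, not the plugin's or VDDK's actual code: the `ThinDisk` class, the block size, and the block counts are all hypothetical. It models a thin volume as one that allocates a block only when the block is first written, takes a full backup that includes every block (which is why the uploaded object is as large as the PVC), and then restores by writing every block back.

```python
BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


class ThinDisk:
    """Toy thin-provisioned volume: a block is allocated on its first write."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = {}  # block index -> data; only written blocks are stored

    def write_block(self, index, data):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        self.blocks[index] = data

    def read_block(self, index):
        # Blocks that were never written read back as zeros.
        return self.blocks.get(index, bytes(BLOCK_SIZE))

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK_SIZE


def full_backup(disk):
    """Full backup: copy every block, including never-written (zero) ones."""
    return [disk.read_block(i) for i in range(disk.num_blocks)]


def full_restore(disk, image):
    """Restore by overwriting the whole volume with the backup image."""
    for i, block in enumerate(image):
        disk.write_block(i, block)


# Original volume: 1000 blocks provisioned, only 10 ever written.
original = ThinDisk(1000)
for i in range(10):
    original.write_block(i, b"\x01" * BLOCK_SIZE)

image = full_backup(original)   # backup spans the full provisioned size
restored = ThinDisk(1000)       # freshly provisioned volume, still thin
full_restore(restored, image)   # every block is written, zeros included

print("original allocated:", original.allocated_bytes())
print("backup size:       ", sum(len(b) for b in image))
print("restored allocated:", restored.allocated_bytes())
```

In this model the restored volume ends up fully allocated even though its contents match the original, which mirrors the thin-to-thick change described in the thread.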