vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0
8.63k stars 1.39k forks

Document limitations: NFS volume with root_squash will require setting supplementalGroups on the nodeAgent #8107

Open kaovilai opened 1 month ago

kaovilai commented 1 month ago

What steps did you take and what happened:

Document limitations: NFS volume with root_squash will require setting supplementalGroups on the nodeAgent.

related:

To get rid of this limitation, in the future we can consider creating a node agent with user/groups per workload.
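For illustration, a minimal sketch of what "setting supplementalGroups on the nodeAgent" could look like as a patch body. `supplementalGroups` is a standard field of the Kubernetes pod-level `securityContext`; the helper name `supplemental_groups_patch` and the GIDs used with it are hypothetical, and the right GIDs depend on which groups own the files on the NFS export.

```python
import json


def supplemental_groups_patch(gids):
    """Build a patch body that sets pod-level supplementalGroups on a DaemonSet.

    Hypothetical helper for illustration; it is not part of Velero.
    """
    # De-duplicate and sort so the generated patch is stable.
    groups = sorted({int(g) for g in gids})
    return {
        "spec": {
            "template": {
                "spec": {
                    "securityContext": {"supplementalGroups": groups}
                }
            }
        }
    }


def render_patch(gids):
    """Serialize the patch body as JSON text."""
    return json.dumps(supplemental_groups_patch(gids), indent=2)
```

The output of `render_patch([1234, 5678])` (placeholder GIDs) could be saved to a file and applied with something like `kubectl -n velero patch daemonset node-agent --patch-file patch.json`, assuming the node agent runs as a DaemonSet named `node-agent` in the `velero` namespace.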

There are also issues with the .snapshot directory used by some NFS servers.

The ".../.snapshot" directory is a snapshot copy directory used by several NFS servers. It is read-only, so Velero is not able to restore to this path, nor should we give Velero write access to this directory.

You should disable client access to this snapshot copy directory.

NetApp ONTAP: deselect "Show the Snapshot copies directory to clients" or "Allow clients to access Snapshot copies directory". https://docs.netapp.com/us-en/ontap/enable-snapshot-dir-access-task.html

Portworx Flashblade: uncheck the Snapshot option. https://docs.portworx.com/portworx-backup-on-prem/reference/restore-with-fb


Lyndon-Li commented 1 month ago

The issue in #5578 is not about root_squash. Rather, the problem is that some folders in the volume should not be restored. The .snapshot folder contains the COW blocks of the snapshots. Copying that data to another volume is useless (so backing it up is useless too) and may even cause chaos, which is why the storage keeps the data non-writable. Therefore, as mentioned in that issue, the solution is to skip the folder during backup.
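To make "skip the folder" concrete, here is a rough sketch of pruning .snapshot directories while walking a volume. This only illustrates the idea; it is not how Velero or its uploaders implement exclusions, and `files_to_back_up` and `SKIP_DIRS` are hypothetical names.

```python
import os

# Directory names to leave out of the walk (assumption: only ".snapshot").
SKIP_DIRS = {".snapshot"}


def files_to_back_up(root, skip_dirs=SKIP_DIRS):
    """Yield file paths under root, never descending into skipped directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to enter those directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Because the directory is pruned before the walk enters it, nothing under .snapshot is ever read, at any depth of the volume.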

Lyndon-Li commented 1 month ago

@kaovilai For the root_squash volumes, could you help run a full test with Velero upstream? I suspect there are more issues besides supplementalGroups. Velero also needs to restore each file's metadata, including the owning user, group, etc. If the NFS volume is mounted with root_squash, I am afraid the fs-uploader (i.e., kopia, restic) will not be able to change the user and group.

After the test, the problems and their causes will be clear, and then we can add the doc.
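As a quick way to check the suspected ownership problem by hand, the following sketch probes whether the current process can change a file's owner inside a mounted volume. With root_squash, root on the client is mapped to an unprivileged user on the server, so the `os.chown` call would be expected to fail with a permission error. `can_set_ownership` is a hypothetical helper, not something Velero provides.

```python
import os
import tempfile


def can_set_ownership(directory, uid, gid):
    """Return True if a scratch file in directory can be chowned to uid:gid.

    If the scratch file cannot even be created, that error propagates to
    the caller, since it points to a different problem (no write access).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.chown(path, uid, gid)
        return True
    except PermissionError:
        # EPERM/EACCES: this process may not change the owner here.
        return False
    finally:
        # Always clean up the scratch file.
        os.remove(path)
```

Running it from inside the node agent pod against the mounted volume path, with the UID and GID of a file from the backup, would show whether ownership can be restored there at all.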

kaovilai commented 1 month ago

So, check that a node agent with supplemental groups is able to restore all metadata?

kaovilai commented 1 month ago

Will see if I can add a test case to e2e.
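One possible shape for such a check, as a sketch only: compare owner, group, and permission bits between the original tree and the restored tree. `metadata_mismatches` is a hypothetical helper and is not taken from the Velero e2e suite; it also assumes both trees are reachable from the same machine.

```python
import os
import stat


def metadata_mismatches(original_root, restored_root):
    """Return relative paths whose uid, gid, or permission bits differ.

    Paths that are missing from the restored tree are reported as well.
    """
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(original_root):
        for name in dirnames + filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, original_root)
            dst = os.path.join(restored_root, rel)
            if not os.path.lexists(dst):
                mismatches.append(rel)
                continue
            # lstat so that symlinks are compared themselves, not their targets.
            a, b = os.lstat(src), os.lstat(dst)
            before = (a.st_uid, a.st_gid, stat.S_IMODE(a.st_mode))
            after = (b.st_uid, b.st_gid, stat.S_IMODE(b.st_mode))
            if before != after:
                mismatches.append(rel)
    return sorted(mismatches)
```

An empty result would mean the restore preserved ownership and mode for every path in the original tree.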