vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0
8.58k stars 1.39k forks

Request four new plugin hooks: Pre backup, Post Backup, Pre Restore and Post Restore #4067

Open brito-rafa opened 3 years ago

brito-rafa commented 3 years ago

Describe the problem/challenge you have

We need to perform safe stateful migrations at scale across different cluster distributions and different storage providers. Pod hooks are useful for quiescing and unquiescing workloads, but we have repeatedly observed that platform engineers do not have the luxury, visibility, time, or knowledge to go to each pod and add specific quiesce/unquiesce commands. Additionally, Velero's Restic integration does not currently back up or migrate orphan PVC/PV pairs (pairs without a pod).

Describe the solution you'd like

We propose to create four new plugin hooks: PreBackup, PostBackup, PreRestore, and PostRestore.

Once the hooks are implemented in the Velero core, we would write a custom Velero plugin that is triggered by the PreBackup hook. Before the backup starts and the objects are serialized, this plugin would quiesce all workloads by setting replicas=0 on deployments, statefulsets, etc., and then mount all PVC/PV pairs in a staging pod. Velero would then back up all objects: the deployments with replicas=0, the staging pod, and all PVC/PV pairs.
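The quiesce step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `PreBackupHook`, `Workload`, `StagingPod`, and the `example.io/original-replicas` annotation key are hypothetical names invented for this sketch, not Velero or Kubernetes APIs, and the real interface would come from the design proposal.

```go
package main

import (
	"fmt"
	"strconv"
)

// Workload is a simplified stand-in for a Deployment or StatefulSet.
// It is NOT a Kubernetes or Velero type.
type Workload struct {
	Name        string
	Replicas    int
	Annotations map[string]string
}

// StagingPod is a hypothetical pod that mounts every PVC so that
// orphan PVC/PV pairs are attached to some pod during the backup.
type StagingPod struct {
	Name string
	PVCs []string
}

// originalReplicasKey is an assumed annotation key used to remember the
// replica count so that a later PostRestore step could reinstate it.
const originalReplicasKey = "example.io/original-replicas"

// PreBackupHook is a hypothetical shape for the proposed hook.
type PreBackupHook interface {
	PreBackup(workloads []*Workload, pvcs []string) (StagingPod, error)
}

type quiescePlugin struct{}

// PreBackup scales every workload to zero, records the previous replica
// count in an annotation, and returns a staging pod mounting all PVCs.
func (quiescePlugin) PreBackup(workloads []*Workload, pvcs []string) (StagingPod, error) {
	for _, w := range workloads {
		if w.Annotations == nil {
			w.Annotations = map[string]string{}
		}
		// Do not overwrite a saved count: a second run would otherwise record 0.
		if _, saved := w.Annotations[originalReplicasKey]; !saved {
			w.Annotations[originalReplicasKey] = strconv.Itoa(w.Replicas)
		}
		w.Replicas = 0
	}
	// Copy the slice so the caller's PVC list is not aliased.
	return StagingPod{Name: "staging-pod", PVCs: append([]string(nil), pvcs...)}, nil
}

func main() {
	workloads := []*Workload{{Name: "web", Replicas: 3}, {Name: "db", Replicas: 1}}
	var hook PreBackupHook = quiescePlugin{}
	pod, err := hook.PreBackup(workloads, []string{"db-data", "orphan-pvc"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, w := range workloads {
		fmt.Printf("%s replicas=%d saved=%s\n", w.Name, w.Replicas, w.Annotations[originalReplicasKey])
	}
	fmt.Println("staging pod mounts:", pod.PVCs)
}
```

Storing the original count on the object itself means it travels inside the backup, so the destination cluster needs no side channel to learn how many replicas to reinstate.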

To complete the migration of the workload, a restore would run on the destination cluster with the quiesced workload. A custom Velero plugin would be triggered by the PostRestore hook and unquiesce the workload by deleting the staging pod and restoring the original replica counts on the deployments, statefulsets, etc.
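A minimal sketch of the unquiesce step, assuming the original replica count was saved in an annotation at backup time. `PostRestoreHook`, `Workload`, the annotation key, and the staging pod name are hypothetical names used only for illustration; none of them are Velero or Kubernetes APIs.

```go
package main

import (
	"fmt"
	"strconv"
)

// Workload is a simplified stand-in for a Deployment or StatefulSet.
type Workload struct {
	Name        string
	Replicas    int
	Annotations map[string]string
}

// Assumed names, not part of Velero.
const (
	originalReplicasKey = "example.io/original-replicas"
	stagingPodName      = "staging-pod"
)

// PostRestoreHook is a hypothetical shape for the proposed hook.
type PostRestoreHook interface {
	PostRestore(workloads []*Workload, pods map[string]bool) error
}

type unquiescePlugin struct{}

// PostRestore deletes the staging pod and reinstates the replica count
// recorded on each workload. Workloads without the annotation are skipped.
func (unquiescePlugin) PostRestore(workloads []*Workload, pods map[string]bool) error {
	delete(pods, stagingPodName)
	for _, w := range workloads {
		saved, ok := w.Annotations[originalReplicasKey]
		if !ok {
			continue // this workload was never quiesced by the plugin
		}
		n, err := strconv.Atoi(saved)
		if err != nil {
			return fmt.Errorf("workload %s: invalid saved replica count %q: %w", w.Name, saved, err)
		}
		w.Replicas = n
		delete(w.Annotations, originalReplicasKey)
	}
	return nil
}

func main() {
	workloads := []*Workload{
		{Name: "web", Annotations: map[string]string{originalReplicasKey: "3"}},
		{Name: "db", Annotations: map[string]string{originalReplicasKey: "1"}},
	}
	pods := map[string]bool{stagingPodName: true}
	var hook PostRestoreHook = unquiescePlugin{}
	if err := hook.PostRestore(workloads, pods); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, w := range workloads {
		fmt.Printf("%s replicas=%d\n", w.Name, w.Replicas)
	}
	fmt.Println("pods remaining:", len(pods))
}
```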

Of course, the pre/post backup/restore custom logic can be written outside Velero, but we would then need a separate operator/orchestrator to run it. We think Velero can add this capability under its umbrella. The same hooks can integrate Velero with other capabilities (imagine a PreRestore Velero plugin that calls Cluster API to scale up the cluster prior to a restore).

cc: @dsu-igeek @eleanor-millman @codegold79

Anything else you would like to add:

I will put together a design proposal for these hooks, so please stay tuned. I am also volunteering to write them if the design is approved. Once these hooks are part of the Velero core, we can consider open-sourcing the Velero plugin that generically quiesces/unquiesces workloads (with a big disclaimer that specific pre/post pod hooks are preferable).

brito-rafa commented 3 years ago

As discussed on the community call on August 24, the fundamental difference between the newly proposed plugin hooks and the existing BackupItemAction and RestoreItemAction plugins is that the proposed plugins are executed once per backup or restore, rather than once per item. The ETA for the design doc is Monday.
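The difference in invocation frequency can be illustrated with a small sketch. The `runBackup` function, the `Item` type, and the callback types below are hypothetical and do not reflect Velero's actual plugin interfaces; they only model "once per backup" versus "once per item".

```go
package main

import "fmt"

// Item is a simplified stand-in for a Kubernetes object in a backup.
type Item struct {
	Kind string
	Name string
}

// backupHook models a proposed Pre/PostBackup hook: it sees the whole backup.
type backupHook func(items []Item)

// itemAction models an item-level action: it sees one item at a time.
type itemAction func(item Item)

// runBackup calls pre once, perItem once for every item, then post once.
func runBackup(items []Item, pre backupHook, perItem itemAction, post backupHook) {
	pre(items)
	for _, it := range items {
		perItem(it)
	}
	post(items)
}

func main() {
	items := []Item{
		{Kind: "Deployment", Name: "web"},
		{Kind: "PersistentVolumeClaim", Name: "data"},
		{Kind: "Pod", Name: "staging-pod"},
	}
	var preCalls, itemCalls, postCalls int
	runBackup(items,
		func([]Item) { preCalls++ },
		func(Item) { itemCalls++ },
		func([]Item) { postCalls++ },
	)
	fmt.Printf("pre=%d perItem=%d post=%d\n", preCalls, itemCalls, postCalls)
}
```

Because a backup-level hook runs before any item is processed, it is a natural place for cluster-wide preparation such as scaling workloads down, which an item-level action cannot coordinate on its own.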

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

reasonerjt commented 2 years ago

Moving this issue out of v1.9 per discussion with Rafael. We should pick up PR #4610 when we want to implement it in the future.

rajivml commented 2 years ago

We also have a similar use case. More context about the use case can be found here.

batistein commented 2 years ago

We really need this to back up CAPI clusters with Velero (we first need to pause reconciliation of the cluster by changing the spec of the Cluster object).

kaovilai commented 1 month ago

I would like to be assigned this issue! cc: @sseago @shubham-pampattiwar