Open onedr0p opened 1 year ago
We've been reluctant to widely support deleting temporary PVCs because, with other movers (such as rsync-tls, where we expect to replicate many times on a schedule), having the data there is desirable so that the entire contents of the PVC do not need to be copied each time. If a user wants to make sure the PVC is gone entirely, they can remove the replicationdestination or the PVC itself.
However, for a restic-specific scenario where you're doing a restore, it does make sense that if you restore again you may want to clean up files that were deleted at the source. It looks like restic has a new feature (https://github.com/restic/restic/issues/2348) which we could possibly leverage. @onedr0p would this cover your scenario if we were to expose this feature when we upgrade to the latest restic (v0.17.0)?
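To make "clean up files that were deleted at the source" concrete, here is a minimal Python sketch of those semantics on plain directories. It is not restic's or VolSync's implementation; `prune_extras`, `demo`, and the file names are hypothetical and only illustrate the idea of removing anything at the destination that the snapshot no longer contains.

```python
import tempfile
from pathlib import Path


def prune_extras(snapshot: Path, target: Path) -> list[str]:
    """Delete entries under `target` that have no counterpart in `snapshot`.

    Hypothetical helper for illustration only; returns the removed
    paths (relative to `target`) in sorted order.
    """
    removed = []
    # Visit the deepest paths first so a stale directory is already empty
    # by the time we try to remove it.
    candidates = sorted(target.rglob("*"), key=lambda p: len(p.parts), reverse=True)
    for path in candidates:
        rel = path.relative_to(target)
        if (snapshot / rel).exists():
            continue  # still present in the snapshot, keep it
        if path.is_dir() and not path.is_symlink():
            path.rmdir()
        else:
            path.unlink()
        removed.append(rel.as_posix())
    return sorted(removed)


def demo() -> tuple[list[str], list[str]]:
    """Build a tiny snapshot/target pair and prune the target."""
    with tempfile.TemporaryDirectory() as tmp:
        snapshot, target = Path(tmp, "snapshot"), Path(tmp, "target")
        snapshot.mkdir()
        (target / "old").mkdir(parents=True)
        (snapshot / "a.txt").write_text("kept")
        (target / "a.txt").write_text("kept")
        (target / "stale.txt").write_text("deleted at the source")
        (target / "old" / "b.txt").write_text("deleted at the source")
        removed = prune_extras(snapshot, target)
        remaining = sorted(p.name for p in target.iterdir())
        return removed, remaining
```

In this sketch only the extra entries are deleted; files that exist on both sides are left for the restore itself to overwrite.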
One thing to keep in mind: this won't necessarily help if you're using the volume populator, because provisioning the PVC is a one-time operation (it uses whatever is the latest snapshot from the replicationdestination at the time), and the PVC will not be updated if you run another sync in your replicationdestination.
I believe that restic feature would cover my use-case.
I haven't thought about this in a while because the volume populator feature has pretty much killed the need to restore data over a PVC with existing data. With the volume populator, I just need to nuke the workload from my cluster, add it back, and let volsync restore it.
This feature would still be useful for me in cases where I want to restore volumes ad hoc without the volume populator.
Describe the feature you'd like to have.
Add an option to the data movers that removes all data on the replicationdestination prior to restoring data from a backup.
What is the value to the end user? (why is it a priority?)
When you restore data to a PVC that already contains data, the existing data is not removed. This isn't desirable if you are restoring data and want the destination to match the source exactly.
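The behavior described above can be reproduced with plain directories. The following is a rough Python sketch, not VolSync code: `restore`, its `clear_first` flag, and the file names are hypothetical stand-ins for a data mover and the requested option.

```python
import shutil
import tempfile
from pathlib import Path


def restore(backup: Path, dest: Path, clear_first: bool = False) -> None:
    """Copy `backup` into `dest`.

    `clear_first` is a hypothetical option modelling this request:
    empty the destination before restoring into it.
    """
    if clear_first:
        for entry in dest.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    # Overlay copy: existing files in `dest` that are not in `backup` survive.
    shutil.copytree(backup, dest, dirs_exist_ok=True)


def demo() -> tuple[list[str], list[str]]:
    """Return the destination contents after an overlay and a cleared restore."""
    with tempfile.TemporaryDirectory() as tmp:
        backup, dest = Path(tmp, "backup"), Path(tmp, "pvc")
        backup.mkdir()
        dest.mkdir()
        (backup / "app.db").write_text("backed-up state")
        # Stand-in for data the application wrote before the restore ran.
        (dest / "fresh-install.lock").write_text("written on bootstrap")

        restore(backup, dest)
        overlay = sorted(p.name for p in dest.iterdir())

        restore(backup, dest, clear_first=True)
        cleared = sorted(p.name for p in dest.iterdir())
        return overlay, cleared
```

After the plain restore the destination still holds `fresh-install.lock` alongside `app.db`; with `clear_first=True` only `app.db` remains, which is the outcome this request is after.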
How will we know we have a good solution? (acceptance criteria)
An option to remove data prior to a restore is implemented.
Additional context
Imagine you destroy your cluster and want to recreate it. If you use GitOps tools like Flux or Argo, the bootstrap process usually involves spinning everything up in the cluster, which means applications will start and the PVCs will be tainted with data the applications have written. An option to delete the data in the PVC would be excellent for when you scale down the workload to restore data.