oetiker / znapzend

zfs backup with remote capabilities and mbuffer integration.
www.znapzend.org
GNU General Public License v3.0

Introduce runonce --since=snapname syntax #492

Closed: jimklimov closed this 3 years ago

jimklimov commented 4 years ago

Initially based against an older master branch, will update for the PR.

The essence of the change is detailed in the help text for the new feature. I needed this to quickly update a remote-backup NAS where a colleague had deleted all automatic snapshots "because there were so many of them!" znapzend then became confused: it saw none of its timestamp-patterned snapshots, yet it could not replicate from scratch because something already existed on the destination.

This is somewhat similar to the forceSnapshotName option introduced with #457, but it does not affect the name of the snapshot created during this run. The snapshot named in --since=X should exist in the original dataset's history; it would then be sent among many and/or considered as a starting point for resynchronization of source and destination.

jimklimov commented 4 years ago

Posted the fix for 5.10 as a separate #493 PR as well.

oetiker commented 4 years ago

so by doing that all existing 'newer' snapshots on the destination drive will get removed? maybe mention that in the documentation.

jimklimov commented 4 years ago

Good question really, I have to check by experiment. The original use case (for which this was a practical fix) had the same manually named snapshots in the source and destination pools, but no intersecting snapshots made and named by znapzend (in the original case, none at all on the destination).

On the implementation and intention side, this PR just allows znapzend to "see" a named snapshot it did not create (one not necessarily matching the configured pattern) while making the list of snapshots to compare and eventually zfs send, and it protects this snapshot from being automatically cleaned up at the end of this run-once.

Beyond that, whether some snapshots on the destination get deleted depends on other options: e.g. the interaction of zfs send with -R replication and zfs recv -F, which causes rollbacks and removal of whatever is not present on the source, and/or whether truly incremental or big-jump send streams are used (zfs send -i/-I).

I suppose that in that web of possible option combinations there is a place for "zfs recv" attempting to remove from the destination some manually named snapshots newer than the "since" argument, e.g. for a big-jump increment with skipIntermediates enabled and no other manually named snapshots seen through this mask (so not considered as the latest common point). Possibly, though, it would (or should) fail to receive without a rollback under default options, and then the user/admin is alerted that something needs fixing (e.g. by providing that newer snapshot as the --since=Xnewer argument). After all that, if the send went well, there may be the usual cleanup of whatever automatic snapshots have expired.
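To illustrate the "seeing" part described above, here is a minimal sketch in Python (not znapzend's actual Perl code) of how adding the --since name to the set of "interesting" snapshots can change which latest common snapshot is found. The function names, the timestamp pattern and the snapshot names are all hypothetical, and the lists are assumed to be ordered oldest to newest.

```python
import re

def interesting(snapshots, pattern, since=None):
    """Keep snapshots whose name matches the configured pattern,
    plus the single snapshot named by `since` (if given)."""
    return [s for s in snapshots if re.fullmatch(pattern, s) or s == since]

def latest_common(src, dst, pattern, since=None):
    """Return the newest snapshot name visible on both sides, or None.
    Both lists are assumed to be ordered oldest to newest."""
    dst_visible = set(interesting(dst, pattern, since))
    common = [s for s in interesting(src, pattern, since) if s in dst_visible]
    return common[-1] if common else None

# Hypothetical timestamp naming scheme and snapshot names.
PATTERN = r"\d{4}-\d{2}-\d{2}-\d{6}"
src = ["manual-before-upgrade", "2020-05-01-000000", "2020-05-02-000000"]
dst = ["manual-before-upgrade"]  # automatic snapshots were deleted here

print(latest_common(src, dst, PATTERN))                                 # None
print(latest_common(src, dst, PATTERN, since="manual-before-upgrade"))  # manual-before-upgrade
```

Without `since`, the manually named snapshot is filtered out on both sides and no common point is found, which corresponds to the stuck state from the original use case; with it, the manual snapshot becomes a candidate starting point.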

Overall, my assumption is that:

...but the full matrix was not experimentally checked.

jimklimov commented 4 years ago

Rebased over current master for a cleaner changeset.

oetiker commented 4 years ago

if you 'send' an older snapshot then all the new stuff on the drive gets removed ... I think we even set the necessary --force option

so with your new option we can tell znapzend that there is a common snapshot between the two, although it is not named properly ... but one must be aware that when running with this, all newer snapshots on the destination drive will get removed and replaced with the snapshots from the source drive
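The concern above can be modelled with a small Python sketch. It only restates the claim from this comment (a forced receive starting from the common snapshot discards whatever is newer on the destination); it was not verified against real zfs behaviour, and the function and snapshot names are hypothetical.

```python
def discarded_on_destination(dst, common):
    """Destination snapshots that are newer than `common`.

    `dst` is assumed to be ordered oldest to newest. Under the assumption
    stated in the comment above, these are the snapshots a forced receive
    would remove before applying the stream sent from `common`.
    """
    if common not in dst:
        raise ValueError(f"{common!r} does not exist on the destination")
    return dst[dst.index(common) + 1:]

# Hypothetical destination state: two snapshots were taken after "X".
dst = ["X", "local-a", "local-b"]
print(discarded_on_destination(dst, "X"))  # ['local-a', 'local-b']
```

A check like this could be printed as a warning before the send, so the admin sees which destination snapshots are at risk.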

jimklimov commented 4 years ago

I guess so... then in application to how znapzend uses zfs, it inspects existing "interesting" snapshots comparing between origin and destination (with default "interesting ones" only including the pattern made by znapzend; with this option also adding the named "X" into the pattern), and:

So the review point is to summarize the above into option help text, right?

Notably (from the "theoretic" details above, not deeply experimented), if the --runonce=D --since=X option is used, there may be practical cases where it does not cause removal and recreation of all destination snapshots newer than "X", and even cases where "X" would not appear on the destination. For example, there may be common snapshots newer than "X" between the two storages, or a snapshot older than "X" may be the newest common one while other options ensure a replication that would skipIntermediates. (Perhaps we can later handle this case to ensure that the named snapshot exists in such a case, effectively replicating up to it in a single hop and then from it to the one just created, which is not destructive to data already written on the destination.)
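The cases listed above can be told apart with a small classifier. This is only a sketch in Python under the assumption that both lists are already filtered to the snapshots znapzend would consider (pattern matches plus the --since name) and are ordered oldest to newest; the function name and the case labels are hypothetical, not part of znapzend.

```python
def resync_case(src, dst, since):
    """Classify how the --since snapshot relates to the newest common one."""
    if since not in src:
        return "since-missing-on-source"
    dst_names = set(dst)
    common = [s for s in src if s in dst_names]
    if not common:
        return "no-common-snapshot"
    newest = common[-1]
    if newest == since:
        return "since-is-newest-common"
    if src.index(newest) > src.index(since):
        # A newer common point exists, so "X" need not be the send base.
        return "newer-common-exists"
    # Newest common is older than "X"; with skipIntermediates,
    # "X" itself might never appear on the destination.
    return "older-common-only"

src = ["a", "X", "b", "c"]
print(resync_case(src, ["a", "X", "b"], "X"))  # newer-common-exists
print(resync_case(src, ["a", "X"], "X"))       # since-is-newest-common
print(resync_case(src, ["a"], "X"))            # older-common-only
```

Only the "since-is-newest-common" case corresponds to the scenario where everything newer than "X" on the destination would be replaced.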

oetiker commented 4 years ago

so let's wait with this one until you have completed your deliberations with #497?

jimklimov commented 4 years ago

Maybe. On one hand, it builds on top of this PR, so both will be merged if that one is accepted; on the other, it can help with verifying (and/or ensuring) more deterministic behavior for the different scenarios this discussion has raised.

Curiously, while CI tests pass OK on GitHub, my local runs hiccup even with the master branch... (followed up in #500)