oetiker / znapzend

zfs backup with remote capabilities and mbuffer integration.
www.znapzend.org
GNU General Public License v3.0

Runonce --since=X made safer (only do dangerous stuff if --sinceForced=X is asked) #497

Closed jimklimov closed 3 years ago

jimklimov commented 4 years ago

This PR extends the proposal from #492 with failsafes and more deterministic behavior that the user can control. It adds a --sinceForced=X CLI option, which permits removing destination snapshots newer than the one named in the --since=X argument so that X appears in the destination; plain --since=X does not enforce that removal. For other maintenance uses, it also adds a general --forbidDestRollback CLI option.

NOTE: at the time of this posting, this is completely untested in practice. It is "theoretical PoC" code for discussion, and is therefore kept separate from the original PR that adds --since=X.

jimklimov commented 4 years ago

Rebased over recent master...

jimklimov commented 4 years ago

Experimental data point:

Snapshot history attributed to source does not list the origin snapshot name:

NAME USED AVAIL REFER MOUNTPOINT
nvpool/ROOT/hipster_2019.10-20200425T140249Z/opt@znapzend-auto-2020-04-27T06:30:00Z 0 - 684M -
nvpool/ROOT/hipster_2019.10-20200425T140249Z/opt@znapzend-auto-2020-04-27T07:00:00Z 0 - 684M -
...

But let's see what happens:

:; ./bin/znapzend --features=zfsGetType,recvu,compressed,oracleMode --autoCreation --runonce=nvpool/ROOT/hipster_2019.10-20200127T101549Z --inherited --debug --since=@2020-04-14-09:55:32

this yields:

[2020-07-31 13:00:55.86696] [1318] [debug] sending snapshots from nvpool/ROOT/hipster_2019.10-20200127T101549Z to backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z
[2020-07-31 13:00:55.86734] [1318] [debug] Are we sending "--since"? since=="2020-04-14-09:55:32", skipIntermediates=="0", forbidDestRollback=="1"

zfs send -Lce 'nvpool/ROOT/hipster_2019.10-20200127T101549Z@znapzend-auto-2020-07-31T11:00:40Z'|/usr/bin/amd64/mbuffer -q -s 256k -W 600 -m 1G|zfs recv -u 'backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z'

cannot receive new filesystem stream: destination 'backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z' exists
must specify -F to overwrite it
mbuffer: error: outputThread: error writing to at offset 0x0: Broken pipe
[2020-07-31 13:00:56.90504] [1318] [warn] ERROR: cannot send snapshots to backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z
mbuffer: warning: error during output to : Broken pipe

Indeed, `--since` is not `--sinceForced` and does not let me shoot myself in the foot.
* The same run with `--sinceForced` seems to work, sort of. I am not sure it took the intermediate step I wanted it to:

:; ./bin/znapzend --features=zfsGetType,recvu,compressed,oracleMode --autoCreation --runonce=nvpool/ROOT/hipster_2019.10-20200127T101549Z --inherited --debug --sinceForced=@2020-04-14-09:55:32

...

[2020-07-31 13:08:55.18041] [1403] [debug] sending snapshots from nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt to backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt
[2020-07-31 13:08:55.18066] [1403] [debug] Are we sending "--since"? since=="2020-04-14-09:55:32", skipIntermediates=="0", forbidDestRollback=="0"

zfs list -H -o name -t filesystem,volume nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt

zfs list -H -o name -t filesystem,volume backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt

zfs list -H -o name -t snapshot -s creation -d 1 nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt

zfs list -H -o name -t snapshot -s creation -d 1 backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt

zfs send -Lce 'nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt@znapzend-auto-2020-07-31T11:08:03Z'|/usr/bin/amd64/mbuffer -q -s 256k -W 600 -m 1G|zfs recv -u -F 'backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt'

...

Indeed, only the newly made snapshot name is here, without the one I expressly wanted to see in the history... Back to the drawing board. Probably something about zero snapshots in the destination? Or is it that "2020-04-14-09:55:32" is not in the direct history of the dataset being sent (it is just its origin)?

:; zfs list -tall -d1 -r backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z

NAME USED AVAIL REFER MOUNTPOINT
backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z 10.5G 590G 506M legacy
backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z@znapzend-auto-2020-07-31T11:08:03Z 0 - 506M -
backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z/opt 575M 590G 575M legacy
backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z/usr 8.66G 590G 6.39G legacy
backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z/var 787M 590G 179M legacy
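Checking by eye whether the requested snapshot made it into the destination history is error-prone with names this long. A minimal sketch of a scripted check follows; `snapshot_listed` is a hypothetical helper, not part of znapzend, and it assumes the names arrive one per line in the `zfs list -H -o name -t snapshot` format used in the logs above.

```shell
#!/bin/sh
# Hypothetical helper (not part of znapzend): reads snapshot names on stdin,
# one per line, as printed by
#   zfs list -H -o name -t snapshot -s creation -d 1 <dataset>
# and succeeds if any of them has the short name "$1" (the part after "@").
snapshot_listed() {
    want=${1#@}                 # tolerate a leading "@", as in --since=@X
    while IFS= read -r name; do
        [ "${name#*@}" = "$want" ] && return 0
    done
    return 1                    # also the result for an empty listing
}

# Usage against a real pool (requires zfs, so it is left commented out):
# zfs list -H -o name -t snapshot -s creation -d 1 \
#     backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z \
#     | snapshot_listed '@2020-04-14-09:55:32' || echo "X is missing"
```

Reading from stdin keeps the helper independent of where the listing comes from, so the same check works for the source and the destination.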

jimklimov commented 4 years ago

Rebased over recent master again, added small bugfixes and generated manpage updates...

jimklimov commented 4 years ago

After the fix (notably, to look at snapshots not other datasets)...

...
[2020-07-31 13:55:51.97800] [8480] [debug] sending snapshots from nvpool/ROOT/hipster_2019.10-20200127T101549Z to backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z
[2020-07-31 13:55:51.97841] [8480] [debug] Are we sending "--since"? since=="2020-04-14-09:55:32", skipIntermediates=="0", forbidDestRollback=="1"

zfs list -H -o name -t snapshot nvpool/ROOT/hipster_2019.10-20200127T101549Z@2020-04-14-09:55:32

cannot open 'nvpool/ROOT/hipster_2019.10-20200127T101549Z@2020-04-14-09:55:32': dataset does not exist

[--since mode]: Source dataset nvpool/ROOT/hipster_2019.10-20200127T101549Z does not have a snapshot named by --since='2020-04-14-09:55:32' at /usr/share/src/znapzend/bin/../lib/ZnapZend.pm line 485.

zfs list -H -o name -t snapshot -s creation -d 1 nvpool/ROOT/hipster_2019.10-20200127T101549Z

zfs list -H -o name -t snapshot -s creation -d 1 backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z

zfs send -Lce -I 'nvpool/ROOT/hipster_2019.10-20200127T101549Z@znapzend-auto-2020-07-31T11:49:09Z' 'nvpool/ROOT/hipster_2019.10-20200127T101549Z@znapzend-auto-2020-07-31T11:55:37Z'|/usr/bin/amd64/mbuffer -q -s 256k -W 600 -m 1G|zfs recv -u 'backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z'
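The incremental `zfs send -I` above starts from a snapshot that exists on both sides. The thread does not show how znapzend picks it, so the following is only a sketch of one way to find the newest common snapshot from the two `zfs list ... -s creation` listings. `latest_common` and its file arguments are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not znapzend's implementation).
#   $1: file with source snapshot names, oldest first
#   $2: file with destination snapshot names, any order
# Prints the short name (after "@") of the newest source snapshot that also
# exists on the destination; returns 1 and prints nothing if there is none.
latest_common() {
    common=
    while IFS= read -r src; do
        short=${src#*@}
        # Compare short names only: the dataset prefixes differ between
        # source and destination.
        if sed 's/^[^@]*@//' "$2" | grep -Fxq -- "$short"; then
            common=$short       # later matches overwrite earlier ones
        fi
    done < "$1"
    [ -n "$common" ] || return 1
    printf '%s\n' "$common"
}

# Usage against real pools (requires zfs, so it is left commented out):
# zfs list -H -o name -t snapshot -s creation -d 1 "$SRC" > src.list
# zfs list -H -o name -t snapshot -s creation -d 1 "$DST" > dst.list
# latest_common src.list dst.list || echo "no common snapshot: full send"
```

An empty result corresponds to the earlier case in this thread where the destination had no usable history and a full stream was attempted.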

* same done with `--sinceForced` now fails to fulfill the user's harder requirement:

:; ./bin/znapzend --features=zfsGetType,recvu,compressed,oracleMode --autoCreation --nodestroy --runonce=nvpool/ROOT/hipster_2019.10-20200127T101549Z --inherited --debug --sinceForced=@2020-04-14-09:55:32

The --sinceForced option is used and will try to ensure that the snapshot exists in history of destinations (can delete and rewrite subsequent snapshots) at /usr/src/znapzend/bin/znapzend line 92.
...
[2020-07-31 14:05:39.35143] [8715] [debug] sending snapshots from nvpool/ROOT/hipster_2019.10-20200127T101549Z to backup-adata/snapshots/nvpool/ROOT/hipster_2019.10-20200127T101549Z
[2020-07-31 14:05:39.35192] [8715] [debug] Are we sending "--since"? since=="2020-04-14-09:55:32", skipIntermediates=="0", forbidDestRollback=="0"

zfs list -H -o name -t snapshot nvpool/ROOT/hipster_2019.10-20200127T101549Z@2020-04-14-09:55:32

cannot open 'nvpool/ROOT/hipster_2019.10-20200127T101549Z@2020-04-14-09:55:32': dataset does not exist

[--since mode]: Source dataset nvpool/ROOT/hipster_2019.10-20200127T101549Z does not have a snapshot named by --since='2020-04-14-09:55:32' at /usr/share/src/znapzend/bin/../lib/ZnapZend.pm line 485.

[2020-07-31 14:05:39.36427] [8715] [warn] User required --sinceForced='2020-04-14-09:55:32' but there is no match in source dataset nvpool/ROOT/hipster_2019.10-20200127T101549Z at /usr/share/src/znapzend/bin/../lib/ZnapZend.pm line 487.
...
[2020-07-31 14:05:39.43284] [8715] [warn] ERROR: suspending cleanup source dataset because 7 send task(s) failed:
[2020-07-31 14:05:39.43296] [8715] [warn] +--> User required --sinceForced='2020-04-14-09:55:32' but there is no match in source dataset nvpool/ROOT/hipster_2019.10-20200127T101549Z at /usr/share/src/znapzend/bin/../lib/ZnapZend.pm line 487.
...


Note that although `die()` is used in the second case, this block of code runs inside an `eval{}`, so throwing the error does not actually kill the whole script; it just skips this dataset and the final cleanup. Not sure why it adds extra newlines after printing the message, though.

jimklimov commented 4 years ago

(WIP for some other cases)

oetiker commented 3 years ago

please turn it into a PR when you are ready :)

jimklimov commented 3 years ago

Tested that --since=X mode does not destroy snapshots just to make "X" appear on the destination when there is already a newer common snapshot to continue the resync from. It does ensure that "X" appears in the destination history if the destination's last common snapshot is older than "X", or if the destination was just autoCreated. It can also remove automatic snapshots newer than "X" if they are no longer common (e.g. the source was automatically cleaned by policy), and that way "X" can also get into the destination history.

Tested that --sinceForced=X mode can destroy snapshots on the destination to ensure that "X" appears there. It increments from an older common snapshot (not from scratch) if there is one, and otherwise resyncs from scratch. Well, that is what the user explicitly asked for in such a case.
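The tested behaviour of the two modes can be summarised as a small decision table. The sketch below is a simplification for discussion, not znapzend's actual code: `plan_since`, its arguments, and the action labels it prints are all made up here, and it folds "destination was just autoCreated" into the "no common snapshot" case. The rows for a missing "X" on the source follow the logs earlier in this thread.

```shell
#!/bin/sh
# Hypothetical decision table; NOT znapzend's actual code.
#   $1 mode:        since | sinceForced
#   $2 x_in_source: yes | no  (does the source dataset have snapshot X?)
#   $3 common:      newer | older | none
#                   (newest common snapshot relative to X; "none" also
#                    stands for a destination that was just auto-created)
plan_since() {
    mode=$1 x_in_source=$2 common=$3

    if [ "$x_in_source" = no ]; then
        case $mode in
            since)       echo "warn-and-send-normally" ;;
            sinceForced) echo "fail-this-dataset" ;;
            *)           echo "unknown-input" >&2; return 2 ;;
        esac
        return 0
    fi

    case "$mode:$common" in
        since:newer)       echo "continue-from-common-keep-destination" ;;
        since:older)       echo "increment-from-common-via-X" ;;
        since:none)        echo "full-send-via-X" ;;
        sinceForced:newer) echo "destroy-newer-on-destination-then-send-via-X" ;;
        sinceForced:older) echo "increment-from-common-via-X" ;;
        sinceForced:none)  echo "resync-from-scratch-via-X" ;;
        *)                 echo "unknown-input" >&2; return 2 ;;
    esac
}

# Example: plain --since with a newer common snapshot leaves the
# destination untouched.
plan_since since yes newer
```

The only row where the two modes differ in destructiveness is a newer common snapshot: plain --since keeps the destination as it is, while --sinceForced is allowed to remove what stands in the way of "X".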