openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Add support for Entire Send in receive_resume_token (zfs send -I) #16764

Open Haravikk opened 1 week ago

Haravikk commented 1 week ago

Describe the feature you would like to see added to OpenZFS

When resuming an interrupted send of multiple snapshots, zfs send -t should attempt to send not only the resumed snapshot, but all following snapshots that were originally requested, if possible.

Suppose pool1/dataset has ten snapshots named @snapshot1, @snapshot2, ... @snapshot10, and the following command is used:

zfs send -R pool1/dataset@snapshot10 | zfs receive -sd pool2

During the transfer of @snapshot3 the send is interrupted, and the command is resumed as follows:

zfs send -t "$(zfs get -Ho value receive_resume_token pool2/dataset)" | zfs receive -sd pool2

The send should now produce a stream that finishes @snapshot3 and then continues with @snapshot4 and so on, until it reaches @snapshot10 as originally requested.

If @snapshot10 no longer exists, the send should produce an error after @snapshot3 has finished sending, since the original request can no longer be completed.
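
For comparison, the same result can be reached today in two manual steps: resume the interrupted snapshot, then send the rest as an incremental. The sketch below only prints the commands rather than running them. `plan_resume` is a hypothetical helper, not part of OpenZFS, and it assumes the receive side uses `-d` as in the example above, so the target is the destination pool plus the source path without its pool name.

```shell
#!/bin/sh
# Sketch only: prints the commands for a manual two-step resume.
# plan_resume is a hypothetical helper, not an OpenZFS command.
plan_resume() {
    src=$1        # source dataset, e.g. pool1/dataset
    dst_pool=$2   # destination pool, e.g. pool2
    resumed=$3    # snapshot that was interrupted, e.g. snapshot3
    last=$4       # snapshot originally requested, e.g. snapshot10

    # Assumption: with "zfs receive -d", the target is the destination
    # pool plus the source path with its pool name stripped.
    dst="$dst_pool/${src#*/}"

    # Step 1: finish the interrupted snapshot from the saved token.
    printf '%s\n' "zfs send -t \"\$(zfs get -Ho value receive_resume_token $dst)\" | zfs receive -sd $dst_pool"

    # Step 2: send every snapshot after the resumed one, up to the last.
    printf '%s\n' "zfs send -I $src@$resumed $src@$last | zfs receive -sd $dst_pool"
}

plan_resume pool1/dataset pool2 snapshot3 snapshot10
```

Step 2 is only valid once @snapshot3 has been fully received and while @snapshot10 still exists on the source, which are the same conditions the requested behaviour would depend on.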

How will this feature improve OpenZFS?

It will make it easier to resume interrupted sends that involve multiple snapshots, such as those produced by zfs send -R or zfs send -I @snap.

Additional context

The behaviour of this feature with regard to zfs send -R is a bit complex. The current resume behaviour already effectively ignores the -R flag, so the most basic proposal would be to limit the resume to the specific dataset that was being resumed: all of its snapshots will be sent, but child and sibling datasets will be skipped.

Ideally, however, the resume should continue the full replication stream if that is at all possible. In the worst case it will finish the interrupted snapshot and then fail, which is the same as the current behaviour. In the ideal case it would also complete the replication stream by sending all remaining snapshots for the current dataset, then moving on to the remaining datasets that were not started (if any).
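
As a rough illustration of that ideal case, the remaining work is the interrupted snapshot plus everything after it in the original send order. The sketch below assumes the caller already has that order as a list. `remaining_work` is a hypothetical helper, and the order in which a real replication stream visits datasets is not something this example claims to know.

```shell
#!/bin/sh
# Sketch only: remaining_work is a hypothetical helper.
# Usage: remaining_work INTERRUPTED ITEM...
# Prints INTERRUPTED and every ITEM listed after it, one per line.
# The ITEM order is supplied by the caller and is an assumption here.
remaining_work() {
    interrupted=$1
    shift
    found=0
    for item in "$@"; do
        if [ "$item" = "$interrupted" ]; then
            found=1
        fi
        if [ "$found" -eq 1 ]; then
            printf '%s\n' "$item"
        fi
    done
    # Fail if the interrupted snapshot was not in the list at all.
    [ "$found" -eq 1 ]
}

remaining_work pool1/dataset@snapshot3 \
    pool1/dataset@snapshot2 \
    pool1/dataset@snapshot3 \
    pool1/dataset@snapshot4 \
    pool1/dataset/child@snapshot1
```

Here the first line printed is the snapshot to resume, the following lines for the same dataset are the snapshots still to send, and any lines for other datasets are the ones that were never started.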