Closed by jimklimov 3 years ago
My primary question in this direction at the moment is whether the current code in the ZnapZend.pm sendRecvDestroy() suffices for this (so that making the send/recv part optional, and enabled by default, would cut it), or whether this new feature would need a new way to discover which snapshots exist on all src/dst combos and are obsolete, so they can be killed off. At the least, to solve the original practical problem, can we find which src snaps we won't be sending anymore (because some newer snaps are seen on all dst's)?
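To make the question concrete, here is a minimal Python sketch of the "newer snaps are seen on all dst's" test. It is not znapzend code: the function names are hypothetical, snapshots are represented only by their short names (the part after `@`), and the source list is assumed to be ordered oldest to newest.

```python
from typing import Dict, List, Optional


def newest_common_snapshot(src: List[str], dsts: Dict[str, List[str]]) -> Optional[str]:
    """Return the newest source snapshot that exists on every destination.

    src is assumed to be ordered oldest -> newest; dsts maps a destination
    label to the snapshot names found there (hypothetical data layout).
    """
    if not dsts:
        # Conservative choice: with no destinations known, claim nothing.
        return None
    dst_sets = [set(names) for names in dsts.values()]
    for name in reversed(src):
        if all(name in names for names in dst_sets):
            return name
    return None


def obsolete_on_source(src: List[str], dsts: Dict[str, List[str]]) -> List[str]:
    """Source snapshots older than the newest one common to all destinations.

    These are no longer needed as a base for incremental sends, because a
    newer common base already exists everywhere.
    """
    common = newest_common_snapshot(src, dsts)
    if common is None:
        return []
    return src[: src.index(common)]
```

Whether such snapshots may actually be destroyed still depends on the retention plan; this only answers the "will we ever send from it again" part.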
I would also assume that just running the cleanup step should be feasible.
It seems so, but without very deep digging in the logic I am not sure whether the cleanup step assumes that send/recv happened before it and succeeded, so that all snaps are by definition in place. The start of the send/recv routine has a good-looking test for whether there are compatible snaps in the destination... maybe it can be snatched and adapted into the "destroy-only" codepath.
I'm currently busy with other work, so I am only background-processing this idea in general; inputs and corrections are welcome ;)
Yes, it cannot know whether it 'may' remove something without actually syncing ... note that there is another patch which records the status of the syncing to work around this problem.
I wonder if it is easy to use the existing code to just tackle the original problem directly:
Unless I've missed something, this should produce warnings about what could be blocking normal operations for the daemon, as well as a list of snapshots safe to remove, almost as quickly as it would take to zfs list -o ... the root SRC/DST datasets involved (preferably including recursion and so reducing the amount of zfs callouts).
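As a rough illustration of the "one recursive listing per root dataset" idea, the Python sketch below builds a single `zfs list` call and groups its output by dataset. The property list is elided in the comment above, so `name,creation` is an assumption here, as are the function names; the flags used (`-H`, `-p`, `-r`, `-t snapshot`, `-o`, `-s`) are standard `zfs list` options.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def zfs_list_command(root: str) -> List[str]:
    """One recursive callout per root dataset instead of one per child.

    -H: tab-separated output without headers, -p: creation as epoch seconds.
    The property list "name,creation" is an assumption for this sketch.
    """
    return ["zfs", "list", "-H", "-p", "-r", "-t", "snapshot",
            "-o", "name,creation", "-s", "creation", root]


def parse_snapshot_listing(text: str) -> Dict[str, List[Tuple[str, int]]]:
    """Group 'dataset@snap<TAB>creation' lines by dataset, oldest first."""
    by_dataset: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        full_name, creation = line.split("\t")
        dataset, snap = full_name.split("@", 1)
        by_dataset[dataset].append((snap, int(creation)))
    for snaps in by_dataset.values():
        snaps.sort(key=lambda item: item[1])
    return dict(by_dataset)
```

The command list could be passed to `subprocess.run(..., capture_output=True, text=True)` locally, or prefixed with an `ssh` invocation for a remote destination; that part is left out so the parser can be exercised without a ZFS pool.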
Does anything stick out as a problem in such an approach?
As an added benefit, from today's experience, such "safe snapshot deletion" would especially help when the pool (or quota) is full and no more snaps can be created. In this case, honest sending (and the subsequent after-the-fact cleanup) is blocked from succeeding, even if a lot of space could be gained by dropping older unneeded snapshots. In fact, it might make sense to start the big loop with such a cleanup, when/if it is implemented and stable.
While investigating the issue I have with znapzend sometimes refusing to clean up old snaps and/or not managing to do it in time (when I/Os are slow), I thought it would be nice to have a mode that just drops the older autosnaps safely, e.g. if they are beyond the retention policy timeout and their disappearance would not preclude further incremental syncs. The deepest original problem in the stack is that the (source) system overflows the pool with data referenced by snapshots that we expected to be long gone by the time we still see them. Manually killing off older snapshots sometimes did misfire, leaving no common ones between dst and src.
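A minimal Python sketch of the "safe drop" rule described above, under loud assumptions: retention is reduced to a single maximum age (real znapzend plans are tiered), snapshot lists are plain in-memory data, and the function name is hypothetical. The rule shown is to keep, for each destination, the newest snapshot it shares with the source, so that an incremental base always survives.

```python
from typing import Dict, List, Set, Tuple


def safe_to_destroy(src: List[Tuple[str, int]],
                    dsts: Dict[str, Set[str]],
                    now: int,
                    max_age: int) -> List[str]:
    """List expired source snapshots whose removal keeps incremental sync possible.

    src: (name, creation_epoch) pairs ordered oldest -> newest.
    dsts: destination label -> set of snapshot names present there.
    max_age: simplified single retention timeout in seconds (an assumption).
    """
    protected: Set[str] = set()
    for names in dsts.values():
        common = [name for name, _ in src if name in names]
        if common:
            # Newest snapshot shared with this destination: needed as the
            # base of the next incremental send, so never offer it.
            protected.add(common[-1])
    return [name for name, created in src
            if now - created > max_age and name not in protected]
```

For example, with source snapshots s1 to s4, where only s4 is within retention, one destination holding s1 and s2, and another holding s2 and s3, only s1 is reported: s2 and s3 are each the last common base for one destination.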