jimsalterjrs / sanoid

These are policy-driven snapshot management and replication tools which use OpenZFS for underlying next-gen storage. (Btrfs support plans are shelved unless and until btrfs becomes reliable.)
http://www.openoid.net/products/
GNU General Public License v3.0

Feature Request: One off/triggered snapshots #108

Open redmop opened 7 years ago

redmop commented 7 years ago

zfsnap easily ties in with apt (and maybe rpm) and can do pre/post apt snapshots with their own expiry. Together with etckeeper, this makes me very happy that I can recover from exploding updates. The cron pseudo-entry "@reboot take-snapshot" also makes me happy. It's more of a bookmark for when a reboot/crash happened, so I don't keep those snapshots long.

jimsalterjrs commented 7 years ago

How do you envision this tying in?

Any manually taken snapshot with the proper name syntax will be interpreted by Sanoid as "one of its own" and handled according to that policy. For example, if you've got hourly=36 and manually issue the command zfs snapshot pool/dataset@autosnap_yyyy-MM-dd_hh:mm:ss, Sanoid will let that snapshot live for 36 hours, after which it will purge it (assuming at least 36 total hourly snapshots would remain after it is purged).
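As a rough illustration of the manual route described above, this shell sketch builds a snapshot name in the autosnap_ pattern and prints the zfs command instead of running it. The function name sanoid_style_name is made up for this example, and the trailing _hourly type suffix is an assumption about how Sanoid sorts snapshots into policies; the pattern quoted above is written without it.

```shell
# Hypothetical helper: build a name in the autosnap_ pattern discussed above.
# Assumption: a trailing _<type> suffix (hourly, daily, ...) tells Sanoid which
# policy bucket the snapshot belongs to.
sanoid_style_name() {
    dataset=$1
    type=${2:-hourly}
    # Optional third argument overrides the timestamp (handy for testing).
    stamp=${3:-$(date +%Y-%m-%d_%H:%M:%S)}
    printf '%s@autosnap_%s_%s\n' "$dataset" "$stamp" "$type"
}

# Print the command rather than running it; drop the echo to take the snapshot.
echo zfs snapshot "$(sanoid_style_name pool/dataset hourly)"
```

Keeping the timestamp as an optional argument makes the helper easy to check without touching a pool.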

What would you feel like you needed for things to "tie in" the way you envision?

Would it be sufficient if I loosened the type checking Sanoid uses to identify "its own" snapshots, so that it would recognize pool/dataset@autosnap_yyyy-MM-dd_hh:mm:ss-commentgoeshere as a snapshot it should be managing, for example? (Right now the type checking is too strict for that to pass.)
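To make the strict versus loosened check concrete, here is a small shell sketch. Both patterns are illustrative guesses written for this example, not the expressions Sanoid actually uses, and matches is an invented helper name.

```shell
# Illustrative patterns only; not taken from Sanoid's source.
STRICT='^autosnap_[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}:[0-9]{2}:[0-9]{2}$'
LOOSE='^autosnap_[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}:[0-9]{2}:[0-9]{2}(-[A-Za-z0-9_-]+)?$'

# matches PATTERN SNAPNAME: succeeds when the snapshot name fits the pattern.
matches() {
    printf '%s\n' "$2" | grep -Eq "$1"
}

name='autosnap_2017-07-12_18:05:50-commentgoeshere'
matches "$STRICT" "$name" && echo "strict: managed" || echo "strict: ignored"
matches "$LOOSE"  "$name" && echo "loose: managed"  || echo "loose: ignored"
```

With the commented name, the strict pattern rejects it and the loose one accepts it, which is the behaviour change being asked about.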

redmop commented 7 years ago

I got the idea from zfsnap:

-p prefix Enable filtering to only consider snapshots with prefix; it can be specified multiple times to build a list.

Your example works pretty well, though maybe include a custom expiry, like this: pool/dataset@autosnap_yyyy-MM-dd_hh:mm:ss_2w_pre-apt

Having both the comment and the expiration in there makes it quite a bit more work to parse. zfsnap handles it like this: pool/fs@[prefix]Timestamp--TimeToLive

But a prefix might break compatibility.
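For comparison, a name in the Timestamp--TimeToLive shape can be split with plain parameter expansion, which is part of why that layout is easier to parse than a combined expiry and comment. The sketch below rests on assumptions: the function names and the sample snapshot name are invented, and only the h, d and w units are handled, which may not match zfsnap's full set.

```shell
# split_ttl NAME: print the part after the last "--" (the time to live).
split_ttl() {
    printf '%s\n' "${1##*--}"
}

# ttl_seconds TTL: convert e.g. 2w, 3d or 12h to seconds (simplified unit set).
ttl_seconds() {
    n=${1%?}          # numeric part, e.g. 2
    unit=${1#"$n"}    # trailing unit letter, e.g. w
    case $unit in
        h) echo $((n * 3600)) ;;
        d) echo $((n * 86400)) ;;
        w) echo $((n * 604800)) ;;
        *) echo "unsupported unit: $unit" >&2; return 1 ;;
    esac
}

# Sample name made up for this example.
ttl=$(split_ttl 'pool/fs@pre-apt_2017-07-12_18.05.50--2w')
echo "ttl=$ttl seconds=$(ttl_seconds "$ttl")"
```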

jimsalterjrs commented 7 years ago

Not really sure how the -p prefix thing is intended to work here.

Can you be more concrete about what, exactly, you'd like Sanoid to do - how you'd invoke it, with what argument, with what expected outcome?

redmop commented 7 years ago

Ahh, ok. I'm a little scattered today.

In /etc/sanoid/sanoid.conf

[rpool/os]
    use_template = production
[template_production]
    hourly = 36
    daily = 30
    monthly = 3
    yearly = 0
    autosnap = yes
    autoprune = yes
    tagttl_pre-install = 2w # snapshots tagged pre-install last 2 weeks
    tagttl_post-install = 2w # snapshots tagged post-install last 2 weeks
    tagttl_reboot = 1w # snapshots tagged reboot last 1 week

In /etc/apt/apt.conf.d/05sanoid

DPkg::Pre-Invoke       { "if [ -x /usr/local/bin/sanoid ]; then /usr/local/bin/sanoid --take-snapshots --verbose --tag pre-install; fi"; };
DPkg::Post-Invoke      { "if [ -x /usr/local/bin/sanoid ]; then /usr/local/bin/sanoid --take-snapshots --verbose --tag post-install; fi"; };

After an apt-get upgrade:

rpool/os@autosnap_yyyy-MM-dd_hh:mm:ss_2w_pre-install
rpool/os@autosnap_yyyy-MM-dd_hh:mm:ss_2w_post-install

In crontab

@reboot /usr/local/bin/sanoid --take-snapshots --verbose --tag reboot

or in /etc/rc.local

/usr/local/bin/sanoid --take-snapshots --verbose --tag reboot

After a reboot: rpool/os@autosnap_yyyy-MM-dd_hh:mm:ss_1w_reboot

Of course, obey recursive and process_children_only

jimsalterjrs commented 7 years ago

I think maybe you're just not thinking about this correctly? You can accomplish it more sanely by defining a custom template, without changing any sanoid code:

[template_apt]
    hourly = 0
    daily = 14
    monthly = 0
    yearly = 0
    autosnap = no
    autoprune = yes
    daily_warn = 0
    daily_crit = 0

This keeps sanoid from taking any snapshots automatically during sanoid --cron, but it will still purge them according to policy as they accumulate (with the caveat that you won't lose any until you have 14 of them and they get older than 14 days).

The only hitch in your giddyup at that point is relatively minor: taking a snapshot in the form of zfs snapshot pool/dataset@autosnap_2017-07-12_18:05:50_apt. So maybe you'd want a sanoid command-line argument to cause it to simply take a snapshot with the desired tag immediately, just to avoid you having to build the syntax in your apt.conf.d and wherever?

jimsalterjrs commented 7 years ago

Oh, wait a minute. I haven't quite thought that through. That would work, but it would create dailies, not "apt"-lies.

Hm. I still feel like we're missing a cleaner way of accomplishing what you really want. I think just making it easy to ask Sanoid to manually make a snapshot, and append a tag to it, would probably do 90% of what you want without needing to muck about with much else.

If you used dailies, they'd expire after your normal daily interval along with all the other dailies. Or hourlies, and expire after your normal hourly interval; etc. Tagging them somehow - whether in a custom dataset property or in a suffix - would be nice to help you find them more easily. But I don't know that you really need a separate policy there.
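One way to try the "tag in a custom dataset property" idea without touching Sanoid is a ZFS user property set at snapshot time. This sketch only prints the commands; the property name com.example:snaptag is a placeholder chosen for this example (ZFS user property names must contain a colon), the helper names are invented, and the _daily suffix in the sample name is an assumption about Sanoid's naming.

```shell
# Placeholder user property; pick a name under a domain you control.
TAG_PROP="com.example:snaptag"

# tagged_snapshot_cmd SNAPSHOT TAG: print a zfs command that creates the
# snapshot and stores the tag as a user property on it.
tagged_snapshot_cmd() {
    printf 'zfs snapshot -o %s=%s %s\n' "$TAG_PROP" "$2" "$1"
}

# list_tagged_cmd DATASET: print a zfs command that lists snapshots under the
# dataset together with their tag property.
list_tagged_cmd() {
    printf 'zfs list -H -t snapshot -o name,%s -r %s\n' "$TAG_PROP" "$1"
}

tagged_snapshot_cmd 'pool/dataset@autosnap_2017-07-12_18:05:50_daily' apt
list_tagged_cmd pool/dataset
```

Snapshots without the property show "-" in that column, so the tagged ones are easy to filter out of the listing.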

Thoughts?

redmop commented 7 years ago

I agree on "there must be a cleaner way," I just don't know what that is at the moment.

They don't need to be a separate policy, though they need to be in a predictable one. That and visible (dataset property is nice) would cover it.

The [template_apt] would be missing the normal snapshots; I want those too. My plan isn't rpool/os for updates and reboots; it's all datasets (so [rpool] and [dpool]). Needing to list them out is fine.

So let's drop the tagging part for a moment.

Given the following:

[rpool/os]
    use_template = apt
[template_apt]
    hourly = 0
    daily = 14
    monthly = 0
    yearly = 0
    autosnap = yes
    autoprune = yes

If I ran sanoid --take-snapshots && apt-get update && apt-get dist-upgrade && sanoid --take-snapshots daily AND sanoid --cron, what would I end up with? If I guess correctly, I would end up with my midnight snapshots, and a snapshot before and after each dist-upgrade. They would each last 14 days. If I put in an hourly=24, I would lose my reboot snapshots after 24 hours (cron schedule willing) because the midnight run of cron would make the 'daily'. That seem about right?

jimsalterjrs commented 7 years ago

I'm not entirely sure I'm following that last paragraph, but let's say you had hourly=24, and you had 30 total hourly snapshots, some of which were taken with normal sanoid --cron, some of which were taken manually in some other way - either directly with zfs snapshot and a matching name pattern, or with some vapor sanoid argument that forced a single snapshot to be taken as directed.

Four of those thirty hourlies took place in the same hour, because three of them were manual, and one of them was taken by sanoid --cron.

All four will stay live until they're more than 24 hours old. As each becomes more than 24 hours old, it gets purged (assuming there are 24 remaining, not counting the one getting purged). Sanoid neither knows nor cares which one is the "real" hourly snapshot. You might care, and you might look for the information as to which was taken as a result of your apt shenanigans by looking either in a zfs dataset property, or in a suffix to the name (which currently actually wouldn't work, but that could be patched in).
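The purge rule described in this comment (a snapshot goes only when it is older than the limit and enough others would remain) can be sketched in a few lines of shell. This illustrates the rule as stated here rather than Sanoid's actual code; prune_candidates is an invented name, and ages are given in whole hours for simplicity.

```shell
# prune_candidates KEEP AGE...: print the ages (in hours) that would be purged.
# A snapshot is purged only if it is older than KEEP hours AND more than KEEP
# snapshots of this type still exist at that point.
prune_candidates() {
    keep=$1; shift
    total=$#
    # Walk from oldest to newest so the oldest extras are considered first.
    for age in $(printf '%s\n' "$@" | sort -rn); do
        if [ "$age" -gt "$keep" ] && [ "$total" -gt "$keep" ]; then
            echo "$age"
            total=$((total - 1))
        fi
    done
}

# keep=3, six snapshots, three of them taken within the same hour (age 2).
prune_candidates 3 1 2 2 2 4 5   # prints 5 and 4, one per line
```

The three same-hour snapshots all survive because none is older than the limit yet, whichever of them was the "real" one.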

Now, the other interesting question is what all you actually want snapshotted when you apt. Personally, I don't see much point in snapshotting anything other than the actual root volume, and I see some good reasons not to. But maybe your opinion differs. (And maybe your opinion is compelling enough for me to write code for; maybe it's something you need to do some shuffling for yourself - all that's still up for grabs.)

Right now, properly formatting the time to make the snapshot named correctly is a little bit of a pain in the ass, so I'd say we'd probably want something along the lines of sanoid --manual-snapshot --type=hourly pool/dataset. Possibly add in tagging of some kind like sanoid --manual-snapshot --type=hourly --tag=apt pool/dataset.

Are we getting close to what you're looking for here?

saurabhnanda commented 5 years ago

I think I have a similar request to what is being discussed in this issue. I want to take a daily snapshot immediately after the daily DB backup has actually completed. Some days it might take 30 mins, other days it might take 45 minutes. The foolproof way is to call sanoid with some command-line switch which forces it to take a daily snapshot immediately. I don't believe we have such a feature currently, right?
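Until such a switch exists, one possible workaround is to chain the snapshot onto the backup job itself, so it fires only once the backup has finished successfully. The sketch below is a dry run by default; snapshot_after is an invented name, the backup command is whatever you pass in, and the _daily suffix is an assumption about how Sanoid recognises daily snapshots.

```shell
# Dry run by default: export ZFS=zfs to create snapshots for real.
ZFS=${ZFS:-echo zfs}

# snapshot_after DATASET COMMAND...: run COMMAND, and only if it succeeds take
# a snapshot of DATASET named in the autosnap_ daily pattern.
snapshot_after() {
    dataset=$1; shift
    "$@" || { echo "backup failed, no snapshot taken" >&2; return 1; }
    $ZFS snapshot "${dataset}@autosnap_$(date +%Y-%m-%d_%H:%M:%S)_daily"
}

# Example with "true" standing in for the real backup command.
snapshot_after pool/db true
```

Because the snapshot is tied to the backup's exit status, it does not matter whether the backup takes 30 or 45 minutes.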

phreaker0 commented 5 years ago

@saurabhnanda not yet, PR welcome :-D

gnordli commented 4 years ago

Yes @jimsalterjrs what you have above is what I am thinking.

Right now, properly formatting the time to make the snapshot named correctly is a little bit of a pain in the ass, so I'd say we'd probably want something along the lines of sanoid --manual-snapshot --type=hourly pool/dataset.

I would actually be happy with just the

sanoid --manual-snapshot --type=hourly

I don't care if there are extra snapshots, they age out after 24 hours anyhow.

thanks, Geoff

mscdex commented 5 months ago

Just want to add my +1 for something like this since I need to make sure the snapshotting occurs at exactly the same time every day for daily snapshots. As far as I understand it, taking snapshots via --cron is subject to "drift" since the current date/time is used for each new snapshot and future snapshots will only be taken based on elapsed time since the previous snapshot.