zfsonlinux / zfs-auto-snapshot

ZFS Automatic Snapshot Service for Linux
GNU General Public License v2.0

snapshots cleaned up globally #67

Open bartmeuris opened 7 years ago

bartmeuris commented 7 years ago

When specific datasets are specified instead of using //, the 'old' snapshots are still cleaned up across the entire system, not just in the specified datasets.

This is problematic since I have 2 systems both running zfs-auto-snapshot which do a periodic zfs send/receive to each other, and zfs-auto-snapshot destroys parts of the synced content.

Example: the following command is executed in cron:

zfs-auto-snapshot --quiet --syslog --default-exclude --label=frequent --keep=4 -r tank/share

I do a zfs recv to tank/backup/... - which means zfs-auto-snap_hourly-... snapshots exist there too. These are however also cleaned up - screwing up the send/receive.

Currently I work around this by changing the prefix, but I still think this is a serious issue which is destroying data unexpectedly.

mailinglists35 commented 7 years ago

@kpande do you mean using bookmarks in the synchronization code? If so, do you have working synchronization code that uses bookmarks, and could you post it? That would be very useful! :)

mailinglists35 commented 7 years ago

@kpande I just noticed your comment from back in 2015. Are you still using/maintaining that zfs-auto-replicate script, and if so, does it use bookmarks? Could you share it?

bartmeuris commented 7 years ago

@kpande using bookmarks doesn't solve the issue, or at least I don't see how it would. Besides, the zfs-auto-snapshot script doesn't use or support bookmarks.

This is a case where the script causes data loss: it removes snapshots it shouldn't touch at all, which imho is pretty serious.

The case is pretty simple: it's the equivalent of running rm foo* in a folder and having it remove all files matching foo* across the entire filesystem, which is not exactly something you want.

The snapshots follow the same naming scheme by default everywhere, and the cleanup procedure only takes into account the name of the snapshots, not the dataset/filesystem.

So if you have 2 machines syncing snapshots to each other, the do_snapshots function sweeps over the $SNAPSHOTS_OLD list, which simply contains all snapshots on the system, including those in the location I zfs recv into from the other machine. Since it only takes into account the name of the snapshot, not the dataset, it also deletes the snapshots I just synced from the other machine that serve as backup.

It should not touch snapshots in datasets that are explicitly not being handled by zfs-auto-snapshot on that machine just because they happen to have a name matching a pattern it wants to delete.
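
A minimal sketch of the kind of scoping described above, not the actual zfs-auto-snapshot code: before anything is destroyed, the list of old snapshots could be narrowed to the dataset given on the command line and its descendants. The helper name filter_snapshots_for_target and the sample snapshot names are hypothetical and only serve as illustration; the input is assumed to be one dataset@snapshot name per line, as zfs list -H -t snapshot -o name prints it.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of zfs-auto-snapshot: keep only snapshots
# whose dataset is TARGET itself or one of its descendants.
# Reads "dataset@snapshot" names on stdin, one per line.
filter_snapshots_for_target() {
    target=$1
    while IFS= read -r snap; do
        dataset=${snap%%@*}    # strip "@snapshot" to get the dataset name
        case "$dataset" in
            # Exact match, or a child dataset below TARGET. A sibling such
            # as "tank/shared" does not match because of the explicit "/".
            "$target"|"$target"/*) printf '%s\n' "$snap" ;;
        esac
    done
}

# Made-up sample list standing in for the system-wide snapshot list.
all_snapshots='tank/share@zfs-auto-snap_frequent-2017-01-01-0000
tank/share/docs@zfs-auto-snap_frequent-2017-01-01-0000
tank/backup/share@zfs-auto-snap_frequent-2017-01-01-0000
tank/shared@zfs-auto-snap_frequent-2017-01-01-0000'

# Only snapshots under tank/share remain as cleanup candidates.
scoped=$(printf '%s\n' "$all_snapshots" | filter_snapshots_for_target tank/share)
printf '%s\n' "$scoped"
```

With a filter like this in front of the cleanup sweep, the received snapshots under tank/backup/... would never become candidates for destruction, even though their names match the same pattern.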