digint / btrbk

Tool for creating snapshots and remote backups of btrfs subvolumes
https://digint.ch/btrbk/
GNU General Public License v3.0

Add exclude patterns #258

Open marstj opened 6 years ago

marstj commented 6 years ago

I would love to see support for exclude patterns so that certain files and directories can be excluded from snapshots.

Currently I use nested subvolumes to exclude certain subdirectories of my home directory where only build artefacts etc. end up. Since only the top-level subvolume gets snapshotted, those subdirectories/subvolumes are effectively left out of the snapshots.
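
For context, the workaround looks roughly like this (the path is only an illustration):

# turn an existing build-artefact directory into a nested subvolume so that
# snapshots of the enclosing subvolume skip it
rm -rf ~/projects/foo/target
btrfs subvolume create ~/projects/foo/target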

However that has some drawbacks:

I think exclude patterns could work by taking writable snapshots, deleting all files and dirs in them that match the patterns, and then changing them to read-only.

As a simpler fallback to this feature, I'd like to see a btrbk config option that makes it create writable snapshots. Then I could just run my own script to regularly delete the stuff I don't want from them.

Thanks!

digint commented 6 years ago

I feel for you with this, I also have some subvolumes in my home (and elsewhere) for exactly this purpose. While this comes with some other benefits (e.g. setting "nocow" on cache directories), it's also a pain in the ass.

I think exclude patterns could work by taking writable snapshots, deleting all files and dirs in them that match the patterns, and then changing them to read-only.

This should probably work, something like:

btrfs subvolume snapshot <src> <snapshot>
exec cleanup_script <snapshot>
btrfs property set -ts <snapshot> ro true

Further considerations:

  1. Using btrfs property set (from my point of view a low-level command to be used with care) in combination with incremental send/receive needs some testing, and might be "asking for trouble". Another option would be to btrfs subvolume snapshot -r <snapshot> <snapshot-ro>; btrfs subvolume delete <snapshot>, but this would break the parent/child relationship (parent_uuid) and effectively disable some nice btrbk features.

  2. I definitely don't want to implement complicated include/exclude patterns. A plain list of excluded files/directories should be simple enough and already help a lot, and adding the possibility to execute a user-defined cleanup script (e.g. find places with CACHEDIR.TAG and delete them) should be doable.

  3. Special care must be taken when deleting stuff, as this has to be done as root (rm -rf as root is never a good idea, but it is unavoidable in this case; consider a user configuring "exclude_dirs ../../../../home"). Not sure if this can be mitigated with some kind of sandboxing.

@marstj do you have some experience with this (point 1)?

Note that you can already do this with current btrbk:

btrbk snapshot --preserve --format=raw | while read -r line; do
    [[ "$line" == "TRANSACTION LOG" ]] && continue
    # ^^^ bug in btrbk < 0.27.1, should not be printed for raw format
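    # each raw line is a series of key=value pairs; eval imports them as shell variables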
    eval $line
    [[ "$format" == "transaction" ]] || continue
    [[ "$type"   == "snapshot"    ]] || continue
    [[ "$status" == "success"     ]] || continue
    btrfs property set -ts "$target_url" ro false
    cleanup_script.sh "$target_url"
    btrfs property set -ts "$target_url" ro true
done
btrbk resume
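
For completeness, a minimal sketch of what cleanup_script.sh could look like, assuming a CACHEDIR.TAG based cleanup as mentioned above (the script name and behaviour are just an example, not something btrbk ships):

#!/bin/bash
# cleanup_script.sh <snapshot>: delete directories tagged with CACHEDIR.TAG
# inside a still-writable snapshot
set -euo pipefail
snapshot="$1"

# collect all tag files first, then delete their parent directories,
# staying on the snapshot's filesystem (-xdev)
mapfile -d '' tags < <(find "$snapshot" -xdev -type f -name CACHEDIR.TAG -print0)
for tag in "${tags[@]}"; do
    rm -rf -- "$(dirname "$tag")"
done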

marstj commented 6 years ago

Awesome, thanks for the tip! Looks like it'd solve my immediate problems.

  1. Can't say that much about it, sorry. Right now I'm using a hacky script that just unsets the ro flags on the snapshots and deletes the stuff, and I run it manually every so often. I also only use btrbk as a local time machine - remote backups are already provided for me.

  2. Not sure what you mean by "complicated" patterns, but for such a feature to be useful to me it should handle ordinary shell glob patterns, preferably including **. (Would be just a couple of lines in Python, at least.)

  3. That's indeed a good reason for not solving it through a shell script. With a decent file API it's easy to match patterns strictly against properly normalised paths. Concurrent access tricks are perhaps easiest to avoid by temporarily mounting the snapshot in a location that's inaccessible to all non-root users during the delete.

Thank you!

marstj commented 6 years ago

I implemented the approach above, and my experience is that deleting large (1+ GiB) directories in the snapshots completely kills btrfs performance. The system doesn't seem to do much but most filesystem operations hang during the delete. I've had to hard reset the computer several times when I gave up on waiting.

So unfortunately with btrfs in its current state (I'm using kernel 4.17) the solution discussed here seems to be less useful. It may still be a complement to converting subdirectories to subvolumes, for files and small directory trees.

marstj commented 6 years ago

Turns out that the extensive hangs occur when quota is enabled. After disabling it at the filesystem level, rm works fine on the snapshots. I can now delete 100+ GiB from the snapshots this way, and although rm takes a minute or two, it doesn't noticeably lock up other activity.
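
In case it helps others, the commands I mean are along these lines (the /mnt mountpoint is just an example):

# check whether quotas (qgroups) are enabled; this fails with an error if they are off
sudo btrfs qgroup show /mnt
# disable quota support for the whole filesystem
sudo btrfs quota disable /mnt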

Massimo-B commented 6 years ago

Excluding parts from snapshots is what btrfs subvolumes are made for. Even though some guides say that nested subvolumes are not very useful, I use exactly these nested subvolumes to exclude parts from snapshots without needing lots of complex specific mounts in the fstab. With nested subvolumes you only need to mount the surrounding subvolume; the nested ones are accessible through it as well, but excluded from snapshots. This only works for directory trees, not specific files, but I doubt excluding single files is very useful anyway. Even nocow I usually set for whole trees like VirtualMachines.

On Gentoo I exclude parts of the / root fs that are just mirrored from the internet, caches and temporary build locations:

# btrfs subvolume list / |grep -v snapshots
ID 607 gen 407721 top level 655 path usr/portage
ID 608 gen 407735 top level 655 path var/cache
ID 609 gen 407724 top level 655 path var/tmp
ID 611 gen 407724 top level 655 path var/lib/layman
ID 612 gen 381096 top level 655 path usr/src
ID 653 gen 407744 top level 5 path home
ID 654 gen 394242 top level 5 path data
ID 655 gen 407744 top level 5 path root
ID 667 gen 392508 top level 654 path data/VirtualMachines

The first 5 subvolumes are nested in root, the last one is nested in data and set to nocow for the VMs. For that structure, fstab only mounts subvol=root, data and home.

tbertels commented 7 months ago

I feel for you with this, I also have some subvolumes in my home (and elsewhere) for exactly this purpose. While this comes with some other benefits (e.g. setting "nocow" on cache directories), it's also a pain in the ass.

Off topic, but I think still useful to mention:

man 5 btrfs

BTRFS SPECIFIC MOUNT OPTIONS

This section describes mount options specific to BTRFS. For the generic mount options please refer to mount(8) manual page. The options are sorted alphabetically (discarding the no prefix).

NOTE: Most mount options apply to the whole filesystem and only options in the first mounted subvolume will take effect. This is due to lack of implementation and may change in the future. This means that (for example) you can't set per-subvolume nodatacow, nodatasum, or compress using mount options. This should eventually be fixed, but it has proved to be difficult to implement correctly within the Linux VFS framework.

The right way to do it is chattr -R +C

C A file with the 'C' attribute set will not be subject to copy-on-write updates. This flag is only supported on file systems which perform copy-on-write. (Note: For btrfs, the 'C' flag should be set on new or empty files. If it is set on a file which already has data blocks, it is undefined when the blocks assigned to the file will be fully stable. If the 'C' flag is set on a directory, it will have no effect on the directory, but new files created in that directory will have the No_COW attribute set. If the 'C' flag is set, then the 'c' flag cannot be set.)
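
For example, on a fresh cache directory (the path is only an illustration), the flag has to be set while the directory is still empty so that files created in it pick up No_COW:

mkdir -p ~/.cache/builds      # hypothetical cache directory
chattr +C ~/.cache/builds     # new files created inside inherit No_COW
lsattr -d ~/.cache/builds     # verify that the 'C' attribute is listed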

Silverbullet069 commented 1 month ago

Excluding parts from snapshots is what btrfs subvolumes are made for. Even though some guides say that nested subvolumes are not very useful, I use exactly these nested subvolumes to exclude parts from snapshots without needing lots of complex specific mounts in the fstab. With nested subvolumes you only need to mount the surrounding subvolume; the nested ones are accessible through it as well, but excluded from snapshots. This only works for directory trees, not specific files, but I doubt excluding single files is very useful anyway. Even nocow I usually set for whole trees like VirtualMachines.

@Massimo-B Hi, sorry to bring this up again but I'm losing my mind now. I tried creating nested subvolumes to exclude some directories but it's not working.

This is my /etc/fstab:

#
# /etc/fstab
# Created by anaconda on Sat Sep 14 11:38:38 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
UUID=09f4af86-bd2a-4050-863c-8de6a5e16fe2 /                       btrfs   subvol=@,ssd,noatime,space_cache=v2,compress=zstd:1,commit=120,discard=async,x-systemd.device-timeout=0 0 0
UUID=09f4af86-bd2a-4050-863c-8de6a5e16fe2 /home                   btrfs   subvol=@home,ssd,noatime,space_cache=v2,compress=zstd:1,commit=120,discard=async,x-systemd.device-timeout=0 0 0
UUID=09f4af86-bd2a-4050-863c-8de6a5e16fe2 /opt/servarr            btrfs   subvol=@servarr,ssd,noatime,space_cache=v2,compress=zstd:1,commit=120,discard=async,x-systemd.device-timeout=0 0 0
UUID=09f4af86-bd2a-4050-863c-8de6a5e16fe2 /var/lib/docker         btrfs   subvol=@docker,ssd,noatime,space_cache=v2,compress=zstd:1,commit=120,discard=async,x-systemd.device-timeout=0 0 0
UUID=09f4af86-bd2a-4050-863c-8de6a5e16fe2 /home/silverbullet069/.cache    btrfs   subvol=@cache,ssd,noatime,space_cache=v2,compress=zstd:1,commit=120,discard=async,x-systemd.device-timeout=0 0 0
UUID=374bcc0e-b0f4-4e84-9968-fdb2c0ae71d9 /boot                   ext4    defaults 1 2
UUID=D87C-787C                            /boot/efi               vfat    umask=0077,shortname=winnt 0 2
UUID=09f4af86-bd2a-4050-863c-8de6a5e16fe2 /mnt/btrfs              btrfs   subvolid=5,ssd,noatime,space_cache=v2,compress=zstd:1,commit=120,discard=async,x-systemd.device-timeout=0 0 0

As you can see, I have mounted /mnt/btrfs as the "true root"; inside are 5 subvolumes: @, @home, @servarr, @docker and @cache. I want to exclude @servarr, @docker and @cache from @ and @home, but when backing up, @ still includes @docker and @servarr.

I have also tried plain nested subvolumes, without tweaking /etc/fstab or creating @-ish subvolumes inside /mnt/btrfs, e.g. sudo btrfs subvolume create /var/lib/docker or sudo btrfs subvolume create /opt/servarr. But that's not working either: @ still includes them when backed up.

Massimo-B commented 1 month ago

I don't see nested subvolumes in your example. Seen from the btrfs top level, nested subvolumes really have to be subdirectories of other subvolumes, but @servarr is just another subvolume beside @ or @home. Besides that, I don't repeat all the mount options for every single mount, as only the first mount applies the options.
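
To illustrate the difference (the paths assume your /mnt/btrfs top-level mount and are only an example):

# truly nested: a subvolume inside the @ subvolume's own tree,
# so snapshots of @ skip it automatically
sudo mkdir -p /mnt/btrfs/@/opt
sudo btrfs subvolume create /mnt/btrfs/@/opt/servarr

# sibling: a subvolume next to @ at the top level (what your fstab mounts);
# it is only related to @ through the mountpoint
sudo btrfs subvolume create /mnt/btrfs/@servarr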

Here is my fstab:

# cat /etc/fstab |grep btrfs
LABEL=gentoo    /               btrfs   compress-force=zstd:3,nodiscard,subvol=volumes/root 0 1
LABEL=gentoo    /mnt/btrfs-top-lvl      btrfs   subvol=/                    0 1
LABEL=gentoo    /home               btrfs   subvol=volumes/home             0 1
LABEL=gentoo    /mnt/data           btrfs   subvol=volumes/data             0 1
LABEL=gentoo    /mnt/data/workspace.sync    btrfs   subvol=volumes/workspace.sync           0 1
LABEL=gentoo    /mnt/data/VirtualMachines   btrfs   subvol=volumes/vm,nodatacow         0 1
LABEL=gentoo    /tmp                btrfs   subvol=volumes.nosnap/tmp           0 1
LABEL=gentoo    /usr/portage            btrfs   subvol=volumes.nosnap/usr.portage       0 1
LABEL=gentoo    /usr/src            btrfs   subvol=volumes.nosnap/usr.src           0 1
LABEL=gentoo    /var/cache          btrfs   subvol=volumes.nosnap/var.cache         0 1
LABEL=gentoo    /var/db/repos           btrfs   subvol=volumes.nosnap/var.db.repos      0 1
LABEL=gentoo    /var/lib/layman         btrfs   subvol=volumes.nosnap/var.lib.layman        0 1
LABEL=gentoo    /var/tmp            btrfs   subvol=volumes.nosnap/var.tmp           0 1

These days I don't use nested subvolumes anymore, as they make things more complicated. I use plain subvolumes side by side and mount them into the tree. I just organize them into volumes/ and volumes.nosnap/ directories for better understanding, which is actually much the same as your approach. The question is: how do you create your snapshots? When I use btrbk I always use the top level as source, like:

volume /mnt/btrfs-top-lvl/
    subvolume volumes/root
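
A slightly fuller sketch of how the volumes/ vs. volumes.nosnap/ layout above maps to a btrbk config (the snapshot_dir name is just an assumption):

volume /mnt/btrfs-top-lvl
    snapshot_dir snapshots
    subvolume volumes/root
    subvolume volumes/home
    subvolume volumes/data
    # nothing under volumes.nosnap/ is listed here, so it is never snapshotted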

Looking into such a snapshot, /var/tmp is not included and is only an empty mountpoint. If there is data inside, it may be caused by a tricky situation: if you copied data to /var/tmp at a time when the other subvolume wasn't mounted there, that data belongs to the / mount. Even trickier, once the mount is in place again, the underlying data is not visible anymore. This situation hit and confused me in the past, when different data suddenly appeared after some (network) mount was lost for whatever reason...