borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

Exclude items based on xattr #4972

Open m3nu opened 4 years ago

m3nu commented 4 years ago

Borg already processes extended attributes (borg.xattr) and has the --exclude-nodump option to exclude items based on file flags (set via chflags nodump $FILE).

On macOS (com.apple.metadata:com_apple_backup_excludeItem) and potentially Linux desktops (user.xdg.robots.backup=false), extended attributes are used to exclude files from backups. Example from macOS:

$ xattr ~/Downloads
com.apple.macl
com.apple.metadata:com_apple_backup_excludeItem

So my suggestion would be to add a flag that accepts an OS-specific xattr name as an argument and excludes any item carrying that xattr from the backup, e.g. --exclude-xattr "com.apple.metadata:com_apple_backup_excludeItem"
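A minimal sketch of the check such a flag would need, assuming the item's xattrs are already available as a mapping of name to raw value (the shape of the `xattr.get_all` output shown further down in this thread). The function name `has_exclude_xattr` is hypothetical and not part of borg:

```python
def has_exclude_xattr(xattrs, exclude_names):
    """Return True if the item carries any of the given xattr names.

    xattrs: mapping of xattr name -> raw bytes value (assumed shape).
    exclude_names: names passed via the proposed --exclude-xattr flag.
    Only the presence of the name is checked; the value is ignored.
    """
    return any(name in xattrs for name in exclude_names)


# Example with the xattr names listed for ~/Downloads above.
downloads = {
    "com.apple.macl": b"",
    "com.apple.metadata:com_apple_backup_excludeItem": b"",
}
excluded = has_exclude_xattr(
    downloads, ["com.apple.metadata:com_apple_backup_excludeItem"]
)
```

Since borg reads the xattrs of each item anyway, a presence check like this should add little cost per item.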

What do you guys think of such a feature? I imagine the implementation effort wouldn't be terribly high, and I'm happy to prepare a PR if pointed to the right places.

Related:

ThomasWaldmann commented 4 years ago

From stackexchange:

sudo mdfind "com_apple_backup_excludeItem = 'com.apple.backupd'"

So it seems to be not just an xattr key; there is also a value.

What does com.apple.backupd mean? Would borg check for that key and value?

m3nu commented 4 years ago

From what I saw, the value is always the same. But to be future-proof, it could compare both. Depends on the performance difference. Currently Borg reads this from an excluded file:

In [0]: xattr.get_all(path, follow_symlinks=False)
Out[3]: ...
'com.apple.metadata:com_apple_backup_excludeItem': b'bplist00_\x10\x11com.apple.backupd\x08\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x1c'

enkore commented 3 years ago

>>> import plistlib
>>> plistlib.loads(b'bplist00_\x10\x11com.apple.backupd\x08\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x1c')
'com.apple.backupd'

Is there a spec for this? It seems kinda weird to encode just a string using plists to store it as just a string... 😅
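Comparing both key and value, as suggested earlier in the thread, could be sketched like this. It assumes the xattrs arrive as a name-to-bytes mapping and that the value is a plist that `plistlib.loads` can parse. The names `EXCLUDE_KEY`, `EXPECTED_VALUE` and `is_backup_excluded` are made up for illustration:

```python
import plistlib

EXCLUDE_KEY = "com.apple.metadata:com_apple_backup_excludeItem"
EXPECTED_VALUE = "com.apple.backupd"


def is_backup_excluded(xattrs, key=EXCLUDE_KEY, expected=EXPECTED_VALUE):
    """Return True only if the key is present AND its plist value matches."""
    raw = xattrs.get(key)
    if raw is None:
        return False
    try:
        # plistlib detects XML vs. binary plists by itself.
        value = plistlib.loads(raw)
    except Exception:
        # Not a plist that plistlib understands: treat as "no match".
        return False
    return value == expected
```

The extra cost over a key-only check is one plist decode per tagged item, which only happens for items that carry the key at all.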

ThomasWaldmann commented 3 years ago

If an external tool can feed paths into borg based on their xattrs, this will be doable with borg 1.2.

nicolas17 commented 3 years ago

> If an external tool can feed paths into borg based on their xattrs, this will be doable with borg 1.2.

That would require scanning the filesystem and reading xattrs twice, once to generate the path list, and again by borg when making the backup (surely it reads the xattrs to back them up, right?).

ThomasWaldmann commented 3 years ago

Yeah, it would read (some) xattrs twice then.
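For illustration, the external-tool approach discussed above might look roughly like the following. This is a sketch under assumptions: the comment does not name the borg option, but borg 1.2's `borg create --paths-from-stdin` is presumably what is meant. `os.listxattr` is only available on Linux, so the xattr lister is a parameter here, and `iter_backup_paths` is a hypothetical name:

```python
import os


def _linux_list_xattrs(path):
    # os.listxattr exists on Linux only; pass another lister elsewhere.
    return os.listxattr(path, follow_symlinks=False)


def iter_backup_paths(root, exclude_xattr, list_xattrs=_linux_list_xattrs):
    """Yield file paths under root, skipping items tagged with exclude_xattr.

    Tagged directories are pruned, so nothing below them is visited.
    """

    def tagged(path):
        try:
            return exclude_xattr in list_xattrs(path)
        except OSError:
            # Unreadable xattrs or unsupported filesystem: do not exclude.
            return False

    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending further.
        dirnames[:] = [
            d for d in dirnames if not tagged(os.path.join(dirpath, d))
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not tagged(path):
                yield path
```

Printing the yielded paths one per line and piping them into borg would give the behaviour described, at the cost noted above: the xattrs of every backed-up item are read once by this scan and again by borg.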

Artoria2e5 commented 3 years ago

> Is there a spec for this?

As is the case with anything Apple, not officially. Wikipedia refers to some third-party sources.

> It seems kinda weird to encode just a string using plists to store it as just a string... 😅

Since you are using Python's plistlib, I should probably mention that it neglects the plain old OpenStep plist format, which allows for encoding the string simply as either com.apple.backupd or "com.apple.backupd". Apple's own programs don't generate this format, but they do accept it. Lazy people who rely on one-liner scripts may end up creating this format.

(I do have a plistlib-like library for this, but it's not in any good shape. Just match the two exact strings.)
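A sketch of the "just match the two exact strings" advice: try `plistlib` first and, if the value is not an XML or binary plist, fall back to comparing the raw bytes against the two OpenStep spellings. Stripping surrounding whitespace is my own assumption (a one-liner script might leave a trailing newline), and `value_marks_exclusion` is a hypothetical name:

```python
import plistlib

MARKER = "com.apple.backupd"
# The two OpenStep-style spellings mentioned above: bare and quoted.
OPENSTEP_FORMS = (b"com.apple.backupd", b'"com.apple.backupd"')


def value_marks_exclusion(raw):
    """Return True if the raw xattr value encodes the backupd marker."""
    try:
        # Covers the XML and binary plist formats plistlib supports.
        return plistlib.loads(raw) == MARKER
    except Exception:
        # plistlib does not parse OpenStep plists, so compare raw bytes.
        return raw.strip() in OPENSTEP_FORMS
```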