Open digitalsignalperson opened 2 years ago
I agree, I'm creating a separate zfs-rsync issue for this.
On what snapshots should it operate? Just the latest common? And if you run it again and there are newer snapshots, should it send increments to the other side as well?
Please have a look at #114 and comment over there.
There are definitely pros and cons to both approaches, and which is better depends heavily on the context. As for insight into the original design choice, I'm not sure, but for me the ability to define what to back up on the "source system" rather than on the "backupper" was the primary reason why I switched my primary backup solution to zfs_autobackup.
In my primary infrastructure design I have a bunch of servers which all contain both "critical" and "non-critical" data (critical being things like databases, non-critical being things like configurations, or data that can be trivially recreated on demand), and these all depend on the services running on each server.
What the zfs_autobackup design allows me to do is simplify my infrastructure setup and configuration regarding backups: the setup scripts (mainly ansible) for the services create the zfs datasets needed by the service, set their properties (such as tagging the critical datasets for backup) and obviously set up the services themselves.
This way, when a new service is created in my infrastructure setup or an existing one is added to a new server, everything that needs to be done can be done on that server alone; the backuppers that back up all the servers don't need to know which datasets are important to back up. The only "knowledge" my backuppers need is "which servers to back up and where, and what the zpool names are", so "config" changes need to be made only on the server that holds the data related to the change, be it adding a completely new service or setting up a new instance of a service.
@Scrin good point, I should state that more clearly in the documentation. zfs-autobackup makes it possible for other tools/admins to select datasets on the source system, without needing to access the backup server at all.
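To make the property-based selection discussed above concrete, here is a minimal Python sketch. It is not zfs-autobackup's actual code: `select_by_property` and its input format are hypothetical, and the value semantics ("true" selects a dataset and anything inheriting from it, "child" selects only datasets that inherit the value, "false" excludes) are an assumption based on the manual's description of the dataset property.

```python
from typing import Dict, List, Tuple


def select_by_property(props: Dict[str, Tuple[str, str]]) -> List[str]:
    """Return the datasets selected for backup.

    `props` maps a dataset name to (value, source) for the
    autobackup:<name> property, where source is "local" or "inherited".
    This input format is hypothetical, loosely modelled on `zfs get` output.
    """
    selected = []
    for dataset, (value, source) in sorted(props.items()):
        if value == "true":
            # Selected whether set locally or inherited from a parent.
            selected.append(dataset)
        elif value == "child" and source == "inherited":
            # "child" (assumed semantics): only datasets that inherit it.
            selected.append(dataset)
        # Anything else (e.g. "false" or unset) is not selected.
    return selected


# The source system tags its own datasets; the backup server only reads them.
props = {
    "rpool/db": ("true", "local"),
    "rpool/db/wal": ("true", "inherited"),
    "rpool/vm": ("child", "local"),
    "rpool/vm/disk1": ("child", "inherited"),
    "rpool/tmp": ("false", "local"),
}
print(select_by_property(props))
# ['rpool/db', 'rpool/db/wal', 'rpool/vm/disk1']
```

The point of the sketch is that the selection input lives entirely on the source system, so the backup server needs no per-dataset configuration.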
Thanks for sharing, I can see how the property is useful depending on the scenario. To avoid a breaking change, that could stay the default behavior, with a new optional argument to supply a list of sources instead (or a txt or yaml file with a list of sources).
That's true, I could add --select=..., --select-child=... and --select-single=... (non-recursive) perhaps.
The rest of the syntax stays the same and you won't need to set properties (but you still can, and you could use both if you want).
Any thoughts on this for a PR? https://github.com/digitalsignalperson/zfs_autobackup/compare/d0b58b98e7971493bac30a85fa4ec3e8e0192878...digitalsignalperson:zfs_autobackup:v3.1.2-hacks
example usage:
zfs-autobackup -v \
--no-holds \
--no-thinning \
--no-snapshot \
--other-snapshots \
--min-change 1 \
--strip-path=1 \
--clear-mountpoint \
backupname-does-nothing-here \
rpool/test-destination \
rpool/recursive-source-dataset/\* \
rpool/some-source-dataset \
rpool/some-other-source-dataset
I went with ignoring the selection of datasets by the BACKUP-NAME property if source paths are specified, but that could still be an option. The BACKUP-NAME param is still used for snapshots and thinning in general, except in this example with --no-snapshot and --no-thinning.
To use as a snapshot tool without specifying a TARGET-PATH, it's a little weird with the order of args. I allowed for "/None" to be used as a target path to solve this, but maybe there's a more sensible way to order the args or add other options.
Hmm, I'm not sure if I already responded to this somewhere?
I think this solution is too hackish, I would rather see --select-... options for this.
zfs-autobackup -v \
--no-holds \
--no-thinning \
--no-snapshot \
--other-snapshots \
--min-change 1 \
--strip-path=1 \
--clear-mountpoint \
--select-recursive=rpool/recursive-source-dataset \
--select=rpool/some-source-dataset \
--select=rpool/some-other-source-dataset \
backupname-does-nothing-here \
rpool/test-destination
Have select behave consistently with https://github.com/psy0rz/zfs_autobackup/wiki/Manual#dataset-property
e.g. something like --select, --select-recursive, --select-exclude, --select-child
And perhaps ignore the autobackup property when --select is used or something.
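A rough sketch of how such options could be parsed, using Python's argparse. None of this exists in zfs-autobackup: the flag names are the ones floated in this thread, and `build_parser`, `resolve_selection` and the prefix-matching rule for recursive selection are hypothetical choices made for illustration only.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser covering only the proposed selection flags.
    parser = argparse.ArgumentParser(prog="zfs-autobackup-sketch")
    parser.add_argument("--select", action="append", default=[],
                        metavar="DATASET", help="select a single dataset")
    parser.add_argument("--select-recursive", action="append", default=[],
                        metavar="DATASET", help="select a dataset and its children")
    parser.add_argument("--select-exclude", action="append", default=[],
                        metavar="DATASET", help="exclude a dataset and its children")
    parser.add_argument("backup_name")
    parser.add_argument("target_path", nargs="?")
    return parser


def is_under(name: str, root: str) -> bool:
    """True if `name` is `root` itself or one of its descendants."""
    return name == root or name.startswith(root + "/")


def resolve_selection(args: argparse.Namespace, all_datasets: List[str]) -> List[str]:
    """Expand the selection flags against the datasets known on the source."""
    chosen = []
    for name in all_datasets:
        explicit = name in args.select
        recursive = any(is_under(name, root) for root in args.select_recursive)
        excluded = any(is_under(name, root) for root in args.select_exclude)
        if (explicit or recursive) and not excluded:
            chosen.append(name)
    return chosen


args = build_parser().parse_args([
    "--select-recursive=rpool/recursive-source-dataset",
    "--select=rpool/some-source-dataset",
    "--select-exclude=rpool/recursive-source-dataset/cache",
    "backupname",
    "rpool/test-destination",
])
datasets = [
    "rpool/recursive-source-dataset",
    "rpool/recursive-source-dataset/a",
    "rpool/recursive-source-dataset/cache",
    "rpool/some-source-dataset",
    "rpool/some-source-dataset/child",
    "rpool/unrelated",
]
print(resolve_selection(args, datasets))
```

Whether the autobackup property is ignored once any --select option is given would be decided by the caller; this sketch only covers the explicit selection itself.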
Edwin
I'm currently wondering about the design requiring setting of an autobackup:$name property to select the source datasets. What are the advantages compared to just providing e.g. a list of source datasets on the command line? Any insight into the design choice would be interesting to hear.
Cons of using property to manage the config:
autobackup:$name property); possible relation to exclude_received?
The code seems like it would be clean to change without any issues (I don't see any other use of 'property_name'), changing
source_datasets = source_node.selected_datasets(property_name=property_name, ...
to
source_datasets = a list as parsed from command-line arguments
Possibly related to #41 (rsync for zfs?? zfsync src_pool/data dst_pool/data)
Curious to hear your thoughts, cheers!
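The change described above could look roughly like the following Python sketch. It is an illustration under assumptions, not the project's code: `pick_source_datasets` is a hypothetical helper, and the property-based lookup is passed in as a plain callable standing in for the existing `selected_datasets(...)` call.

```python
from typing import Callable, List, Sequence


def pick_source_datasets(
    explicit_sources: Sequence[str],
    property_lookup: Callable[[], List[str]],
) -> List[str]:
    """Prefer sources given on the command line; otherwise fall back to the
    existing property-based selection (kept as the default behavior)."""
    if explicit_sources:
        # Drop duplicates while keeping the order the user gave.
        return list(dict.fromkeys(explicit_sources))
    return property_lookup()


def lookup_by_property() -> List[str]:
    # Stand-in for the real property-based selection on the source node.
    return ["rpool/tagged-dataset"]


print(pick_source_datasets(["src_pool/data", "src_pool/data"], lookup_by_property))
print(pick_source_datasets([], lookup_by_property))
```

Keeping the property lookup as the fallback means existing setups that rely on autobackup:$name would behave as before.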
Edit: I wasn't thinking about snapshots and holds, which use self.args.backup_name; that could still be an argument for those naming purposes. Or in my case I'd use --no-snapshot --no-holds