oetiker / znapzend

zfs backup with remote capabilities and mbuffer integration.
www.znapzend.org
GNU General Public License v3.0

Feature request: remote source #158

Closed: dandanio closed this issue 3 years ago

dandanio commented 9 years ago

For various reasons, I'd like to be able to do a "pull"-style backup, like so (dev environment):

root@kotick:/opt/znapzend-0.14.0/bin# znapzendzetup create --recursive --mbuffer=/usr/bin/mbuffer \
    --mbuffersize=1G --tsformat='%Y-%m-%d-%H%M%S' \
    SRC '7d=>1h,30d=>4h,90d=>1d' root@mowgli:zfs1/test \
    DST:a '7d=>1h,30d=>4h,90d=>1d,1y=>1w,10y=>1month' zfs1/backup/test
* backup plan: root@mowgli:zfs1/test *
dst_a         = zfs1/backup/test
dst_a_plan    = 7days=>1hour,30days=>4hours,90days=>1day,1year=>1week,10years=>1month
enabled       = on
mbuffer       = /usr/bin/mbuffer
mbuffer_size  = 1G
post_znap_cmd = off
pre_znap_cmd  = off
recursive     = on
src           = root@mowgli:zfs1/test
src_plan      = 7days=>1hour,30days=>4hours,90days=>1day
tsformat      = %Y-%m-%d-%H%M%S

Do you want to save this backup set [y/N]? y
cannot open 'root@mowgli:zfs1/test': invalid dataset name
ERROR: could not set property post_znap_cmd on root@mowgli:zfs1/test

Some of my backup machines are not online 24/7 and are only brought online at certain times of the day (cold storage), and I would love to be able to execute backups from those hosts. How much work would be needed to implement that?

sempervictus commented 8 years ago

+1

oetiker commented 8 years ago

The problem with this is that the configuration of what to back up is stored in the properties of the fileset being backed up ... in keeping with this idea, one could imagine a setup where the configuration resides in the properties of the fileset receiving the backups ... The way to get such a feature would be a) you create a patch and provide a PR, or b) you hire us to implement this for you.
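
For readers unfamiliar with the push model, a rough sketch of why the remote SRC fails, assuming the property layout implied by the znapzendzetup output above (the org.znapzend namespace is mentioned later in this thread; the exact property names are illustrative, not authoritative):

# Inspect the plan that znapzendzetup stored on the (local) source dataset.
zfs get -H -o property,value all zfs1/test | grep '^org.znapzend:'
#   org.znapzend:enabled     on
#   org.znapzend:src_plan    7days=>1hour,30days=>4hours,90days=>1day
#   org.znapzend:dst_a       zfs1/backup/test
#   ...

# The error in the report comes straight from zfs(8): 'root@mowgli:zfs1/test'
# is not a dataset on the local pool, so no property can be set on it.
zfs set org.znapzend:enabled=on root@mowgli:zfs1/test    # fails: invalid dataset name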

jasonblewis commented 8 years ago

This is a good idea for security reasons. If an unauthorised user gains access to the machine being backed up, they could conceivably use the local ssh key to gain access to the machine where the backups are being sent.

oetiker commented 8 years ago

If the 'bad guy' gets access to the backup machine, he gets instant access to ALL the servers ...

I think the 'right' way to do this would be to have a special ssh command wrapper that restricts incoming commands, and a special switch for znapzend so that it only does local work PLUS either pulls or pushes the data ... cleanup would happen locally only ...

This would require znapzend to be installed at both ends ...
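
Nothing like this exists in znapzend yet; as a sketch of the "restricted ssh command wrapper" idea, a forced command on the machine being backed up could look roughly like this (the script path, key options and the allowed commands are all illustrative):

#!/bin/sh
# /usr/local/bin/zfs-send-only.sh (hypothetical path) -- forced-command wrapper
# on the host being backed up: only plain 'zfs send' / 'zfs list' invocations
# are executed, everything else is rejected. A production wrapper would need
# much stricter argument validation than this sketch.
case "$SSH_ORIGINAL_COMMAND" in
    "zfs send "*|"zfs list "*)
        # deliberate word splitting so the allowed command runs with its arguments
        exec $SSH_ORIGINAL_COMMAND
        ;;
    *)
        echo "rejected: $SSH_ORIGINAL_COMMAND" >&2
        exit 1
        ;;
esac

# Matching authorized_keys entry on that host for the backup server's key
# (one line; key material and comment are placeholders):
#   command="/usr/local/bin/zfs-send-only.sh",no-pty,no-port-forwarding,no-agent-forwarding ssh-ed25519 AAAA... backup-puller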

sempervictus commented 8 years ago

Agreed. Restricted shells for git and sftp may be good reference points from which to scaffold an implementation.
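
For the sftp reference point specifically, the usual pattern is a ForceCommand in sshd_config; an analogous restriction for a zfs-receive-only account could be sketched like this (group name and wrapper path are hypothetical):

# Appended to /etc/ssh/sshd_config on the receiving side (sketch only): every
# login by members of the 'zfsrecv' group is locked to one command, the same
# way internal-sftp locks down sftp-only users.
cat <<'EOF' >> /etc/ssh/sshd_config
Match Group zfsrecv
    ForceCommand /usr/local/bin/zfs-recv-only.sh
    AllowTcpForwarding no
    X11Forwarding no
EOF
# Reload sshd afterwards, e.g. 'systemctl reload sshd' on systemd hosts.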

Baughn commented 8 years ago

I'd think the "proper" fix for this would be to use ZFS-level ACLs, once those work on Linux, which may still be a while.
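
If "ZFS-level ACLs" here means delegated administration via zfs allow (which has since become usable on Linux), a sketch of giving a backup role only what it needs; user and dataset names are made up:

# On the backup server: let an unprivileged 'zrecv' user receive streams into
# zfs1/backup and destroy expired snapshots there, and nothing else.
# (On Linux, mounting received filesystems may still require root.)
zfs allow zrecv receive,create,mount,destroy,hold,release zfs1/backup

# On the source: let a 'zsend' user snapshot and send, but not destroy.
zfs allow zsend snapshot,send,hold zfs1/test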

BotoX commented 7 years ago

Well, right now if somebody breaks into one of your backed-up servers, he gets access to all your backups as well as your backup server, unless you put all the backups in LXC/VZ/chroot containers. I think keeping the backup server secure is generally an easier task than keeping all production servers secure.

Or did any of you implement the restricted shell? It might be possible to run this under a normal user account with fakeroot and special sudoers entries for ZFS snapshot manipulation.
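
A sudoers-based variant of that idea might look roughly like this; it is only a sketch (the zfs path differs per distro, and sudoers wildcards are notoriously easy to get wrong):

# Hypothetical drop-in letting an unprivileged 'zfsbackup' user run only the
# zfs subcommands a backup needs (adjust /sbin/zfs to the local path).
cat <<'EOF' | sudo tee /etc/sudoers.d/zfsbackup
zfsbackup ALL=(root) NOPASSWD: /sbin/zfs snapshot zfs1/test*, /sbin/zfs send *, /sbin/zfs list *
EOF
sudo visudo -c -f /etc/sudoers.d/zfsbackup    # always syntax-check sudoers edits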

sempervictus commented 7 years ago

Actually, now that zfs crypto is pretty much done, this makes even more sense. The backup server doesn't have access to the contents of the encrypted data streams it receives.
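
For illustration, a raw (encrypted) pull with stock zfs tooling, assuming a natively encrypted source dataset and OpenZFS 0.8 or newer; host and dataset names follow the example at the top of this issue, snapshot names are placeholders:

# Run on the backup server: the source emits raw encrypted blocks via 'zfs send -w';
# the receiver can store and thin them without ever holding the encryption key.
# (Initial full receive: the target dataset must not already exist.)
ssh root@mowgli zfs send -w zfs1/test@2016-12-12-000000 | zfs receive zfs1/backup/test

# Incrementals work the same way:
ssh root@mowgli zfs send -w -i zfs1/test@2016-12-12-000000 zfs1/test@2016-12-13-000000 | zfs receive zfs1/backup/test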

ShaRose commented 7 years ago

With regard to storing the data in fileset properties: why not set it up so that, if a remote source is specified, the source is stored in an org.znapzend:src_remote property? If that property is present and valid, znapzend uses that path as the source; otherwise it can simply be left off. As for which fileset the property should be set on, it should probably just be the first local destination specified. That way the command structure stays the same but still allows the extra feature.
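
Purely to make the proposal concrete (org.znapzend:src_remote is ShaRose's suggested property, not something znapzend implements), the configuration on the first local destination fileset could look roughly like this:

# Hypothetical only: store the remote source on the local destination fileset.
zfs set org.znapzend:src_remote=root@mowgli:zfs1/test zfs1/backup/test
zfs get org.znapzend:src_remote zfs1/backup/test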

oetiker commented 7 years ago

@ShaRose try an implementation ... good PRs are always welcome

ser commented 5 years ago

Any progress on this, guys? It's really worrisome to have a Central Hackers Repository, which is de facto what a znapzend backup machine currently is. It's just insane.

dunron commented 5 years ago

Znapzend is great for having local snapshots with progressive thinning. For pulling the snaps to my remote backup server I abandoned the push mechanism and switched to Syncoid.
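
For comparison, a pull with Syncoid run from the backup server looks roughly like this (host and dataset names follow the example at the top of this issue):

# Pull zfs1/test from the source host into a local backup dataset over ssh.
syncoid root@mowgli:zfs1/test zfs1/backup/test

# Recursive variant for a whole dataset tree:
syncoid -r root@mowgli:zfs1/test zfs1/backup/test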

oetiker commented 5 years ago

We set up zones on the target machines, so the 'hacker impact' would be pretty limited ...

sempervictus commented 5 years ago

@oetiker: if you mean true Illumos/Solaris/whateveryouprefer zones, that functionality (in terms of resource encapsulation and isolation) isn't easily available on Linux. The machinations that can be cobbled together from namespaces, cgroups, and "fun mount options" still lack the resource and process isolation semantics of Zones/Jails or full VMs. Capabilities permitting control over mounts or nested namespace creation (even unprivileged, relative to the execution context) can lead to the ability to create suid binaries, traverse incomplete isolation implementations, and other silly things. Spender has a great writeup (https://forums.grsecurity.net/viewtopic.php?f=7&t=2522) on how most caps lead to root, and I think that was from before Linux namespaces (or their standard inclusion), and especially before the "all users can be god in their own little universe" user namespaces.

Even pull-based replication has its dangers: attacker-controlled data and metadata may be problematic if the host you are pulling from was compromised. It is feasible to create a very restricted namespace or MAC context with the bare minimum of caps and syscalls permitted, with zfs delegations on the recv targets for each stakeholder sending data, running something like a "restricted shell" explicitly for recv of the snaps ... but that seems a bit out of scope for this project and has its own concerns, while pull-based replication is (comparably) a pretty reasonable feature :).

ShaRose commented 5 years ago

Just a small note: you aren't wrong about the potential risks of pull-based replication. AFAIK there isn't really anything stopping a malicious source from responding to a send request with a stream that includes hundreds of snapshots with faked, backdated timestamps, taken after deleting all the data. The destination would receive all those snapshots and then start purging the good snapshots because it has so many, actually removing all your data. That depends on how znapzend handles trimming afterwards, though.

That said, the alternative is that they can do the same thing even more easily as soon as they get access to the source, without even deleting the files there, which might delay any response. So I'm still hoping for pull support in znapzend, and maybe a sanity check that the number of snapshots the destination ends up with matches how many there should be.
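
A sanity check along those lines is not something znapzend does today; a minimal sketch of the idea, run on the destination before any thinning (the dataset name is the one from this issue's example, and the threshold is a placeholder):

#!/bin/sh
# Hypothetical guard: if a pull suddenly delivers far more snapshots than the
# retention plan could ever legitimately produce, stop before thinning runs.
# MAX_EXPECTED would be derived from the backup plan; the value here is made up.
MAX_EXPECTED=400
count=$(zfs list -H -t snapshot -o name -d 1 zfs1/backup/test | wc -l)
if [ "$count" -gt "$MAX_EXPECTED" ]; then
    echo "suspicious: zfs1/backup/test has $count snapshots, expected at most $MAX_EXPECTED" >&2
    exit 1
fi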

oetiker commented 5 years ago

@ShaRose note that znapzend does not operate by number of snapshots but by their age ... excess intermediary snapshots do get removed ...

@sempervictus yes ... on Linux this is more complicated ... we are not using OmniOS though, so our problem set is slightly different from yours ... that said, I am very interested in reviewing a PR for a pull-based model.
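
Regarding the age-based thinning mentioned above, it is easy to observe on a pool: listing snapshots with their creation times shows the spacing widening with age in line with the plan (the dataset name is a placeholder):

# Snapshots sorted oldest-first with creation times; after thinning, density
# drops off with age (hourly for the first tier, then 4-hourly, daily, ...).
zfs list -H -t snapshot -o name,creation -s creation -d 1 zfs1/backup/test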

sempervictus commented 5 years ago

@oetiker - I might have a bit of time next week. It's been a while since I wrote anything in Perl; probably a good way to de-cobweb the braincase a bit. @ShaRose - neat concept for a data-only attack strategy; it probably works against all manner of similar tools.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.