QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

[Contribution] qubes-incremental-backup-poc OR Wyng backup #858

Open marmarek opened 9 years ago

marmarek commented 9 years ago

Community Devs: @v6ak, @tasket
@v6ak's PoC: https://github.com/v6ak/qubes-incremental-backup-poc
@tasket's PoC: https://github.com/tasket/wyng-backup

Status update as of 2022-08-16: https://github.com/QubesOS/qubes-issues/issues/858#issuecomment-1217463303


Reported by joanna on 14 May 2014 10:38 UTC

Migrated-From: https://wiki.qubes-os.org/ticket/858


Note to any contributors who wish to work on this issue: Please either ask for details or propose a design before starting serious work on this.

unman commented 1 year ago

wyng is restricted to lvm, so is only usable in a subset of installations. Bacula is not so restricted.

emanruse commented 1 year ago

So to ask things differently: have you tried wyng to compare it against your knowledge of Bacula?

Not yet. I have just read about it in a forum thread. Based on that scarce info only, wyng seems good because it is simpler and (IIUC) can back up changes without "seeing" actual decrypted files. The latter, however, can be considered a con too, as it makes it impossible to back up particular file sets only (not every change).

Other than that, Bacula's supremacy is undisputed (Please don't get me wrong, I don't mean to undervalue your work). If you have never used it, I highly recommend that you try it. It needs some reading in the beginning but it is very worthwhile in the long run. Perhaps you may work with Bacula's developers to combine your work with theirs, resulting in a plugin for Bacula, considering the specifics of Qubes OS. What do you think?

tlaurion commented 1 year ago

Other than that, Bacula's supremacy is undisputed (Please don't get me wrong, I don't mean to undervalue your work). If you have never used it, I highly recommend that you try it. It needs some reading in the beginning but it is very worthwhile in the long run. Perhaps you may work with Bacula's developers to combine your work with theirs, resulting in a plugin for Bacula, considering the specifics of Qubes OS. What do you think?

Not my work, but @tasket. Props to him!

wyng is restricted to lvm, so is only usable in a subset of installations. Bacula is not so restricted.

Expanding wyng to other filesystems that already support snapshots and CoW would be possible, but @tasket would need to be convinced as well: ZFS support question: https://github.com/tasket/wyng-backup/issues/110 Btrfs support question: https://github.com/tasket/wyng-backup/issues/75

Agreed that Btrfs outperforms ext4 on top of LVM in the default Qubes OS install, as reported here: https://forum.qubes-os.org/t/ext4-vs-btrfs-performance-on-qubes-os-installs/

But those performance differences have more to do with current Qubes OS defaults and the old fedora-32 tools available at install time than with a real filesystem comparison, which should be investigated further as well: https://forum.qubes-os.org/t/ssd-maximal-performance-native-sector-size-partition-alignment

wyng seems good because it is simpler and (IIUC) can back up changes without "seeing" actual decrypted files. The latter, however, can be considered a con too, as it makes it impossible to back up particular file sets only (not every change).

Then we are back to having dom0 do the backups itself, instead of having dom0 just do a differential, compressed backup of the blocks that changed, which wyng does really well and fast. Wyng also deduplicates (YES, that means only the differences across specialized templates (clones) and app qube clones will be saved in backups as well).
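To make the deduplication point concrete, here is a minimal sketch of chunk-level deduplication across volumes. It is not Wyng's code or archive format: the 64 KiB chunk size, the SHA-256 keys and the in-memory `store` dict are all assumptions chosen for illustration.

```python
import hashlib
import zlib

CHUNK_SIZE = 64 * 1024  # assumed chunk size for this sketch, not Wyng's actual setting


def backup_volume(data, store):
    """Store each unique chunk of a volume once; return the ordered list of chunk hashes."""
    manifest = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:  # identical chunks are shared across all volumes
            store[digest] = zlib.compress(chunk)
        manifest.append(digest)
    return manifest


def restore_volume(manifest, store):
    """Rebuild a volume image from its manifest and the shared chunk store."""
    return b"".join(zlib.decompress(store[digest]) for digest in manifest)


if __name__ == "__main__":
    store = {}
    # A "template" and a "clone" that differs from it in one chunk only.
    template = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"C" * CHUNK_SIZE
    clone = b"A" * CHUNK_SIZE + b"X" * CHUNK_SIZE + b"C" * CHUNK_SIZE
    manifests = [backup_volume(volume, store) for volume in (template, clone)]
    print(len(store), "unique chunks stored for", sum(map(len, manifests)), "chunk references")
```

In this toy run the clone adds only one new chunk to the store, which is the effect described above for cloned templates and app qubes.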

I am not aware of any dom0 tools that would permit to only save minimal changes like wyng does.

@unman: If I may ask, does Bacula support deduplication across different volumes? This is, to me, why wyng is so important. Again, we lack the tools on LVM to be able to deduplicate on restoration as well (pool-level deduplication), which is another topic: https://forum.qubes-os.org/t/pool-level-deduplication/12654

tasket commented 1 year ago

wyng seems good because it is simpler and (IIUC) can back up changes without "seeing" actual decrypted files. The latter, however, can be considered a con too, as it makes it impossible to back up particular file sets only (not every change).

Given that Qubes (at some level) is a user exercise in isolating data sets, I don't see that as a minus. What really needs de-selecting during backups is non-essential data, namely cache, which is possible and I think has already been touched on in this issue.

I'm not familiar with Bacula, but tend to doubt a backup tool that withholds both VM and snapshot support in its libre edition. @unman this appears to indicate Bacula (for us) supports neither LVM nor Btrfs snapshots, or anything else; the Enterprise product would be required.

OTOH, if Btrfs support is becoming a sticking point for Wyng, I'm happy to prioritize that. As with Thin LVM, it's not rocket science.

Wyng is also built around the idea of dom0/admin network isolation; this aspect is not an add-on. No part of it assumes admin network access. The amount of interactive handling of untrusted data from the storage device is tiny: a few booleans, command result codes and free disk space.


@tlaurion The benchmarks are interesting. My own view is that Thin LVM chunk size is a critical factor. I'm still using a machine that (due to an alpha install hiccup) formatted with a thin pool chunk size of 64k (the smallest IIRC). I would expect this to improve write performance over the typical default for this size of SSD (say 512k or 1M).

It's also interesting because very few people ever try to compare apples to apples (in this case, COW to COW). Phoronix is a prolific benchmark site, yet always compares Ext4 in a non-COW context vs Btrfs, ZFS, etc., so it's not very useful.

DemiMarie commented 1 year ago

It's also interesting because very few people ever try to compare apples to apples (in this case, COW to COW). Phoronix is a prolific benchmark site, yet always compares Ext4 in a non-COW context vs Btrfs, ZFS, etc., so it's not very useful.

That is because Qubes OS is a very different workload.

In Qubes OS, the first write to a given block in a given run of a VM almost always involves a CoW operation. That can be in either reflink (for BTRFS and XFS) or in dm-thin. My understanding is that dm-thin is not optimized for provisioning-heavy workloads, especially with small block sizes and zeroing enabled. It is instead optimized for workloads (such as databases) that write the same blocks over and over, especially with large block sizes and zeroing disabled.
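A toy model may help show why the workload matters. The sketch below is not dm-thin's algorithm; it simply assumes that every chunk starts out shared with a snapshot, that the first write to a chunk copies the whole chunk, and that later writes to that chunk happen in place. The function name and the example offsets are made up for illustration.

```python
def simulate_writes(write_offsets, chunk_size):
    """Return (cow_operations, bytes_copied) for a sequence of write offsets.

    Toy model: every chunk starts out shared with a snapshot. The first write that
    touches a chunk breaks sharing and copies the whole chunk; later writes to the
    same chunk happen in place.
    """
    private_chunks = set()
    for offset in write_offsets:
        private_chunks.add(offset // chunk_size)
    cow_operations = len(private_chunks)
    return cow_operations, cow_operations * chunk_size


if __name__ == "__main__":
    MIB = 1024 * 1024
    scattered = [i * MIB for i in range(100)]  # VM-like: 100 writes, each to a new area
    rewriting = [0, 4096] * 50                 # database-like: same blocks over and over
    for chunk_size in (64 * 1024, 512 * 1024):
        print(chunk_size, simulate_writes(scattered, chunk_size),
              simulate_writes(rewriting, chunk_size))
```

Under these assumptions the scattered pattern pays one copy per write, and the cost of each copy grows with the chunk size, while the rewriting pattern pays for a single copy regardless of how many writes follow.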

tlaurion commented 1 year ago

From OP:

BTW, since avoiding .cache data was mentioned, I'll note issue #4217 was created for this purpose.

Latest subject being retouched:

What really needs de-selecting during backups is non-essential data, namely cache, which is possible and I think has already been touched on in this issue.

@tasket I recently commented under https://github.com/QubesOS/qubes-issues/issues/4217#issuecomment-1277911302 stalled discussion for a plan.

@marmarek @demimarie That should be discussed separately and should move forward

emanruse commented 1 year ago

OK, I will try to answer some of the questions about Bacula.

Re. deduplication - personally I have never needed that but I found this:

https://www.bacula.lat/block-level-deduplication-with-aligned-volumes-tutorial-bacula-7-9-9-0-and-above/?lang=en#main

https://www.bacula.org/whitepapers/DedupVolumes.pdf

Given that Qubes (at some level) is a user exercise in isolating data sets, I don't see that as a minus. What really needs de-selecting during backups is non-essential data, namely cache, which is possible and I think has already been touched on in this issue.

Why restrict the user to (not) backing up data which the developer considers (non)essential? Suppose a simple case: raw photos with metadata development settings in (say) .xmp files. For some reason, the user may want to back up only the raw photos without the .xmp files, or vice versa. Or one may want (not) to back up only files matching a particular name pattern. Can you do that with wyng? With Bacula you can (plus much more, e.g. back up different file sets to different storage using different schedules etc).
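For readers unfamiliar with file sets, the idea can be sketched in a few lines. This is not Bacula's FileSet syntax or code, only a hypothetical `select_files` helper using shell-style name patterns; note that any such filter has to see file names, so it must run where the filesystem is readable.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def select_files(paths, include=("*",), exclude=()):
    """Keep paths whose file name matches an include pattern and no exclude pattern."""
    selected = []
    for path in paths:
        name = PurePosixPath(path).name
        included = any(fnmatchcase(name, pattern) for pattern in include)
        excluded = any(fnmatchcase(name, pattern) for pattern in exclude)
        if included and not excluded:
            selected.append(path)
    return selected


if __name__ == "__main__":
    paths = ["photos/img1.raw", "photos/img1.xmp", "photos/img2.raw", "notes/todo.txt"]
    print(select_files(paths, include=("*.raw",)))  # raw photos only
    print(select_files(paths, exclude=("*.xmp",)))  # everything except the sidecar files
```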

I'm not familiar with Bacula, but tend to doubt a backup tool that withholds both VM and snapshot support in its libre edition.

I have never used snapshots with Bacula but a quick search for it gave me this. It explains that both the Community ("libre") and Enterprise Editions support snapshots. This is news for me too.

Some other pages probably worth checking:

https://www.bacula.org/kvm-backup-vm/

https://www.bacula.org/free-hyper-v-backup-software/

https://www.baculasystems.com/corporate-data-backup-software-solutions/professional-backup-software/enterprise-community-comparison/

There even seems to be a Xen backup module, at least in the Enterprise Edition; I have no idea whether a version of it exists in the Community Edition. For the latter I found only this:

https://www.bacula.lat/citrix-xen-bpipe-configuration-script-to-backup-all-running-virtual-machines/?lang=en

FWIW Bacula also has forks which I haven't researched. I have only used BareOS for a short while some years ago. I don't know if the forks may have some interesting modules which may be helpful for Qubes OS.

My suggestion for Bacula is based on the fact that it is really very powerful and flexible, has many years of development behind it, and perhaps it is better to build up on a great tool rather than re-create what others may have already created.

unman commented 1 year ago

I'm not familiar with Bacula, but tend to doubt a backup tool that withholds both VM and snapshot support in its libre edition. @unman this appears to indicate Bacula (for us) supports neither LVM nor Btrfs snapshots, or anything else; the Enterprise product would be required.

This isn't true. The enterprise product has useful snapshot capabilities but the community edition supports working with snapshots. (I haven't used the community edition in some time, but I know people who do.) I don't think that Bacula is a good fit for Qubes, but it shouldn't be rejected on the basis of misinformation.

tasket commented 1 year ago

My impression from that Bacula page is deduplication was intended as an afterthought (being a rather old code base generally oriented toward tape storage). It requires both data alignment between volumes (not needed with Wyng) and compression turned off (IMO unacceptable). And even then... it does not really deduplicate. The user must setup a ZFS backup drive with dedup enabled. This is "patent pending" BTW, and probably will be for quite some time (or forever) since their idea is to mimic the way filesystems allocate space for a data file within a sparse disk image.

Wyng cannot de-select files within a volume, however this is already the qvm-backup behavior which avoids creating attack surface (otherwise it would have been possible to use an existing backup tool running in guest VMs). Whether or not this incurs added risk in Bacula is debatable, but the comparison seems academic if libre Bacula cannot query VMs internally.

Bacula is a sprawling codebase written in C++. A basic Debian install pulls in 22 packages using 65MB of disk space. This includes postgresql server + client, a programmable dbms. Absolutely none of this fits Qubes design philosophy IMHO.

Snapshots: Bacula's feature matrix shows only snapshot "management", not full snapshot support "with ZFS, BTRFS, LVM" specifically. There is a significant difference between creating/rotating/mounting snapshots (i.e. management) vs being able to parse various snapshot metadata formats.

I'm seeing only functional (and likely security) deficits here, and it seems clear no one here has used Bacula on Qubes. I don't think I'll comment on it further.

tlaurion commented 1 year ago

In the context of this specific issue, migrated from an older ticket from 2014, whose goal is either to start work on Qubes integration or to have wyng considered as an alternative community project to the qubes-backup tool: I think it is time to look at wyng for what it is, what it offers, and how it currently improves on qubes-backup. From there we can consider what is missing for Qubes to support or recommend it, and/or to package it under the contribution repositories.

I have nothing against continuing comparing other existing solutions, but those solutions should be comparable to what qubes-backup actually offers in terms of qubes-backup use case.

That would mean considering what wyng is currently missing to try to attain a stage where qubes-backup could consider integrating, or support wyng?

Along those lines, I read elsewhere a point from @adw about qubes-backup providing both an authenticity and an integrity contract. From my understanding of wyng 0.4, I think now would be a good time to address the state of wyng and maybe restart from there?


I use both qubes-backup and wyng. I use qubes-backup to back up a self-contained point in time (without dom0, since it backs up only home directory files), and I'm quite happy with what it offers. I use wyng on a weekly basis to back up deduplicated states of my whole system, including dom0. And I use wyng to restore such states on a monthly basis, while really appreciating the space, speed, and automation of the tool, which give me a carefree experience.

What I consider missing is a high-level integration, hiding wyng command line details, which would permit management from dom0 of the volumes to be backed up. It would of course miss initial creation of the archive, compression algo selection and all those little details. It could also miss automatic integration of newly created templates/qubes into the backup, which once again, thanks to deduplication, should not be costly in terms of space consumption. It could also miss a pruning scheme and automation.

Those are the things that should be discussed here. A review of encryption/authentication under 0.4 would also be beneficial.

Basically, what I'm saying is that reports from wyng users on what it's good at, and on what should be improved to reach Qubes standards, would be beneficial here.

emanruse commented 1 year ago

@unman

I don't think that Bacula is a good fit for Qubes,

Why?

but it shouldn't be rejected on the basis of misinformation.

"Misinformation is incorrect or misleading information." (Wikipedia)

I don't think anyone here is sharing incorrect or misleading information which serves as basis for rejection. It rather seems that there is lack of information which leads to this quick rejection.

@tasket

My impression from that Bacula page is deduplication was intended as an afterthought (being a rather old code base generally oriented toward tape storage). It requires both data alignment between volumes (not needed with Wyng) and compression turned off (IMO unacceptable). And even then... it does not really deduplicate. The user must setup a ZFS backup drive with dedup enabled. This is "patent pending" BTW, and probably will be for quite some time (or forever) since their idea is to mimic the way filesystems allocate space for a data file within a sparse disk image.

Forgive me if I misunderstood: is this discussion open to really good backup system suggestions, or is it just "if it is not Wyng (with all its features) then it's worthless even trying"? If it is the latter, I would like to apologize for stepping in. I cannot possibly force anyone to try some software, nor was that my intention.

Wyng cannot de-select files within a volume, however this is already the qvm-backup behavior which avoids creating attack surface (otherwise it would have been possible to use an existing backup tool running in guest VMs). Whether or not this incurs added risk in Bacula is debatable, but the comparison seems academic if libre Bacula cannot query VMs internally.

What do you mean "query VMs internally"? If you install a Bacula File Daemon (FD) in a qube, it has full access to every file. That FD communicates with Bacula Director which concerts all FDs. The Director also controls the Storage Daemon(s) (SD) which store the actual backup data. In a LAN, you would install Director on one machine and FDs on all clients. Based on purpose, you run a SD on one or more machines. Now, replace "machine" with "qube" and we are in Qubes OS.

At first sight, the network connection requirement may be a problem for network-isolated VMs. However, it is obviously possible to "workaround" this, just like installing/updating software or copying files vm-to-vm is possible. Although I am not familiar with the internals, technically, network connection should not be a blocker.

I don't know if dom0 can access files inside domUs. If it can, then no FDs per qube would be necessary. One FD controlled by dom0 would be enough.

Bacula is a sprawling codebase written in C++. A basic Debian install pulls in 22 packages using 65MB of disk space. This includes postgresql server + client, a programmable dbms. Absolutely none of this fits Qubes design philosophy IMHO.

If this is a key factor, then Qubes OS must not allow installing any software written in C++ and all the rest you listed. That doesn't seem the case though. There is no requirement to install the Director in dom0. There can be a dedicated qube running only the Director and it can communicate with FDs.

Yes, Bacula is a complex system. There is a reason for this though. Unfortunately, it is not possible to have many features and keep things as simple as "Hello world".

Snapshots: Bacula's feature matrix shows only snapshot "management", not full snapshot support "with ZFS, BTRFS, LVM" specifically. There is a significant difference between creating/rotating/mounting snapshots (i.e. management) vs being able to parse various snapshot metadata formats.

I'm seeing only functional (and likely security) deficits here, and it seems clear no one here has used Bacula on Qubes.

@unman and I have. Why others have not or are unwilling to even try to get a feel for it is really strange.

I don't think I'll comment on it further.

That is quite unfortunate as it means no discussion is possible. It is like "I say Wyng. I won't even look at anything else. Period". I am not saying you said that literally. It is just how it sounds to me :)

@tlaurion

On the contract of this specific issue, which is starting work for qubes integration, I think it should be time to look at wyng for what it is, what it offers, and attempt to consider what it is missing to have current qubes-backup support it.

IIUC it lacks: File sets (like Bacula), schedules, tape backup, integration with other backup in an infrastructure (e.g. backup of multiple physical machines in a LAN, possibly running different OS - Linux, Windows, other).

I have nothing against continuing comparing other existing solutions, but those solutions should be comparable to what qubes-backup actually does in terms of current qubes-backup use case.

That would mean considering what wyng is currently missing to try to attain a stage where qubes-backup could consider integrating, or support wyng?

qubes-backup is quite simplistic (or should I say, minimalistic) feature-wise. If there are strict requirements to compare it only to other simplistic solutions, (not) written in particular languages, (not) using particular things etc, then it would be difficult to suggest anything from a different level.

In that path, I read elsewhere debate from @adw about qubes-backup providing both authenticity and integrity contract through qubes backup. From my understanding of 0.4, I think now would be a good time to address wyng state and maybe restart from there?

In another discussion (might be the same you mean) I also mentioned that Bacula has that feature too.

Friends, forgive me for bothering. I am not sponsored by Bacula or anyone else. I just wanted to share something worth checking out, just like one likes to share when one sees something beautiful. I won't insist or argue further. If anyone is interested in Bacula integration, it would be great and I would be glad to answer any questions if I can. If not, well - so be it :)

DemiMarie commented 1 year ago

My impression from that Bacula page is deduplication was intended as an afterthought (being a rather old code base generally oriented toward tape storage). It requires both data alignment between volumes (not needed with Wyng) and compression turned off (IMO unacceptable). And even then... it does not really deduplicate. The user must setup a ZFS backup drive with dedup enabled. This is "patent pending" BTW, and probably will be for quite some time (or forever) since their idea is to mimic the way filesystems allocate space for a data file within a sparse disk image.

Forgive me if I misunderstood: is this discussion open to really good backup system suggestions, or is it just "if it is not Wyng (with all its features) then it's worthless even trying"? If it is the latter, I would like to apologize for stepping in. I cannot possibly force anyone to try some software, nor was that my intention.

I would say that other backup software is out of scope for this issue, as each issue must be about a single, actionable thing. Discussions about which backup software is best would be better suited to the mailing list or Qubes Forum.

tasket commented 1 year ago

I think it should be time to look at wyng for what it is

My biased description of what Wyng is: A backup tool that can rival Apple's Time Machine in use cases and efficiency – frequent, brief backup sessions and continual pruning of old data (simple use of available space w/o need of "managing" it). Wyng uses features on local OS (COW metadata) and backup destinations (POSIX filesystem semantics & features) in ingenious ways to achieve this efficiency.

It's a product of the "Qubes ecosystem", although useful beyond that. The name "Wyng" came from the idea of laptop users doing quick backups from wherever... backups on-the-wing.

Downsides: Wyng's only big hurdle on the horizon is having its cryptography vetted. Its single conceptual deficit is inability to de-select individual files for backup (however there are various ways to address even this issue).

I have a new Qubes integration (VM settings restore) script in the works that I expect to have ready next week. It improves on the prior attempt by applying restored VM settings via the Qubes API instead of interfering with qubesd. It's intended to be functional and low complexity/loc (easily used/re-used). An alternative to this would be, I think, to minimally modify qubes-backup-restore to accept restored data from an external source (Wyng) and perform its Qubes-sanctioned settings restoration/renaming method for VMs.

Lastly, those of you who would like to try Wyng might prefer v0.4alpha2. I made some usability mistakes in prior versions to get the basic concept working, so I'm improving the UX. For example, selecting backup destinations is now more like rsync, just give it a URL/path or a nickname for one.

Wyng issue for this issue.

tlaurion commented 1 year ago

@emanruse the discussion here is about incremental backup. It comes from a long-awaited solution for differential backups, with the goal of maybe replacing qubes-backup with something more efficient. More efficient here can mean a lot of things, some of which have been discussed since 2015 in the projects referred to in the OP, which are what is at stake in this issue.

Qubes issues are linked with one specific topic, and normally aimed at resolving implementation details and directions, where pull requests try to address those implementation details.

Here we are talking about sparsebak, now renamed to wyng, which has addressed criticisms and tracks its own issues in its own project space.

The place to continue the discussion about bacula should be in the forum, continuing post https://forum.qubes-os.org/t/a-new-kind-of-back-up-tool/14191

Unlike forum posts, which can be split and reorganised, discussions in github issues should stay on point.

Features missing on wyng should be addressed here though.

On that point, I understand, as of now, that the notions of agentless, block-layer differential, and data-agnostic backups are being mixed up.

@emanruse I understand that, from your perspective, having agents in your Windows/BSD/Linux based qubes that filter the content sent to a central controller, which has a storage back end and/or is dom0 itself, would be of interest for your use case.

And that is totally fine. But as I see it, if bacula is desired, Bacula could be a project for @unman's shaker project to deploy bacula in such a way.

But yet again, I have not read here any input from Bacula users on its ability to reduce the size of backups of specialized (cloned) qubes through deduplication.

Also, where I read about agent-based capabilities, I also read in the fine print that the qubes in question need to be booted for the backups to run, and I'm not sure that fits Qubes, per the earlier comparison that replaces "hosts" with qubes.

Any discussion that is not about differential, block-based solutions that need no agents deployed in qubes should definitely continue under https://forum.qubes-os.org/t/a-new-kind-of-back-up-tool/14191

gasull commented 1 year ago

Please consider the possibility of integrating Bacula in Qubes OS. I have been using it for many years and I can say it is a wonderful backup system - extremely powerful, configurable, supports tape backup, etc.

@emanruse, consider opening a new Qubes issue for "integrate Bacula with Qubes OS", and then link to this issue as reference.

andrewdavidwong commented 1 year ago

Just a friendly reminder that this issue tracker (qubes-issues) is not intended to serve as a discussion venue. Instead, we've created a designated forum for discussion and support. (By contrast, the issue tracker is more of a technical tool intended to support our developers in their work.) Thank you for your understanding!

Rudd-O commented 1 year ago

@Rudd-O in #1588 (comment) It looks like Duplicity supports making an incremental backup even when part of a file was changed (include diff of file, not full changed files). So indeed it may be good idea to somehow use it.

FWIW I settled on a custom frontend script that backs up all VM volumes in dom0, using a qrexec tunnel to an AppVM, which then launches SSH to my backup server. The thing simply uses borg backup to work. It works fantastically well, runs nightly across all my dom0s (yes, I have quite a few nowadays) and the deduplication is amazingly good. When it's done, it writes telemetry data for node_exporter to submit to Prometheus, so I can get alerted if backups fail or somehow got too big.
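The telemetry step is easy to reproduce with any backup tool. The following is a minimal sketch, not taken from the borg-offsite-backup script: the metric names, the `backup.prom` file name and the `write_backup_metrics` helper are assumptions. It writes metrics in the Prometheus text format to a directory that node_exporter's textfile collector is assumed to be watching, using a temporary file plus rename so that a half-written file is never scraped.

```python
import os
import tempfile
import time


def write_backup_metrics(directory, success, size_bytes, now=None):
    """Atomically write backup metrics for node_exporter's textfile collector."""
    timestamp = int(time.time() if now is None else now)
    lines = [
        "# HELP backup_last_run_timestamp_seconds Unix time of the last backup run.",
        "# TYPE backup_last_run_timestamp_seconds gauge",
        f"backup_last_run_timestamp_seconds {timestamp}",
        "# HELP backup_last_success Whether the last backup run succeeded (1) or failed (0).",
        "# TYPE backup_last_success gauge",
        f"backup_last_success {1 if success else 0}",
        "# HELP backup_last_size_bytes Bytes transferred by the last backup run.",
        "# TYPE backup_last_size_bytes gauge",
        f"backup_last_size_bytes {int(size_bytes)}",
    ]
    # Write to a temporary file in the same directory, then rename it into place.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file readable by its owner only
    final_path = os.path.join(directory, "backup.prom")  # hypothetical file name
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    target = tempfile.mkdtemp()  # stand-in for the real textfile collector directory
    print(open(write_backup_metrics(target, success=True, size_bytes=12345)).read())
```

An alert on a stale `backup_last_run_timestamp_seconds` or on `backup_last_success == 0` then covers both "backup failed" and "backup never ran".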

This thing where I have to manually connect a disk or start a program to back up... I would never, ever have backups, if that is what it took. It's not acceptable. There needs to be a Time Machine–like solution for backups, and that's what I built for myself.

Rudd-O commented 1 year ago

Here is my borg-offsite-backup script: borg-offsite-backup.py.txt

Here is a sample config file that goes in /etc/default/borg-offsite-backup: borg-offsite-backup-default.txt

Note how it supports both datasets and filesystems to backup, because over here the system snapshots all ZFS datasets prior to backing up the VM volumes (stored in /var/lib/qubes), and backs up the snapshots rather than the live files. That wouldn't work correctly with a regular non-snapshotting file system, and it certainly won't work with the LVM monstrosity that ships in normal Linux distros (including Qubes).

gangamstyle commented 1 year ago

I am an ordinary Qubes user and want you to know my opinion. I started to use Qubes OS only after I discovered the Wyng backup tool. Why? Because an OS without incremental backups is nonsense. Full backups suck! And the Qubes internal backup system sucks too!

Most of the suggested incremental backup tools (I tried almost all of them) can't back up LVM volumes efficiently, with one exception: Wyng.

Just try Wyng backup, it is AWESOME, and it was created for Qubes OS! (it can be used without Qubes too)

Wyng uses LVM "magic" and this makes it very fast and very efficient. I will write it twice - VERY FAST and VERY EFFICIENT. It is faster than anything I have tried before, I was very surprised!

Wyng's lack of encryption can be solved (and I did it) with standard Linux tools. Just upload the data into an encrypted LUKS container (99% of the time using sshfs to a backup server). And LUKS is very secure and time-tested.

I am using it every day and it saved my files once already, I love it.

gangamstyle commented 1 year ago

Qubes is an image-based OS. All the other tools mentioned are file-based. You can use a file-based tool on raw images, but that requires a full re-scan of those images to detect changes for an incremental backup.

Wyng is different: it is an image-based, filesystem-agnostic tool, and it does not require a full re-scan. How is this possible? LVM magic!
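The "magic" can be pictured as comparing snapshot metadata instead of data. The sketch below is conceptual only and does not reflect Wyng's code or the real thin-pool metadata format: it assumes each snapshot is described by a hypothetical mapping from logical chunk index to physical location, and reports the chunks whose mapping changed, without reading any volume contents.

```python
def changed_chunks(old_map, new_map):
    """Return the logical chunk indexes whose physical mapping differs between snapshots."""
    changed = []
    for index in sorted(set(old_map) | set(new_map)):
        if old_map.get(index) != new_map.get(index):
            changed.append(index)
    return changed


if __name__ == "__main__":
    # Hypothetical mappings: logical chunk index -> physical location in the pool.
    last_backup_snapshot = {0: 100, 1: 101, 2: 102}
    current_snapshot = {0: 100, 1: 250, 2: 102, 3: 251}  # chunk 1 rewritten, chunk 3 new
    print(changed_chunks(last_backup_snapshot, current_snapshot))
```

Only the chunks reported this way need to be read and sent, so the cost of an incremental backup follows the amount of change rather than the size of the volume.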

tasket commented 1 year ago

Update:

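Wyng has entered its final alpha for v0.4. The big changes have been completed, which include:

  • Btrfs and XFS source volumes

  • Authenticated encryption with auth caching

  • Simpler authentication of non-encrypted archives

  • Overall faster detection of changed/unchanged volumes

  • Fast differential receive when using available snapshots

  • Simple switching between multiple archives: Choose any (dest) archive location each time you run Wyng

  • Multiple volumes can now be specified for most Wyng commands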

There is also a Qubes integration script which allows backup/restore by VM name. Currently this only works with LVM but reflink support is planned.

emanruse commented 1 year ago

Is it possible to see a comparison between Wyng and rdiff-backup (which is also used for incremental backup of VMs)?

Rudd-O commented 1 year ago

Would love to see support for incremental backups based on the ZFS volume driver too.

tasket commented 1 year ago

Wyng now has a format specification draft: https://github.com/tasket/wyng-backup/tree/04alpha3/doc/Wyng_Archive_Format_V3.md

mooreye commented 5 months ago

Once Wyng reaches a stable release, can we expect officially-supported incremental backups? This is a serious usability issue if you have lots of data.

tlaurion commented 4 months ago

Would love to see support for incremental backups based on the ZFS volume driver too.

@Rudd-O https://github.com/tasket/wyng-backup/issues/110#issuecomment-2054268657

tlaurion commented 4 months ago

Update:

Wyng has entered its final alpha for v0.4. The big changes have been completed, which include:

  • Btrfs and XFS source volumes

  • Authenticated encryption with auth caching

  • Simpler authentication of non-encrypted archives

  • Overall faster detection of changed/unchanged volumes

  • Fast differential receive when using available snapshots

  • Simple switching between multiple archives: Choose any (dest) archive location each time you run Wyng

  • Multiple volumes can now be specified for most Wyng commands

There is also a Qubes integration script which allows backup/restore by VM name. Currently this only works with LVM but reflink support is planned.

@tasket maybe a quick update is needed here?

Or @marmarek, I see you removed assignment to yourself. Maybe you want to say what is missing to go forward under https://github.com/tasket/wyng-backup/issues/102 instead?

andrewdavidwong commented 4 months ago

I see you removed assignment to yourself.

That's not specific to this issue. It's just a general change in how issue assignments are used. Before, issues were assigned to people who work in certain areas, even if there was no current work being done (including just "someday, maybe"). Now, by contrast, issues are only assigned to devs while those devs are actively working on the issues. You can read more about the new policy here.

tasket commented 4 months ago

@tlaurion @marmarek Wyng is now in final beta for v0.8 and all features are frozen. It's exhibiting good stability overall, but I have added a caveat to the Readme about lvmthin's need for metadata space, since adding Wyng's snapshots on top of Qubes' snapshots will naturally consume more, and LVM does not have good defaults or out-of-space handling.

The wyng-util-qubes wrapper for Qubes integration has just gone to v0.9 beta and been pushed to the main branch, and I recommend using this version now. The new version includes support for both reflink and lvmthin pools; the wrapper can now alias volume names as necessary between the two pool types during restore. This is now generally usable for Qubes users who are comfortable with the command line, and it's quite feasible to adapt for a GUI.

This is what a typical backup session from a Btrfs Qubes system looks like:

[me@dom0 ~]$ sudo wyng-util-qubes backup -i --dest qubes-ssh://sshvm:user@192.168.0.10/home/user/wyng.backup
wyng-util-qubes v0.9beta rel 20240424

Wyng 0.8beta release 20240423
Encrypted archive 'qubes-ssh://sshvm:user@192.168.0.10/home/user/wyng.backup' 
Last updated 2024-04-25 13:22:30.245619 (-04:00)

Preparing snapshots in '/mnt/btrpool/libqubes/'...
  Queuing full scan of import 'wyng-qubes-metadata'
Acquiring deltas.

Sending backup session 20240425-144142:
———————————————————————————————————————————————————
no change |  -  | appvms/banking/private.img
no change |  -  | appvms/dev/private.img
no change |  -  | appvms/mail/private.img
    5.0MB |  2s | appvms/personal/private.img
    0.5MB |  1s | appvms/root-backup/private.img
no change |  -  | appvms/sshvm/private.img
no change |  -  | appvms/sys-vpn2/private.img
   45.8MB | 10s | appvms/untrusted/private.img
no change |  -  | vm-templates/debian-12/private.img
no change |  -  | vm-templates/debian-12/root.img
    0.0MB |  1s | wyng-qubes-metadata
———————————————————————————————————————————————————
 11 volumes, 79428——>51 MB in 15.6 seconds.

This is a restore from a backup session containing two VMs:

[me@dom0 ~]$ sudo python3 wuq restore --session=20240424-123248 --dest qubes-ssh://sshvm:user@192.168.0.10/home/user/wyng.backup
wyng-util-qubes v0.9beta rel 20240424

VMs selected: temp1, temp2
Warning:  Restoring to existing VMs will overwrite them.
Continue [y/N]? y
Wyng 0.8beta release 20240423
Encrypted archive 'qubes-ssh://sshvm:user@192.168.0.10/home/user/wyng.backup' 
Last updated 2024-04-25 17:32:53.942176 (-04:00)

Receiving volume 'appvms/sys-vpn2/private.img' 20240424-123248
Saving to file '/mnt/btrpool/libqubes/appvms/sys-vpn2/private.img'
OK

Receiving volume 'appvms/mail/private.img' 20240424-123248
Saving to file '/mnt/btrpool/libqubes/appvms/mail/private.img'
OK

Currently, the Qubes default pool for newly created VMs, or whichever pool an existing VM resides in, is used for restore, but you can also specify --pool <name> to override the Qubes default when creating VMs.
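
The pool selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this comment, not the actual wyng-util-qubes code; the function name, parameter names and pool names are hypothetical.

```python
from typing import Optional


def choose_restore_pool(existing_vm_pool: Optional[str],
                        pool_option: Optional[str],
                        qubes_default_pool: str) -> str:
    """Pick the storage pool for a restored VM.

    existing_vm_pool:   pool of an already existing VM with the same name, or None
    pool_option:        value given via --pool, or None if the option was not used
    qubes_default_pool: the Qubes default pool for newly created VMs
    """
    if existing_vm_pool is not None:
        # An existing VM is overwritten in place, so its own pool is used.
        return existing_vm_pool
    if pool_option is not None:
        # --pool only overrides the default used when a VM has to be created.
        return pool_option
    return qubes_default_pool


# Hypothetical pool names, for illustration only.
print(choose_restore_pool(None, None, "vm-pool"))
print(choose_restore_pool(None, "btrpool", "vm-pool"))
print(choose_restore_pool("vm-pool", "btrpool", "vm-pool"))
```

The last call shows the case that tends to surprise people: --pool has no effect on a VM that already exists.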

tlaurion commented 4 months ago

Also, if Wyng is to be considered in newer versions of QubesOS, I would recommend thinking about separating dom0 data from the Qubes hypervisor. That is a whole separate topic, but on my side I've used Wyng against a lot of cloned templates and cloned qubes, which results in the following:

[user@dom0 ~]
(130)$ du -chs /var/lib/wyng/
719M    /var/lib/wyng/
719M    total
[user@dom0 ~]
$ df -h /
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/qubes_dom0-root   20G   11G  7.8G  58% /
[user@dom0 ~]

Talking about moving to btrfs + bees dedup would be off-topic here, of course; that belongs in #6476.

But since the caveat about TLVM metadata was raised, I would love to take the opportunity to remind devs that separating dom0 from vm-pool was one step toward limiting the impact of using TLVM under Qubes in the first place. QubesOS uses a lot of sometimes stalling snapshots for back+x and volatile volumes, which weigh heavily on pool metadata and could have locked the user out. That is also to be discussed under #6476, not here.

All of this could of course be dodged altogether if TLVM's place as the first QubesOS choice were reconsidered in favor of something more practical, better suited to specialized cloned templates, salting qubes, massive private disk data preventing quick shutdown, and all that jazz. But that would be #6476, not here.

So I would just like to remind everyone that using Wyng consumes dom0 space for its meta-dir (chunk mappings downloaded from the archive qube), and that the dom0 LVM volume is 20GB because of past decisions (it could be within the same pool if not for TLVM). Extending TLVM pool metadata requires manually stealing some of the swap space, and 20GB today is quite limiting (you cannot install multiple templates, and you have to be cautious and check dom0 usage). dom0 should actually be quite static: dom0 should just be dom0, without keeping external state that should not exist in dom0 in the first place.


@tasket and @marmarek @DemiMarie : amazing work outside of those critical, but constructive points, as usual :)

Currently ongoing: users hacking qubes-backup encryption/authentication to use incremental backup tools: https://forum.qubes-os.org/t/guide-incremental-backup-using-the-official-backup-system/25792 Let's not leave end users (even if they are devs) hacking qubes-backup to get what they actually want (differential backups) for much longer, ok?

... What about writing a grant application to achieve that pressing goal? Yes?

tasket commented 4 months ago

@tlaurion Just FYI, Wyng has an issue open (109) for reducing metadata footprint. Currently it keeps both a compressed and uncompressed copy of manifests in /var. It does set the compression bit on the uncompressed files, so if /var is on Btrfs or other compressing fs then used space will be reduced somewhat. To be completely honest, the main issue with that issue is the low level of interest in it, otherwise I think it would have been done already.

BTW, you can keep deleting all of the 'manifest' files, which should reduce the footprint by up to 2/3.

What about writing a grant application to achieve that pressing goal? Yes?

It's not something I'm familiar with, and the goal needs to be better-defined. AFAIC, Wyng is fully functional now and the only things a typical user would really miss would be a GUI and perhaps a way to mount archived volumes.

tlaurion commented 4 months ago

What about writing a grant application to achieve that pressing goal? Yes?

It's not something I'm familiar with, and the goal needs to be better-defined. AFAIC, Wyng is fully functional now and the only things a typical user would really miss would be a GUI and perhaps a way to mount archived volumes.

@tasket Can you contact me over Matrix /mastodon/QubesOS forum?

tlaurion commented 4 months ago

@marmarek @tasket @rapenne-s should we team up for a grant application? Interest?

rapenne-s commented 4 months ago

@marmarek @tasket @rapenne-s should we team up for a grant application? Interest?

I'd be happy to work on this

tlaurion commented 4 months ago

@marmarek @tasket @rapenne-s should we team up for a grant application? Interest?

I'd be happy to work on this

Random thoughts for grant application.

What I envision to be done as of now:

@rapenne-s I think you are an amazing candidate for documentation and insights on infra optimization.

@marmarek @marmarta @DemiMarie UX and GUI integration is needed. Fill us in with high level requirements?

@tasket, of course, to tackle the wyng-backup and wyng-util-qubes work and to suggest other features still missing before massive adoption, with clear use cases.

@tlaurion : heads and general plumbing, facing NLnet, grant application writeup, validation of proof of work (PoW, normally PR) prior to request for payment (RfP) and project management. Heads integration, possibly using SecureDrop workstation use case to drive this.

@deeplow interest from FPF's SD, and interfacing with them on requirements? The goal here would be to have restorable states from a network-booted environment: pull states from the network and prepare drive content to be ready to use within minutes, hosted on FPF infra.

Maybe we should create a discussion on the QubesOS forum if there is interest, or in a separate issue? Whichever you see fit.


Notes on grant application process.

Grant work is paid upon PoW for scoped tasks. A grant application needs a high-level view of the whys and the deliverables, not so much the hows. When a grant application passes the first round, more details need to be given on scoped deliverables and costs, and work is paid once PoW validates that a scoped task has been reached.

So for teaming up here, we first need acceptance (consent) to engage in doing the work within a year after the grant application is accepted. Then comes scoping of general tasks, who will do the work, and the approximate funds to be budgeted for such high-level tasks, which are then broken down into smaller tasks upon project approval, in terms of deliverables paid upon PoW.

@rapenne-s @marmarek @marmarta @deeplow @DemiMarie @tasket : would you be willing to team up to expand on the needs of such integration and documentation, and agree on the first step, which is to consent to accomplishing such integration as a goal if a grant application were accepted to fund the work needed?

tasket commented 3 months ago

Aside from organizational plans, I've decided that there will have to be a beta5 now. The metadata caching issue @tlaurion pointed out needs to be addressed before v0.8 rc, along with a unicode-handling issue I identified. Fixes are now in the 08wip branch and I expect beta5 to be available within a week.

Session metadata will now be aged-out from /var/wyng before 3 days, by default. This can be controlled with an option: for example, --meta-reduce=on:0 will remove uncompressed metadata immediately, in which case the user should consistently see a ~2/3 reduction of /var usage.

marmarek commented 2 months ago

@tlaurion here is what I'd like from a backup solution, to be considered a replacement for the current (non-incremental) one:

  1. Backup should be integrity-protected. Attacker with write access to the backup archive should not be able to compromise dom0 on restore (resistance against malicious metadata modification), nor should be able to silently modify data (resistance against malicious data modification). Some attacks in this threat model are not easily avoidable - attacker can break the backup (or simply remove it) making it impossible to restore - that's acceptable risk. Protection against rollback is also likely non-trivial - rollback of a full backup archive is acceptable risk (but, nice to have if it could be detected), but rollback on individual VMs or even blocks should still be detected. See https://www.qubes-os.org/news/2017/04/26/qubes-compromise-recovery/#recovering-from-a-full-qubes-system-compromise for more explanation. The approach with using DispVM to extract data/metadata in such a model is okay (and even desirable).
  2. Backup should be encrypted by default. Any data leaving dom0 should be already encrypted, it shouldn't be necessary to require some external entity to do encryption to ensure backup confidentiality. At the very least the VMs data should be encrypted and their names. But ideally (nice to have), other metadata (like how big each VM is) should be encrypted too. Similar to above, some information leak is unavoidable - for example you can't hide amount of data in total, or amount of changes in each increment - that's okay. There can be an option to disable encryption if one wishes to.
  3. It should be possible to restore all VMs at once on a freshly installed system, with just access to the backup archive and its passphrase. I mean, restoring backup shouldn't require having any extra metadata, keys etc that were created when making the backup.
  4. It should be possible to restore an individual VM without touching others.
  5. Restoring a VM should restore all its metadata (properties, tags, what netvm is used etc)
  6. It would be nice to be able to restore to an older version of a VM, maybe even under a different name (but one can use qvm-clone, so it's easy to do). Nice to have.
  7. It should be possible to restore an archive made in older qubes version into newer qubes version. In other words: if archive format would change in the future, the tool should still support reading the old format
  8. It should be possible to restore into a different storage pool than the backup was initially created on.
  9. It should be possible to access the data (of individual VMs) without qubes (emergency restore). It can be an instruction how to do that manually (like we have right now), or a tool that works outside of qubes too (doesn't require LVM/btrfs/qubes-specific packages etc).
  10. It should be possible to backup into some popular cloud service (S3, Dropbox, Nextcloud, ...), ideally ("nice to have") without storing the whole backup archive in some intermediate wyng-aware place. So, ideally, backup target accessible with only simple object operations (get, put, list) should work ("nice to have"), but the minimal requirement is that backup archive stored in wyng-aware place (USB disk? local NAS?) can be then synced to some cloud without losing incremental properties. And similarly restore: nice to have if possible directly from the cloud service, but necessary to work when backup archive is retrieved from the cloud first (for example into a fresh USB disk). Wyng doesn't need to support every possible cloud service itself, but it should be possible to achieve with relatively simple external tool (like, s3cmd).
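
To make "simple object operations (get, put, list)" in point 10 a bit more concrete, here is a rough Python sketch. Everything in it is hypothetical: `ObjectStore`, `sync_new_objects` and the key names are invented for illustration and are not an existing Wyng, Qubes or cloud-provider API. It also assumes archive objects are write-once, so anything that gets rewritten in place (such as a root index) would need extra handling.

```python
class ObjectStore:
    """Hypothetical minimal store exposing only get, put and list operations."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def list_keys(self) -> set[str]:
        # The "list" operation; named list_keys to avoid shadowing the builtin.
        return set(self._objects)


def sync_new_objects(src: ObjectStore, dst: ObjectStore) -> list[str]:
    """Upload only the objects missing at the destination; return their keys."""
    missing = sorted(src.list_keys() - dst.list_keys())
    for key in missing:
        dst.put(key, src.get(key))
    return missing


# A local archive is synced twice; the second sync only transfers the new object.
local, cloud = ObjectStore(), ObjectStore()
local.put("chunk/0001", b"a")
local.put("chunk/0002", b"b")
first = sync_new_objects(local, cloud)
local.put("chunk/0003", b"c")
second = sync_new_objects(local, cloud)
print(first, second)
```

Under those assumptions, keeping the incremental properties in the cloud copy comes down to never re-uploading objects the destination already holds.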

Those are the main ones on the functional side. Some are optional (marked "nice to have"). Some (if not most) are already satisfied by wyng, but I've written them down anyway.

Some of those apply to Wyng itself, some to its integration with Qubes OS. Let's discuss how we can make this happen :)

tlaurion commented 2 months ago

@tasket this needs a thorough project status update from you here!

tasket commented 2 months ago

@marmarek Thanks for taking time to post your queries/requirements. My answers follow:

  1. Authentication & integrity checks should already be thoroughly addressed by the Wyng and wyng-util-qubes code and archive format: The Wyng format spec shows a hierarchical structure where everything is hash-checked from the root file (archive.ini) downward, which is how the Wyng code operates. (However, this is not very complex as there are only 3 levels of metadata, 1 of data.) This also means every bit of an archive, such as various volumes, must validate in lock-step fashion. The root file is updated with new hashes as archive elements change, and it is AEAD authenticated so any subsequent access after initialization is also authenticated.

    Rollback protection: Apart from the internal hashing described above, which prevents piecemeal replacement with older authenticated messages, on startup Wyng persistently compares the locally-cached archive root with the remote. It first checks for an exact match, and if not exact then the internal timestamp in the cache cannot be newer. The root must pass the AEAD decryption phase before the timestamps are compared. This protects against whole-archive rollbacks. There are also comparisons made to protect encryption counters. Otherwise, if there is no current cache of the root present (such as when restoring to a new system or moving back and forth between systems) then the user must be careful to check the 'Updated at' time that is displayed when the archive is accessed, or at least take heed of the session date-times that are present in the archive.

    The wyng-util-qubes code doesn't mix-and-match volumes from different backup sessions in a single invocation; the user either has to specify a session date-time, or else any VMs you request for restore must all be present in the same session (users who want to restore from more than one session may run the util more than once).

  2. The Wyng default is to encrypt the archive, or to fail during archive creation if encryption dependency is not present. A user must specify --encrypt=off to create an unencrypted archive. Further, Wyng displays the (un)encrypted status when accessing an archive.

    Volume details like name and actual size are encrypted, however the amount of data in compressed/deduplicated form is visible without decryption. Also, the backup session date-times are visible (these are typically close to the Unix timestamps on component files, so was not identified by me as a must-have); the in-volume data chunk addresses are also visible, for similar reasons and that the resolution is fairly low. Also, which chunks are hard-linked to each other are visible in the case where deduplication is used. The volume names are encrypted and a volume ID number is used for the on-disk representation instead.

  3. Only the passphrase is required for archive decryption; the key is derived from only the passphrase and the salt stored in the archive. There is one package dependency for decryption (python3-pycryptodomex, which is already in dom0 by default) and another one if the archive was created with zstandard compression.

  4. Individual VMs may be specified, although in that case there will be no restriction on which backup session can be selected. Currently, the util's default will not restore dispVMs unless they are specified by name. The --include-disposable option must be specified to restore dispVMs implicitly. To restore an entire backup session, a user can run sudo wyng-util-qubes restore --session=<date-time> --include-disposable=on --dest=<archive URL>. The util will use either the Qubes default storage pool or an existing VM's pool (the default can be overridden with the --pool option).

  5. VM settings are given a best-effort approach similar to qvm-backup-restore, and the util will return a latent error at completion if a setting wasn't possible. The restored metadata are: prefs (properties), features, tags, devices. VM names are always preserved, so that existing VMs will be overwritten; failure during restore leaves the VM tagged with _'restoreincomplete'.

  6. Older versions can be restored by specifying --session, but not as different VM names. Wyng itself has a --save-to option that only works for individual volumes.

  7. There has been one format transition of note, from V2 to V3, that requires the user run wyng arch-check --upgrade-format in the current Wyng beta release (however, the upgrade doesn't add some of the V3 features, like encryption). It's my intention that future versions of Wyng would be able to at least list + extract volume data from V3 format archives onward.

  8. Restoring into pools is determined by the receiving Qubes system pool configuration; the name/path info in the archive should be considered as relative. As described earlier, the Qubes default pool can be overridden with --pool for non-pre-existing VMs, otherwise the existing VM's pool will be used. User removal of an existing VM before restore would be a precondition for controlling where it's restored to.

  9. Non-Qubes access of data volumes is already possible just by using Wyng directly on a typical Linux distro; how a user finds which volumes belong to which VMs could be documented with just a few paragraphs. I've started testing Wyng on FreeBSD to see where its retrieval functions can be made more portable, but for Linux distros Wyng itself is intended to run full-featured on them.

  10. Cloud protocol backups are planned for the next (https://github.com/tasket/wyng-backup/issues/197) release of Wyng after v0.8. Currently Wyng needs either filesystem access (local mount or FUSE), or access via a helper script over ssh which uses basic shell commands + CPython.

    For backing up the backup, simple rsync -aH --delete is recommended (with a caveat—the copy should be renamed '.updating' or similar during updates so that an aborted update isn't mistaken for whole). Incremental updates with this method should yield rsync run times that are proportional to the incremental delta. There is an issue to eventually provide an internal archive duplication function for convenience and increased efficiency.
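
As an illustration of the top-down hash checking described in point 1, here is a simplified Python sketch. It is not Wyng's code or its real archive layout: the toy archive has only two metadata levels (a root and per-volume manifests) instead of three, the file names are invented, and `trusted_root_hash` merely stands in for the AEAD authentication of the root file.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_archive(volumes: dict[str, list[bytes]]) -> dict[str, bytes]:
    """Build a toy archive: root -> per-volume manifest -> data chunks."""
    store: dict[str, bytes] = {}   # path -> content, stands in for files at the destination
    root: dict[str, str] = {}
    for name, chunks in volumes.items():
        manifest = [digest(c) for c in chunks]
        for chunk_hash, chunk in zip(manifest, chunks):
            store[f"{name}/{chunk_hash}"] = chunk
        manifest_bytes = json.dumps(manifest).encode()
        store[f"{name}/manifest"] = manifest_bytes
        root[name] = digest(manifest_bytes)   # the root records each manifest's hash
    store["root"] = json.dumps(root, sort_keys=True).encode()
    return store


def verify_archive(store: dict[str, bytes], trusted_root_hash: str) -> bool:
    """Check every level from the root downward; fail on any mismatch or missing file."""
    try:
        if digest(store["root"]) != trusted_root_hash:
            return False
        for name, manifest_hash in json.loads(store["root"]).items():
            manifest_bytes = store[f"{name}/manifest"]
            if digest(manifest_bytes) != manifest_hash:
                return False
            for chunk_hash in json.loads(manifest_bytes):
                if digest(store[f"{name}/{chunk_hash}"]) != chunk_hash:
                    return False
    except KeyError:
        return False   # a referenced file is missing from the archive
    return True


store = build_archive({"vol1": [b"aaaa", b"bbbb"]})
trusted = digest(store["root"])
ok_before = verify_archive(store, trusted)

# Tamper with a single data chunk; the mismatch is caught via its manifest entry.
store["vol1/" + digest(b"bbbb")] = b"evil"
ok_after = verify_archive(store, trusted)
print(ok_before, ok_after)
```

Because each level only trusts hashes recorded one level above it, authenticating the root is enough to cover every manifest and chunk below it.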

CC @tlaurion

mooreye commented 2 months ago

One more nice-to-have thing I would like to see: possibility to exclude a directory/file (not sure if possible since wyng works on LVMs) or at least backup just the VM metadata (name, color, netvm, template name, etc.) without backing up its contents.

Example use case: You have a "media" qube where you watch videos offline. You download a playlist from youtube with yt-dlp which you want locally only temporarily, and you will delete it when done watching it. Excluding the directory where it lives from backups would mean you don't have to remember to move it out of the VM before backup. Or, if excluding a directory is not possible, you could dedicate a media-tmp qube to it and back up only its metadata (not ideal, since things like the media player's local config will not be backed up). Currently the only workarounds seem to be not backing up the media VM at all until you delete the temp videos (not ideal), or creating a partition outside of Qubes and mounting it in the media qube (cumbersome...) so the backup utility won't touch it.

tlaurion commented 1 week ago

@marmarek @tasket https://github.com/QubesOS/qubes-issues/issues/858#issuecomment-2094344093: ping