milkey-mouse / backup-vm

Back up a full image of a libvirt-based VM using Borg
MIT License

Dump disk images using qemu #18

Open milkey-mouse opened 6 years ago

milkey-mouse commented 6 years ago

If a VM has a complex chain of disks (e.g. you want to back up snapshots that were already created, or the VM simply uses a qcow2 that already had a backing disk before backup-vm ran), not all of the VM's content is actually backed up, just the last overlay image.

It should be fine to recursively run qemu-img on the disks in the domain to make sure the images don't depend on other images that should also be backed up. This option should have a CLI flag, because in the case of e.g. a common fresh Debian install base with overlay images for different software, it would be annoying to have many copies of the base image.
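A minimal sketch of walking the chain, assuming `qemu-img info --backing-chain --output=json` is available on the host. The helper names `parse_backing_chain` and `backing_chain` are hypothetical and not part of backup-vm; qemu-img follows the chain itself here, so no manual recursion is needed.

```python
import json
import subprocess
from typing import List


def parse_backing_chain(info_json: str) -> List[str]:
    """Return image filenames, top overlay first, from the JSON array
    printed by `qemu-img info --backing-chain --output=json`."""
    return [entry["filename"] for entry in json.loads(info_json)]


def backing_chain(image: str) -> List[str]:
    """Ask qemu-img for every image that `image` depends on (hypothetical helper)."""
    result = subprocess.run(
        ["qemu-img", "info", "--backing-chain", "--output=json", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_backing_chain(result.stdout)
```

A CLI flag could then decide whether to archive every entry returned by `backing_chain()` or only the first one (the top overlay).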

A better solution might be to read the disks the same way as qemu itself, which could probably be accomplished with a simple qemu-img convert -O raw <image> -. (The image should always be exported as raw regardless of input format because borg will do its own deduplication & compression.)

milkey-mouse commented 6 years ago

Doing the latter option above would break the restore-vm code, since all the images would be "flattened" and couldn't be restored to the same place. Keeping backwards compatibility with existing backups in mind, perhaps backup-vm should store the hash of each disk image in the chain (regardless of whether it's been backed up or not) in a HASHES file in the backup. restore-vm should then (after prompting the user) follow the chain from the bottom, stop at the first file that differs, and rebase the backed-up image on top of the last unchanged one (something like qemu-img convert -O qcow2 -o backing_file=last-unchanged.img /tmp/mount/sda.img first-changed.img).
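The chain comparison could look roughly like the sketch below. It assumes SHA-256 as the hash and a plain path-to-digest mapping as the in-memory form of the HASHES file; neither is decided in this issue, and `sha256_file` and `first_changed` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Optional


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a disk image in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_changed(chain_base_first: List[str],
                  recorded: Dict[str, str]) -> Optional[int]:
    """Walk the chain from the base image upward and return the index of the
    first image that is missing, unreadable, or whose hash differs from the
    recorded one. Returns None if every image still matches."""
    for index, path in enumerate(chain_base_first):
        try:
            current = sha256_file(path)
        except OSError:
            return index  # treat a missing or unreadable image as changed
        if recorded.get(path) != current:
            return index
    return None
```

If `first_changed()` returns `i > 0`, the image at index `i - 1` is the last unchanged one and would be the `backing_file` in the rebase step; if it returns `0`, nothing in the chain can be reused.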

milkey-mouse commented 5 years ago

Crazy idea to generalize even further & support libvirt backends other than QEMU, as long as they support snapshots:

  1. Snapshot the VM's disks & pivot to the snapshot, same as we're doing right now.
  2. Create a new VM with the same (snapshots of) disks attached as read-only.
  3. In the new VM, boot a minimal image with a worker (nothing more than a C program in an initramfs) which dumps the contents of its disks to the outside backup-vm instance (over a serial port, virtio channel, etc.)
  4. Use the ideas above to take these raw streams and format them into coherent backups.