dguerri / LibVirtKvm-scripts

Libvirt/KVM scripts - online forward incremental backup for libvirt/KVM virtual machines
GNU General Public License v3.0

No clear documentation on restoring backups #48

Open JeremyRand opened 3 years ago

JeremyRand commented 3 years ago

The documentation is unclear on how to restore backups. It would be useful to document this clearly from an end-user perspective (preferably with an example command-line in the README).

AJRepo commented 3 years ago

If everything is consolidated then it can be as simple as replacing the current image(s) with the backup one(s) with suitable permissions. If the running disk image has a backing file that is part of a snapshot chain (forward or backward) then restoration is more complicated as you'd have to figure out which file(s) the VM will be using.

In the simple (fully consolidated) case, the procedure is: (1) stop the VM, (2) move the current VM file(s) out of the way, (3) put the backup file(s) where the original files were, with the same names and permissions as the files they replace, and (4) start the VM.

But if the last backup is part of a series of backing files, restoration requires moving all of those files too, and leaving the snapshot chain files un-renamed. So a set of generic restore instructions might give some people a false sense of confidence.

JeremyRand commented 3 years ago

What is the correct procedure for determining which backing files, if any, would also need to be moved?

AJRepo commented 3 years ago

There are lots of ways that disks for a VM can be set up. What follows assumes that the VM is running and needs its virtual system disk restored from backup. There are other ways to do this, but this one is quick and clean if you have console access to the system that runs the VM.

I happened to run into a situation on a client's system where (I think) a new kernel update happened to conflict with a backup and although restoration from a backup wasn't required, since I was new to their system, I documented the "where is stuff" process which might answer your question. YMMV.

Step 1. Find the backing files for the virtual machine (e.g. the virsh domain) you want to restore.

Let's say the name of the domain = 'm59_web01'

$ virsh domblklist m59_web01
Target     Source
------------------------------------------------
hda        -
vda        /path/to/m59_web01.bimg-20210509-030454

Here there is one disk file for the virtual machine. If the VM had more than one virtual hard drive (vda, vdb, and so on), they would all appear in that list.
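If you want to script this step, the table can be filtered down to just the source paths. This is a minimal sketch, not part of fi-backup: `disk_sources` is a hypothetical helper name, and it assumes the two-line header shown above and paths without spaces.

```shell
# Print the source file of every disk listed by `virsh domblklist`.
# Reads the table on stdin, skips the two header lines, blank lines,
# and devices with no source ("-"). Assumes paths contain no spaces.
disk_sources() {
    awk 'NR > 2 && NF >= 2 && $2 != "-" { print $2 }'
}

# Usage (assumes virsh is installed and the domain exists):
#   virsh domblklist m59_web01 | disk_sources
```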

Step 2. Check if that backing file is consolidated (should be) or not.

$ sudo qemu-img info /path/to/m59_web01.bimg-20210509-030454

image: /path/to/m59_web01.bimg-20210509-030454
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 2.9G
cluster_size: 65536
backing file: /path/to/m59_web01.bimg-20210124-024807
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

See the line "backing file"? This image has a backing file, /path/to/m59_web01.bimg-20210124-024807. You can get the full backing chain with:

$ sudo qemu-img info --backing-chain /path/to/m59_web01.bimg-20210509-030454

image: /path/to/m59_web01.bimg-20210509-030454
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 2.9G
cluster_size: 65536
backing file: /path/to/m59_web01.bimg-20210124-024807
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /path/to/m59_web01.bimg-20210124-024807
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 11G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Here the chain contains one backing file. There could be many.

If there are no backing files then skip step 3 below.
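To decide this without reading the output by eye, you could count the "backing file:" lines in the chain listing. This is only a sketch: `count_backing_files` is a hypothetical helper name, and it assumes the human-readable output format shown above.

```shell
# Count the backing files in `qemu-img info --backing-chain` output (stdin).
# The trailing space in the pattern keeps "backing file format:" lines
# from being counted. `|| true` hides grep's non-zero exit on zero matches.
count_backing_files() {
    grep -c '^backing file: ' || true
}

# Usage (assumes qemu-img is installed):
#   sudo qemu-img info --backing-chain /path/to/m59_web01.bimg-20210509-030454 | count_backing_files
```

A result of 0 means the image is already consolidated and step 3 can be skipped.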

Restoring a primary file together with all of its backing files is messier than it should be, so if there are backing files it is better to consolidate before backup/restore.

Step 3. Consolidate (Skip this step if no backing files)

First try using fi-backup to consolidate and take care of all the messiness.

$ sudo fi-backup.sh -c m59_web01

but if you get an error like:

$ sudo fi-backup.sh -c m59_web01
[ERR] Error consolidating block device '/path/to//m59_web01.bimg-20210509-030454' for 'm59_web01':
 error: block copy still active: disk 'vda' already in active block job

Then there's already a problem. Let's check what's going on with

$ sudo virsh blockjob m59_web01 /path/to/m59_web01.bimg-20210509-030454 --info
Active Block Commit: [100 %]

This has happened to others ( https://bugzilla.redhat.com/show_bug.cgi?id=1197592 ) and was resolved. If you get to this case then the process is to force the close with pivot. As in:

$ sudo virsh blockjob m59_web01 /path/to/m59_web01.bimg-20210509-030454 --pivot

which probably switched the VM's drive to a different file:

$ virsh domblklist m59_web01
Target     Source
------------------------------------------------
hda        -
vda        /path/to/m59_web01.bimg-20210124-024807
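If you script the stuck-block-job case, it may be worth confirming that the job really reports 100% before pivoting. This is a rough sketch: `blockjob_ready` is a hypothetical helper name, and it assumes the `--info` output looks like the "Active Block Commit: [100 %]" line shown above.

```shell
# Succeed (exit 0) only if `virsh blockjob ... --info` output on stdin
# reports a job at 100%, e.g. "Active Block Commit: [100 %]".
blockjob_ready() {
    grep -q '\[100 *%]'
}

# Usage (assumes virsh is installed; disk path as in the example above):
#   if sudo virsh blockjob m59_web01 /path/to/m59_web01.bimg-20210509-030454 --info | blockjob_ready; then
#       sudo virsh blockjob m59_web01 /path/to/m59_web01.bimg-20210509-030454 --pivot
#   fi
```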

Let's also check to see that it has no backing files:

$ sudo qemu-img info --backing-chain /path/to/m59_web01.bimg-20210124-024807
image: /path/to/m59_web01.bimg-20210124-024807
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 11G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Success. Now you have to also clean up the virtual system's pivot record:

Shut down the running VM. If, in the VM's system logs, you see something like

var/log/syslog:... m59web01 systemd[1]: Stopping Create final runtime dir for shutdown pivot root

then shut down and restart again until that "Stopping" message no longer appears on shutdown or boot. "Starting" and "Finishing" messages are OK.

At this point you have consolidated drives for your VM. Running fi-backup in consolidation mode should now work with no errors.

Step 4. Find where your backup is.

If you ran fi-backup, the backup is in the directory specified with -b. If you ran fi-backup with -v, you'd get a message like

[VER] Copy backing file '/path/to/m59_web01.bimg-20210124-024807' to '/path/to/backup/dir/m59_web01.bimg-20210124-024807'

/path/to/backup/dir/m59_web01.bimg-20210124-024807 is your backup file.
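Going by that one log line, the backup appears to keep the file name of the backing file and land directly in the backup directory. Under that assumption (it is inferred from the message above, not from the fi-backup source), the expected location can be computed like this; `backup_path` is a hypothetical helper name.

```shell
# Print where a backing file's copy is expected to be, assuming the backup
# keeps the original file name inside the directory given to fi-backup -b.
backup_path() {
    backup_dir=$1   # directory passed to fi-backup with -b
    source_file=$2  # backing file path, as shown by qemu-img info
    printf '%s/%s\n' "${backup_dir%/}" "$(basename "$source_file")"
}

# Usage:
#   ls -l "$(backup_path /path/to/backup/dir /path/to/m59_web01.bimg-20210124-024807)"
```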

Step 5. Shut down running VM.

Step 6. Move the file that was the current VM blockfile somewhere else.

Step 7. Move the backup file into where the current VM file was (same name and permissions)

Step 8. Start the VM
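Steps 5 to 8 could be scripted roughly as follows for a single, consolidated disk image. Treat this as a sketch under stated assumptions rather than a tested restore tool: `run` and `restore_disk` are hypothetical names, the script copies the backup instead of moving it so the backup stays intact, and `chown --reference` / `chmod --reference` assume GNU coreutils. Set DRY_RUN=1 first to print the commands without running them.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restore_disk() {
    domain=$1    # libvirt domain name, e.g. m59_web01
    current=$2   # disk image the VM uses now (from virsh domblklist)
    backup=$3    # consolidated backup image to restore

    # Step 5: shut down the VM. `virsh shutdown` returns before the guest
    # is actually down, so poll the state until it reads "shut off".
    run virsh shutdown "$domain"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        while [ "$(virsh domstate "$domain")" != "shut off" ]; do
            sleep 5
        done
    fi

    # Step 6: move the current disk image out of the way.
    run mv "$current" "$current.pre-restore"

    # Step 7: put the backup in its place, under the same name, then copy
    # owner and mode from the file it replaces (GNU coreutils options).
    run cp "$backup" "$current"
    run chown --reference="$current.pre-restore" "$current"
    run chmod --reference="$current.pre-restore" "$current"

    # Step 8: start the VM.
    run virsh start "$domain"
}

# Usage (as root; try DRY_RUN=1 first):
#   restore_disk m59_web01 /path/to/m59_web01.bimg-20210124-024807 \
#       /path/to/backup/dir/m59_web01.bimg-20210124-024807
```

The old image is kept as `<name>.pre-restore` so the step can be undone if the restored VM does not boot.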