JeremyRand opened 3 years ago
If everything is consolidated then it can be as simple as replacing the current image(s) with the backup one(s) with suitable permissions. If the running disk image has a backing file that is part of a snapshot chain (forward or backward) then restoration is more complicated as you'd have to figure out which file(s) the VM will be using.
If it's the simple (all consolidated) case, it's as simple as (1) stopping the VM, (2) moving the current VM file(s) out of the way, (3) putting the backup file(s) where the originals were (making sure the names/permissions match the files you've replaced), and (4) starting the VM.
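The file swap in steps 2–3 can be sketched as a small shell function. This is just a sketch under assumptions — the paths are placeholders, and the `virsh shutdown`/`virsh start` steps are left to the caller:

```shell
# Sketch: swap a consolidated backup image into the live image's place.
# Call this AFTER stopping the VM (e.g. `virsh shutdown DOM`) and
# BEFORE restarting it (`virsh start DOM`). Paths are placeholders.
restore_consolidated() {
    live=$1     # path the VM's domain XML points at
    backup=$2   # consolidated backup copy
    mv "$live" "$live.pre-restore"                  # keep the old image around
    cp "$backup" "$live"                            # backup takes the original name
    chown --reference="$live.pre-restore" "$live"   # match owner/group of old image
    chmod --reference="$live.pre-restore" "$live"   # match permissions of old image
}
# e.g. restore_consolidated /path/to/dom.qcow2 /path/to/backup/dom.qcow2
```

Keeping the displaced image as `.pre-restore` means a bad restore can be backed out by reversing the move.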
But - if the last backup is part of a series of backing files, then the restoration would require moving all of those files too, without renaming the files in the snapshot chain. So a set of generic restore instructions might give readers a false sense of confidence.
What is the correct procedure for determining which backing files, if any, would also need to be moved?
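One way to answer that mechanically is to walk the chain with `qemu-img info --backing-chain` (shown later in this thread) and collect every `image:` line. A sketch, assuming that output format:

```shell
# Sketch: list every file in a disk's backing chain.
# `qemu-img info --backing-chain` prints one "image: <path>" stanza per
# file in the chain, so extracting those lines gives the full set of
# files a restore would have to account for.
chain_files() {
    sed -n 's/^image: //p'    # one path per chain member, top first
}
# With qemu-img installed:
#   sudo qemu-img info --backing-chain /path/to/disk.qcow2 | chain_files
```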
There are lots of ways that disks for a VM can be set up. This assumes the VM is running and needs its virtual system disk restored from backup. There are lots more ways to do this, but the following is a quick and clean one if you have console access to the system that runs the VM.
I ran into a situation on a client's system where (I think) a new kernel update conflicted with a backup. Although restoration from a backup wasn't required, since I was new to their system I documented the "where is stuff" process, which might answer your question. YMMV.
Let's say the name of the domain = 'm59_web01'
$ virsh domblklist m59_web01
Target Source
------------------------------------------------
hda -
vda /path/to/m59_web01.bimg-20210509-030454
Here there is one disk file (vda) for the virtual machine. If you had more than one virtual hard drive, each would appear in that list.
$ sudo qemu-img info /path/to/m59_web01.bimg-20210509-030454
image: /path/to/m59_web01.bimg-20210509-030454
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 2.9G
cluster_size: 65536
backing file: /path/to/m59_web01.bimg-20210124-024807
backing file format: qcow2
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
See the backing file line? This image has a backing file, /path/to/m59_web01.bimg-20210124-024807. You can get the full backing chain with:
$ sudo qemu-img info --backing-chain /path/to/m59_web01.bimg-20210509-030454
image: /path/to/m59_web01.bimg-20210509-030454
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 2.9G
cluster_size: 65536
backing file: /path/to/m59_web01.bimg-20210124-024807
backing file format: qcow2
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
image: /path/to/m59_web01.bimg-20210124-024807
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 11G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
Here the chain contains one backing file. There could be many.
If there are no backing files then skip step 3 below.
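A quick scriptable way to make that check, sketched over the same `qemu-img info --backing-chain` output format as above:

```shell
# Sketch: count the backing files in a chain; 0 means fully consolidated.
backing_count() {
    grep -c '^backing file: ' || true   # grep exits non-zero on zero matches
}
# With qemu-img installed:
#   sudo qemu-img info --backing-chain /path/to/disk.qcow2 | backing_count
```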
Because restoring the primary file plus all of its backing files is messier than it should be, when backing files exist it is better to consolidate before backup/restore.
First try using fi-backup to consolidate and take care of all the messiness.
$ sudo fi-backup.sh -c m59_web01
but if you get an error like:
$ sudo fi-backup.sh -c m59_web01
[ERR] Error consolidating block device '/path/to//m59_web01.bimg-20210509-030454' for 'm59_web01':
error: block copy still active: disk 'vda' already in active block job
Then there's already a problem. Let's check what's going on with:
$ sudo virsh blockjob m59_web01 /path/to/m59_web01.bimg-20210509-030454 --info
Active Block Commit: [100 %]
This has happened to others ( https://bugzilla.redhat.com/show_bug.cgi?id=1197592 ) and was resolved. If you hit this case, the fix is to force the block job to complete with --pivot. As in:
$ sudo virsh blockjob m59_web01 /path/to/m59_web01.bimg-20210509-030454 --pivot
which likely left the VM using a different file for that drive:
$ virsh domblklist m59_web01
Target Source
------------------------------------------------
hda -
vda /path/to/m59_web01.bimg-20210124-024807
Let's also check to see that it has no backing files:
$ sudo qemu-img info --backing-chain /path/to/m59_web01.bimg-20210124-024807
image: /path/to/m59_web01.bimg-20210124-024807
file format: qcow2
virtual size: 15G (16106127360 bytes)
disk size: 11G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
Success. Now you also have to clean up the virtual system's pivot record:
Shut down the running VM. If, in the VM's system logs, you see something like
/var/log/syslog:... m59web01 systemd[1]: Stopping Create final runtime dir for shutdown pivot root
then shut down and restart again until you no longer get that "Stopping" message on shutdown/boot. "Starting" and "Finishing" messages are OK.
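Checking for the leftover message can be scripted from inside the guest. A sketch — the log path and the exact message wording are assumed from the snippet above:

```shell
# Sketch: scan guest log text for the leftover "shutdown pivot" unit
# message. Reads on stdin so it works on syslog or journalctl output.
pivot_msgs() {
    grep 'Stopping .*shutdown pivot' || true   # empty output = clean shutdown
}
# Inside the guest, something like:
#   pivot_msgs < /var/log/syslog
```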
At this point you have consolidated drive(s) for your VM. Running fi-backup in consolidation mode should now complete with no errors.
If you ran fi-backup, the backup would be in the directory specified with -b. If you ran fi-backup with -v, you'd get a message like:
[VER] Copy backing file '/path/to/m59_web01.bimg-20210124-024807' to '/path/to/backup/dir/m59_web01.bimg-20210124-024807'
/path/to/backup/dir/m59_web01.bimg-20210124-024807 is your backup file.
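Putting it together, a restore from that backup could look like the sketch below. The domain, target name, and paths come from the example above — verify them against your own `virsh domblklist` output before running anything:

```shell
# Sketch: locate the live disk path for a given target (e.g. vda) from
# `virsh domblklist` output, then swap the backup into that exact spot.
disk_path() {
    # usage: virsh domblklist DOM | disk_path vda
    awk -v t="$1" '$1 == t { print $2 }'
}
# With libvirt running, a restore would then be roughly:
#   SRC=$(sudo virsh domblklist m59_web01 | disk_path vda)
#   sudo virsh shutdown m59_web01                    # wait for it to stop
#   sudo mv "$SRC" "$SRC.pre-restore"
#   sudo cp /path/to/backup/dir/m59_web01.bimg-20210124-024807 "$SRC"
#   sudo chown --reference="$SRC.pre-restore" "$SRC"
#   sudo chmod --reference="$SRC.pre-restore" "$SRC"
#   sudo virsh start m59_web01
```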
The documentation is unclear on how to restore backups. It would be useful to document this clearly from an end-user perspective (preferably with an example command-line in the README).