sabelka opened this issue 6 years ago
I tried the script, but it did not work for me:
# virsh snapshot-create-as vfw1 20180304 20180304-backup --disk-only --atomic
error: unsupported configuration: source for disk 'sda' is not a regular file; refusing to generate external snapshot name
Maybe this is because I use LVM logical volumes for the VMs' disk images? It does work when I add explicit image path names, though:
--diskspec sda,file=/data/vm/20180305-backup-sda.qcow2 --diskspec sdb,file=/data/vm/20180305-backup-sdb.qcow2
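For reference, the full invocation then looks roughly like this (a sketch; the snapshot name and file paths are taken from the snippet above, the rest from the original command):

# virsh snapshot-create-as vfw1 20180305 20180305-backup --disk-only --atomic \
    --diskspec sda,file=/data/vm/20180305-backup-sda.qcow2 \
    --diskspec sdb,file=/data/vm/20180305-backup-sdb.qcow2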
With the script from the bug report modified in that way, it worked. I let it run for 1000 loop iterations but could not reproduce the error. I also added a command that copies some data onto the VM's disks while the snapshot is active, in order to put some load on the block commit. Still, I did not get an error. The outer loop was nothing special, see below.
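A sketch of the test loop; snapshot-test.sh stands in for the modified script from the bug report (the name is hypothetical):

for i in $(seq 1 1000); do
    # the script snapshots the disks, copies data inside the guest while
    # the snapshot is active, then block-commits and pivots back
    ./snapshot-test.sh || { echo "failed at iteration $i"; exit 1; }
done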
Does the error show up every time with backup-vm? If you make another test VM with the same environment (i.e. LVM) and try to back it up twice, is it left in a similar inconsistent state?
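Something along these lines should do for the test VM (a sketch with assumed names: a volume group vg0, an existing template image, and a scsi bus to match your sda/sdb naming):

# lvcreate --size 10G --name testvm vg0
# qemu-img convert -O raw /data/vm/template.qcow2 /dev/vg0/testvm
# virt-install --name testvm --memory 1024 --import --noautoconsole \
    --disk path=/dev/vg0/testvm,bus=scsi

Then back it up twice and compare the output of "virsh domblklist testvm --details" after each run.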
I think this has something to do with either LVM, which I never explicitly tested backup-vm with (note to self, add this for #10), or some difference between how I snapshot/pivot disks and how virsh does it. The snapshot code is pretty much a direct port of the calls made in virsh, but I'll look again to see if virsh has made any changes that cause it to work when my version doesn't.
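For reference, the virsh-level flow being ported is essentially this (a sketch; the overlay path is hypothetical, and the exact flags backup-vm passes may differ slightly):

# virsh snapshot-create-as mail2 --no-metadata --disk-only --atomic \
    --diskspec sda,file=/data/vm/mail2-sda.tmp.qcow2
(copy the now-frozen base volume into the backup)
# virsh blockcommit mail2 sda --active --wait --pivot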
For a couple of days now I have been using "backup-vm" for some qemu/libvirt VMs, so far mostly successfully. Today, a backup failed with the following error:
The first backup of this VM a day earlier completed without errors, so either the first backup left the VM in some state that caused problems during the next run, or there was some non-deterministic (e.g. timing-dependent) issue in the second run.
The VM (called mail2) has three disks (LVM logical volumes):
After the failed backup, the VM's disks were in the following state:
I then tried to remove the snapshots manually, but only sdb was successful:
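(For external disk-only snapshots like these, manual removal means committing the overlay back into the base volume and pivoting, roughly:

# virsh blockcommit mail2 sdb --active --wait --pivot

This is the standard libvirt cleanup, not necessarily the exact command I ran.)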
Next, I shut the VM down and started it again. After that I was able to remove the snapshot, and the status of the disks was back to normal:
I wonder if this is an issue with libvirt and/or qemu (I have libvirt version 4.0.0 and qemu 2.9.0) or with "backup-vm". What could I do to debug things further?
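One idea: I could enable libvirt's debug logging before the next run and attach the log. These are standard libvirtd.conf settings (the log path is arbitrary):

log_filters="1:qemu 1:libvirt"
log_outputs="1:file:/var/log/libvirt/libvirtd-debug.log"

After restarting libvirtd, the log should show the exact monitor commands issued during the failing block commit/pivot.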