eslam-gomaa / virt-backup

Fully backup your KVM Virtual Machines
Snapshot checksums don't validate even though they are identical #4

Closed justinhartman closed 2 years ago

justinhartman commented 2 years ago

While I recognise you haven't officially tested this on Debian, I have installed the project on Debian 11 and have been running it with great success. The key issue, however, is that you cannot restore a backup made with the --with-snapshots (-s) option. The backup succeeds with the snapshot option set, but the restore always fails the checksum comparison. The interesting part is that the checksums of the original and the restored version are identical, save for the order in which they appear.

Have a look at this output:

$ virt-backup --restore \
 --backup-file /home/backups/images/base-ubuntu18.zip \
 --restore-dir ~/

[ INFO ] Restoring VM: (base-ubuntu18), May take time based on size
      => base-ubuntu18.checksum  [OK]
      => base-ubuntu18-base-ubuntu18_2021-10-16_01:49am-snap.xml  [OK]
      => base-ubuntu18.xml  [OK]
      => base-ubuntu18.qcow2  [OK]
[ INFO ] Backup restored successfully in (/root/base-ubuntu18)
[ INFO ] Getting checksum of the restored backup - May take time based on size
[ INFO ] Only Primary disk is detected for this backup
[ INFO ] Comparing backup MD5 vs restored MD5
[ Error ] Found checksum mismatch between backup and restored files
      => Difference: {"base-ubuntu18-base-ubuntu18_2021-10-16_0149am-snap.xml"=>["692b4090872999c1ea223bd5035ce79b", nil], "base-ubuntu18-base-ubuntu18_2021-10-16_01:49am-snap.xml"=>[nil, "692b4090872999c1ea223bd5035ce79b"]}
[ INFO ] Rolling back

Formatted more readably, the difference looks like this:

{
    "base-ubuntu18-base-ubuntu18_2021-10-16_0149am-snap.xml" => [
        "692b4090872999c1ea223bd5035ce79b", nil
    ], 
    "base-ubuntu18-base-ubuntu18_2021-10-16_01:49am-snap.xml" => [
        nil, "692b4090872999c1ea223bd5035ce79b"
    ]
}

I think it's safe to say that the checksums are in fact identical; the only difference appears to be the ordering of the checksum and nil.
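For reference, a diff of this shape (the checksum followed by nil under one key, nil followed by the checksum under another) is what a key-by-key comparison of two filename-to-MD5 maps produces when the same file is listed under two different names. In the output above, the two keys differ only by the colon in the timestamp (0149am versus 01:49am). Below is a minimal Ruby sketch of that effect. The helper `checksum_diff` and the `normalise` lambda are hypothetical and not taken from virt-backup, and which side lost the colon is an assumption based on the order of the pairs in the log.

```ruby
# Hypothetical helper, not virt-backup's own code: compare two
# {filename => md5} maps and keep only the entries that differ.
def checksum_diff(backup, restored)
  (backup.keys | restored.keys).each_with_object({}) do |name, diff|
    pair = [backup[name], restored[name]]
    diff[name] = pair unless pair[0] == pair[1]
  end
end

md5_sum = "692b4090872999c1ea223bd5035ce79b"

# Assumption: the recorded name has no colon, the restored file name has one.
backup   = { "base-ubuntu18-base-ubuntu18_2021-10-16_0149am-snap.xml"  => md5_sum }
restored = { "base-ubuntu18-base-ubuntu18_2021-10-16_01:49am-snap.xml" => md5_sum }

# Same checksum, different keys: each key is missing on one side.
diff = checksum_diff(backup, restored)
puts diff.inspect

# Applying the same (assumed) name normalisation to both sides before
# comparing leaves nothing to report.
normalise = ->(sums) { sums.transform_keys { |name| name.delete(":") } }
puts checksum_diff(normalise.(backup), normalise.(restored)).empty?
```

If this is what happens, the fix would be to derive the file name the same way when recording and when verifying the checksums, rather than to change how the values are ordered.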

OS / Build

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:    11
Codename:   bullseye
$ ruby -v
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x86_64-linux-gnu]
$ kvm --version
QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-11+deb11u1)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers

eslam-gomaa commented 2 years ago

Thank you, Justin.

I'll test it today.

eslam-gomaa commented 2 years ago

Hi @justinhartman

I've introduced automated testing and improved the snapshot restore behavior. It works fine on Debian 11 in my tests.

I'll close the issue now; if you still face the problem, please re-open it.

Test Script

The pipeline test log for Debian 11 (relevant part)

    debian11: Cloning into 'virt-backup'...
    debian11: Usage: /var/virt-backup/virt-backup.rb --backup | --restore [options]
    debian11:     -B, --backup                     Backup KVM VM
    debian11:     -R, --restore                    Restore KVM VM
    debian11:     -s, --with-snapshots             Backup the Snapshots along with the VM
    debian11:     -S, --system-disk-only           Backup the system disk only
    debian11:     -o, --original-vm                Original VM to be Cloned
    debian11:     -D, --save-dir                   Backup save directory
    debian11:     -d, --backup-file                ZIP File which represents the VM backup
    debian11:     -r, --restore-dir                Restore directory, with --restore
    debian11:     -c, --compression                Choose the compression level; Default: default

==> debian11: Running provisioner: Run Tests (shell)...
    debian11: Running: /tmp/vagrant-shell20220128-18310-hkt0la.sh
    debian11: 
    debian11: **************************************
    debian11:       Create a test VM
    debian11: **************************************
    debian11: 
    debian11: --2022-01-28 06:57:37--  https://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
   .....................................
   .....................................
    debian11: Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
    debian11: HTTP request sent, awaiting response... 200 OK
    debian11: Length: 13287936 (13M) [application/octet-stream]
    debian11: Saving to: ‘cirros.img’
    debian11: 
    debian11:      0K .......... .......... .......... .......... ..........  0% 5.43M 2s
    debian11:     50K .......... .......... .......... .......... ..........  0% 9.32M 2s
    debian11:    100K .......... .......... .......... .......... ..........  1% 5.71M 2s
    .............................................................
   .............................................................
    debian11:  12950K .......... .......... ......                          100% 11.2M=1.9s
    debian11: 
    debian11: 2022-01-28 06:57:40 (6.53 MB/s) - ‘cirros.img’ saved [13287936/13287936]
    debian11: 
    debian11: WARNING  OS name 'rhel7' is deprecated, using 'rhel7.0' instead. This alias will be removed in the future.
    debian11: WARNING  Requested memory 512 MiB is less than the recommended 1024 MiB for OS rhel7.0
    debian11: 
    debian11: Starting install...
    debian11: Domain creation completed.
    debian11: 
    debian11: **************************************
    debian11:       Take snapshots
    debian11: **************************************
    debian11: 
    debian11: Domain snapshot test-running created
    debian11: Domain 'cirros' destroyed
    debian11: 
    debian11: Domain snapshot test-shutdown created
    debian11: Domain 'cirros' started
    debian11: 
    debian11: 
    debian11: **************************************
    debian11:       Backup the VM
    debian11: **************************************
    debian11: 
    debian11: 
    debian11: [ INFO ] Current VM State: running
    debian11: [ INFO ] Pausing the VM
    debian11: [ INFO ] Getting checksum of the Files will be backed up - May take time based on size
    debian11: [ INFO ] Backing up VM: (cirros), N of disks: (1) - May take time based on size
    debian11:     => cirros.checksum  [OK]
    debian11:     => cirros-test-running-snap.xml  [OK]
    debian11:     => cirros-test-shutdown-snap.xml  [OK]
    debian11:     => cirros.xml  [OK]
    debian11:     => cirros.img  [OK]
    debian11: [ INFO ] Resuming the VM
    debian11: [ INFO ] Current VM State: running
    debian11: [ INFO ] Backup stored successfully in (/var/lib/libvirt/images/backup/cirros.zip)
    debian11: 
    debian11: **************************************
    debian11:       Delete Original VM
    debian11: **************************************
    debian11: 
    debian11: Domain snapshot test-shutdown deleted
    debian11: 
    debian11: Domain snapshot test-running deleted
    debian11: 
    debian11: Domain 'cirros' destroyed
    debian11: 
    debian11: Domain 'cirros' has been undefined
    debian11: 
    debian11: 
    debian11: **************************************
    debian11:       Restore the VM
    debian11: **************************************
    debian11: 
    debian11: 
    debian11: [ INFO ] Restoring VM: (cirros), May take time based on size
    debian11:     => cirros.checksum  [OK]
    debian11:     => cirros-test-running-snap.xml  [OK]
    debian11:     => cirros-test-shutdown-snap.xml  [OK]
    debian11:     => cirros.xml  [OK]
    debian11:     => cirros.img  [OK]
    debian11: [ INFO ] Backup restored successfully in (/var/lib/libvirt/images/cirros)
    debian11: [ INFO ] Getting checksum of the restored backup - May take time based on size
    debian11: [ INFO ] Comparing backup MD5 vs restored MD5
    debian11: [ INFO ] MD5 check is OK :)
    debian11: [ INFO ] Updating disk location with the restored dir
    debian11: [ INFO ] Defining the restored VM: (cirros)
    debian11: [ INFO ] VM: (cirros) defined successfully
    debian11:     => Domain 'cirros' defined from /var/lib/libvirt/images/cirros/cirros.xml
    debian11: [ INFO ] Updating snapshots disks location with the restored dir
    debian11: [ INFO ] Restoring Internal Snapshots - (2) detected
    debian11: [ INFO ] (1) snapshots in RUNNING/PAUSED state detected
    debian11: [ INFO ] Starting the VM: (cirros)
    debian11: 
[ INFO ] Waiting for: 60 seconds  
[ INFO ] Waiting for: 59 seconds  
............................................
............................................
[ INFO ] Waiting for: 1 seconds  
[ INFO ] Waiting for: 0 seconds  
    debian11:     => Snapshot: (test-running) Restored Successfully
    debian11:     => Snapshot: (test-shutdown) Restored Successfully
    debian11: [ INFO ] Reverting to the last snapshot: (test-shutdown)
    debian11:     => Snapshot: (test-shutdown) Reverted Successfully
    debian11: 
    debian11: **************************************
    debian11:            End of tests
    debian11: **************************************
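
The two checksum steps in the log above ("Getting checksum of the restored backup" and "Comparing backup MD5 vs restored MD5") can be illustrated with a short Ruby sketch. This is not virt-backup's implementation: the helper `md5_by_name` is hypothetical, and the temporary directories and file contents are placeholders standing in for a backup and its restored copy.

```ruby
require "digest"
require "tmpdir"

# Hypothetical helper: map every regular file in dir to its MD5 hex digest.
def md5_by_name(dir)
  Dir.children(dir).sort.each_with_object({}) do |name, sums|
    path = File.join(dir, name)
    sums[name] = Digest::MD5.file(path).hexdigest if File.file?(path)
  end
end

sums_match = nil
Dir.mktmpdir do |backup_dir|
  Dir.mktmpdir do |restore_dir|
    # Placeholder content standing in for the backed-up and restored files.
    File.write(File.join(backup_dir, "cirros.xml"), "<domain/>")
    File.write(File.join(restore_dir, "cirros.xml"), "<domain/>")

    # The maps are equal only if both names and digests agree.
    sums_match = md5_by_name(backup_dir) == md5_by_name(restore_dir)
  end
end

puts sums_match ? "MD5 check is OK" : "checksum mismatch"
```

Because the comparison is keyed by file name, a file whose name changes between backup and restore counts as a mismatch even when its contents are byte-for-byte identical.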