lima-vm / lima

Linux virtual machines, with a focus on running containers
https://lima-vm.io/
Apache License 2.0

Unneeded convert and actual copy of basedisk for every instance when using qcow2 image #2580

Open nirs opened 2 months ago

nirs commented 2 months ago

Description

With the current scheme on macOS, each time I create a vz VM from the same cached image, there is a slow conversion from qcow2 to raw (#2579) to create the instance basedisk. Then we create the diffdisk for vz as a fast copy-on-write copy of the basedisk.

So we end up with:

Instead, we can convert the qcow2 image once after the download, then create basedisk as a fast copy-on-write copy of the cached image.

Suggested layout:

Of course this works only on file systems providing copy-on-write. When we don't have copy-on-write we can do actual copies.
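The fallback described above could look roughly like the sketch below. It is only an illustration, not Lima's implementation: the function name clone_or_copy is hypothetical, and it assumes macOS cp (where -c uses clonefile(2)) or GNU cp (where --reflink=auto clones when the file system supports it and copies otherwise).

```shell
# clone_or_copy SRC DST (hypothetical helper, not part of Lima)
# Try a copy-on-write clone first; fall back to an actual copy.
clone_or_copy() {
    src=$1
    dst=$2
    case "$(uname -s)" in
    Darwin)
        # macOS cp -c uses clonefile(2) when the file system supports it.
        cp -c "$src" "$dst" 2>/dev/null || cp "$src" "$dst"
        ;;
    *)
        # GNU cp: reflink when possible, otherwise a regular copy.
        # The plain cp fallback covers cp implementations without --reflink.
        cp --reflink=auto "$src" "$dst" 2>/dev/null || cp "$src" "$dst"
        ;;
    esac
}
```

With a helper like this, creating basedisk from a cached raw image would be a single call, for example clone_or_copy xxx.raw basedisk, and it costs a full copy only when the file system cannot clone.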

For qemu we can use qcow2 images with a shared template as the backing file, but this is very hard to manage correctly. You must ensure that template images are never deleted or changed while an instance image uses them as a backing file. If we fail, the instance image will be corrupted when you start the instance.

jandubois commented 2 months ago

I think the simple solution would be to keep the converted image in the regular cache as well, so all instances use a copy-on-write copy of the cached image.

We do need to figure out how to deal with verifying checksums, but maybe we should just trust the conversion and only verify that the checksum of the original image matches.

nirs commented 2 months ago

Great question!

For verifying checksums we can use blksum: https://gitlab.com/nirs/blkhash

It requires libnbd and qemu-nbd for reading qcow2 images (or any other format), but if you have your own qcow2 reader you can use the blkhash library via cgo.

But if you want an easy solution with existing tools, you can do this:

curl ... -o xxx.qcow2
qemu-img convert -f qcow2 -O raw xxx.qcow2 xxx.raw
qemu-img compare xxx.qcow2 xxx.raw

The checksum published for the qcow2 image can be used to verify the download, but it is useless for verifying the converted image, since it is a checksum of the actual qcow2 image and not of the content of the image.
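Verifying the download against the published checksum could be sketched as follows. This is a minimal example under assumptions: the helper name verify_download is hypothetical, and the published checksum is assumed to be a SHA-256 digest.

```shell
# verify_download FILE EXPECTED_SHA256 (hypothetical helper, not part of Lima)
# Succeeds only if the SHA-256 digest of FILE equals EXPECTED_SHA256.
# This checks the bytes of the downloaded file, not the guest-visible content.
verify_download() {
    file=$1
    expected=$2
    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    else
        # Fall back to shasum, which is available on macOS.
        actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
    fi
    [ "$actual" = "$expected" ]
}
```

A check like this would run once, right after the download and before the conversion; the converted raw image still needs a content-level check such as qemu-img compare.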

qemu-img compare compares the image content seen by the guest, regardless of the image format (including compression). It skips unallocated areas, so you read and compare only the actual data, and you don't pay for hash computation.

You can create a new checksum for the raw image if you want to verify that an image matches the downloaded qcow2 image or another image copied from this raw image. But this checksum will be slow since it will compute a hash for the entire image, including the unallocated areas.

The same process can be done with one command and no additional space with blksum, and much faster, since it computes hashes only for data blocks and can use multiple threads:

blksum xxx.raw
blksum xxx.qcow2

This does not work on macOS yet since libnbd is not available there; someone needs to port it and package it.

blkhash is not packaged for anything yet, but you can use it from the copr repo for rpm-based distros. If you want deb or other packages I'm happy to get patches :-)

afbjorklund commented 2 months ago

I think the reason for the current library was to avoid a dependency on tools like qemu-img (and blksum).

So the features should probably go in the go-qcow2reader. I think it can also do file cloning, and real COW?

i.e. on APFS, you can do cp -c (unix.Clonefile)

     -c    copy files using clonefile(2).  Note that if clonefile(2) is not
           supported for the target filesystem, then cp will fallback to using
           copyfile(2) instead to ensure the copy still succeeds.

AkihiroSuda commented 2 months ago

We should also remove basedisk