Closed by nwf 3 years ago
I haven't looked into the resulting download sizes, but it would be a bit more convenient. We could try pushing both versions (and maybe the qcow2 as well) to docker hub and compare how much is downloaded for each.
Using QCow2 looks like an easy win:
~/cheri/output> qemu-img info cheribsd-morello-purecap.img
image: cheribsd-morello-purecap.img
file format: raw
virtual size: 2.72 GiB (2916090368 bytes)
disk size: 2.7 GiB
~/cheri/output> qemu-img info cheribsd-morello-purecap.qcow2
image: cheribsd-morello-purecap.qcow2
file format: qcow2
virtual size: 2.72 GiB (2916090368 bytes)
disk size: 1.5 GiB
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false
extended l2: false
However, xz -9 is still significantly smaller and I doubt docker compresses layers that much?
-rw-r--r-- 1 alex staff 2916090368 10 Jun 14:58 cheribsd-morello-purecap.img
-rw-r--r-- 1 alex staff 200731036 10 Jun 14:58 cheribsd-morello-purecap.img.xz
-rw-r--r-- 1 alex staff 1589379072 26 Aug 11:53 cheribsd-morello-purecap.qcow2
-rw-r--r-- 1 alex staff 199961488 26 Aug 11:53 cheribsd-morello-purecap.qcow2.xz
I wonder if it makes sense to decompress automatically inside the ENTRYPOINT?
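For reference, decompress-on-first-start could look roughly like the sketch below. It is written in Python using only the standard lzma module; the helper name ensure_decompressed and the choice to write the image next to its .xz are assumptions for illustration, not something the existing Dockerfile does.

```python
import lzma
import shutil
from pathlib import Path


def ensure_decompressed(xz_path: Path) -> Path:
    """Decompress foo.img.xz to foo.img unless foo.img already exists.

    Hypothetical helper for illustration; not part of the existing image.
    """
    out_path = xz_path.with_suffix("")  # strips only the trailing ".xz"
    if out_path.exists():
        return out_path  # already decompressed on an earlier start

    # Write to a temporary name first so an interrupted run does not
    # leave a truncated image that looks complete.
    tmp_path = out_path.with_name(out_path.name + ".partial")
    with lzma.open(xz_path, "rb") as src, open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    tmp_path.replace(out_path)
    return out_path
```

An ENTRYPOINT script would call this on the image (and kernel) it is about to boot and then exec QEMU. The trade-off is that every fresh container pays the decompression time and writes the full raw image into its writable layer.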
I'm entertained that the .qcow2.xz is smaller than the .img.xz. It looks like docker used to xz their images but stopped due to tar implementation compatibility concerns. :(
I'd rather not decompress in the ENTRYPOINT, if we can get away with it. Note that if you add -c to the qemu-img convert -O qcow2 command, the gain of xz is less significant, as the disk image transparently uses zlib on its data:
$ qemu-img convert -O qcow2 cheribsd-riscv64-purecap.img -c cheribsd-riscv64-purecap.qcow2
-rw-r--r-- 1 root root 456065024 Aug 26 11:47 cheribsd-riscv64-purecap.qcow2
Newer qemus also have support for zstd compression within qcow2 itself, which might further reduce the margin, as per https://wiki.qemu.org/ChangeLog/5.1 , but, experimentally, it only shaves off a little bit. I had to build qemu-img in an environment with libzstd-dev installed; Debian apparently doesn't do that by default. Anyway, /cheri/out/mainline/sdk/bin/qemu-img convert -O qcow2 -o compression_type=zstd -c cheribsd-riscv64-purecap.img cheribsd-riscv64-purecap.zstd.qcow2 generated a 425328640 byte image, which is ~7% smaller than 456065024, but not quite as impressive as the factor of two smaller that the .img.xz would get us.
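The ~7% figure can be checked from the two byte counts quoted in this comment. A minimal sketch; the variable names are only for illustration:

```python
# Byte counts quoted above for cheribsd-riscv64-purecap.
zlib_qcow2 = 456065024  # qemu-img convert -O qcow2 -c
zstd_qcow2 = 425328640  # same, plus -o compression_type=zstd

# Fraction of the zlib-compressed size that zstd saves.
saving = (zlib_qcow2 - zstd_qcow2) / zlib_qcow2
print(f"zstd saves {saving:.1%} over zlib")  # prints: zstd saves 6.7% over zlib
```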
Still, I think being able to use the disk images directly, without needing to decompress them separately, is worth the factor of two. If the transport layer ever does become compressed again, it looks like it will be able to recover the difference, and so it'd merely be a matter of disk space.
-rw-r--r-- 1 Jess staff 183M 26 Aug 14:32 cheribsd-aarch64.gz.qcow2
-rw-r--r-- 1 Jess staff 1.8G 26 Aug 13:43 cheribsd-aarch64.img
-rw-r--r-- 1 Jess staff 98M 26 Aug 13:43 cheribsd-aarch64.img.xz
-rw-r--r-- 1 Jess staff 611M 26 Aug 14:20 cheribsd-aarch64.qcow2
-rw-r--r-- 1 Jess staff 182M 26 Aug 14:32 cheribsd-morello-purecap.gz.qcow2
-rw-r--r-- 1 Jess staff 1.8G 26 Aug 13:43 cheribsd-morello-purecap.img
-rw-r--r-- 1 Jess staff 96M 26 Aug 13:43 cheribsd-morello-purecap.img.xz
-rw-r--r-- 1 Jess staff 614M 26 Aug 14:20 cheribsd-morello-purecap.qcow2
-rw-r--r-- 1 Jess staff 435M 26 Aug 14:33 cheribsd-riscv64-purecap.gz.qcow2
-rw-r--r-- 1 Jess staff 4.6G 26 Aug 13:43 cheribsd-riscv64-purecap.img
-rw-r--r-- 1 Jess staff 237M 26 Aug 13:43 cheribsd-riscv64-purecap.img.xz
-rw-r--r-- 1 Jess staff 1.4G 26 Aug 14:20 cheribsd-riscv64-purecap.qcow2
-rw-r--r-- 1 Jess staff 242M 26 Aug 14:33 cheribsd-riscv64.gz.qcow2
-rw-r--r-- 1 Jess staff 4.0G 26 Aug 13:43 cheribsd-riscv64.img
-rw-r--r-- 1 Jess staff 135M 26 Aug 13:43 cheribsd-riscv64.img.xz
-rw-r--r-- 1 Jess staff 795M 26 Aug 14:20 cheribsd-riscv64.qcow2
FWIW (.gz.qcow2 being qemu-img -c), latest Jenkins artifacts
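As a rough check of the "factor of two" against this listing, the sketch below divides each compressed qcow2 size by the matching .img.xz size. The numbers are copied from the ls -lh output above, so they are rounded; the dictionary layout is just for illustration.

```python
# Sizes in MB as printed (rounded) by ls -lh in the listing above.
sizes_mb = {
    "aarch64": {"gz_qcow2": 183, "img_xz": 98},
    "morello-purecap": {"gz_qcow2": 182, "img_xz": 96},
    "riscv64-purecap": {"gz_qcow2": 435, "img_xz": 237},
    "riscv64": {"gz_qcow2": 242, "img_xz": 135},
}

# How many times larger the compressed qcow2 is than the xz'd raw image.
ratios = {
    target: s["gz_qcow2"] / s["img_xz"] for target, s in sizes_mb.items()
}

for target, ratio in ratios.items():
    print(f"{target}: {ratio:.2f}x")
```

All four targets come out a little under 2x, in line with the estimate made earlier in the thread for riscv64-purecap.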
As per https://github.com/CTSRD-CHERI/cheri-docker-images/blob/c2834f8a78186f3477e0eee9ddc6e2d9e9a8f50e/cheribsd/qemu/Dockerfile#L6 we just copy the compressed artifacts. It might be nicer to uncompress them before adding them to the image (which should then recompress them, yes?), but I'm not sure if that's the tack we want to take, or if we want to avoid compressing them in the build pipeline in the first place, or convert them to qcow2 (thanks, Jess, for the suggestion) here or earlier in the pipeline, or do something else entirely.
As it stands, users have to unxz the kernel and image they want first, which isn't that much work, but is a little sad.