[Feature Request] qcow2 support

alexrelis commented 2 years ago

I don't mind using Virtualbox to create the VM images, but it would be nice if vtoyboot (and the Windows Vhd boot plugin) supported fixed and dynamic qcow2, the default image files in QEMU/KVM since that is a lot of GNU/Linux users' preferred hypervisor.

Thanks for your work on Ventoy.

hgkamath commented 1 year ago

The TLDR of this comment is that using raw-img for boot-virtual-disks isn't so bad. There was a time when I fretted over assuming raw-imgs were less desirable. I am a fellow user. my 2-cents.

Firstly, ventoy can only chainload to a OS/kernel that can itself loop-mount its device-filesystem during bootstrap. If the kernel can't do it, then no way.

As of today, Linux has not made any virtual-disk format to be first-class inside the kernel. I think the reasons include

Kernel philosophy is that anything that complex is better implemented in a userspace process that uses some kernel services, and which implements most of its functionality in the linked user-space library-code.
There are so many virtual-disk formats
Its better to let disk formats evolve and improve instead of kernel preferring any or getting tied to any one.

There have been 3rd party attempts to create native qcow2 kernel module support in that past (such as qloop), but, they never got mainstreamed. So formats like qcow2, vhdx, vmdk etc, need userspace tools like guestfs, qemu, libvirt and other vdisk libraries (libvhdx, libvmdk etc) etc in order for the kernel to loop mount them. Hence, they may be auto-mounted later in the boot sequence.

For this reason, dynamic virtual-disks are out of scope of ventoy, until the first-class kernel support appears.

But, in order to do native-boot on vdisk-images, the boot-virtual-disks must be of fixed type. Firstly, they reside on a containing-partition with its own filesystem. Both windows and Linux, when native booted on a virtual-disk, cannot expand/shrink the system disk booted-image, as that could corrupt the inode-table, file-allocation-table of the filesystem on the containing-partition. Let's say, as of today, it is too difficult to implement dynamic image native boot in kernel. The OS/kernel could mount the raw-img vdisk containing partition as read-write, if ventoy chainloads with VTOY_LINUX_REMOUNT option, but the OS/user must not cause externally visible changes such as altering the size of the booted-virtual-disk-image file. So the boot-virtual-disk file-format must be of fixed-type (fixed-raw) and so must have a predecided fixed size.

For fixed disks, like fixed-vhd & fixed-vdi, I think, the trick used is to make the OS/Linux read-fields from vdisk headers and dm-map the contiguous block-allocation of disk image, which ranges from the starting-block to the ending-block inside the fixed-vdisk. And, so ventoy chainloaded OS/Linux is able to mount them using ventoy's "Linux vdisk boot plugin" and "windows vhd boot plugin" Ventoy support for chainloading fixed-vhdx and fixed-qcow2 may follow sometime whenever development catches up. I think this happens in the initramfs scripts, before the kernel pivots the root filesystem. The initramfs script mechanisms have to recognize the fixed-disk format and the correctly dm-map the blocks inside the fixed image and present to the OS a linear contiguous block device. Note, that ventoy doing is only setting up the tooling for the loop-mounting, the OS being booted up is doing the loop mounting.

But, while on this point of using a fixed-virtual-disk, there is little advantage in using a fixed-virtual-disk-format over raw-img. raw-img are also fixed-size by definition. Having a single virtual-disk format only offers convenience and homogeneity in usage of tools that manage the chosen virtual-disk-format. The virtual-disk-format also stores disk meta-data like geometry in its header, which in this use-case maybe not required. So, the fixed-vdisk is slightly larger than the rawimg. The fixed-virtual-disk-format may impose extra setup work in order to do image access.

So one might as well use a raw-img for the VM boot images. Note that, only the boot image, the one with the efisysfs, bootfs and root_var_fs partitions, needs to be raw-img. Once the linux-image boots up, after fstab processing, there are ways of configuring the system/kernel to mount other filesystems of whichever filesystem type, in whichever block-device, in whichever virtual-disks, of whichever vdisk-type they might be.

So only boot-virtual-disks need to be be raw-img, and not other virtual-disks, which may be dynamic. The boot-virtual-disk may be of size >8gb each at minimum. The other virtual-disks, may be regular dynamic qcow2 if you prefer, and may contain other mount-points like home, opt, podman, etc. I made a fedora-36 boot-image on a 40Gb raw-img, of which the system is using only 15% disk-usage. This excess unused disk-space is reserved for possible future need and therefore the under-utilization of disk space of the boot disk, seems like an acceptable compromise. You need space for future installs and upgrades. If the space isn't enough for system upgrade, one can boot via a VM, attach and mount a temporary virtualdisk at /var/lib/dnf/system-upgrade and perform the system-upgrade. The advantage of the dynamic virtual-disk for the other partitions is that the virtual-disk needs to be only as large as its contents. This allows machine-user to do better space utilization and over-provision the virtual-machines/nativeboot-machines.

nb. Virtualbox can be made to read/write/boot to a raw-img, by specifying a file-path pointer to it in an external plain-text metadata descriptor VMDK file. (see hgkamath-createvmdk). Virtualbox qcow2 support may be read-only and it only claims qcow. Furthermore, VDI performance using qemu/kvm/libvirtfs/guestfs becomes slowed down while the dynamic disk fills with data and expands (in my tests, you should retest), but fast otherwise when not growing. In windows/virtualbox, one may use VMDK vdisks, with the boot image being a vmdkwrapper to the raw-img. In Linux, one can use raw-img for boot-image and for other-images use VMDK using nbdkit+VMware-Virtual-Disk-Development-Kit , which may be more performant (other choices being libguestfs and qemu-nbd). I may have gone this route, but then I switched, from Oracle-Virtualbox, first to Intel-HAXM/qemu, and then to Microsoft-HyperV/qemu, each of which have their own merits, so now I use a raw-img for native-boot disks, with additional qcow2 for data-vdisks that are mounted post-boot.

Developer can perhaps add corrections to my understanding.

lz-lunzi commented 1 year ago

@hgkamath Can you share some documents or tutorials on the implementation process of the “raw-img for native-boot disks, with additional qcow2 for data-vdisks that are mounted post-boot”

lz-lunzi commented 1 year ago

@hgkamath @ventoy Verified boot in vhdx format https://unix.stackexchange.com/questions/309900/deploy-linux-into-and-boot-from-vhd/465215#465215 http://wuyou.net/forum.php?mod=viewthread&tid=418705

hgkamath commented 1 year ago

1) The implementation process for booting fixed-raw-img is much simpler than fixed-vdisk-formats like fixed-vhd/vhdx/vdi etc as no special handling/plugins are required. The grub2 bootloader itself can bootload kernel and initramfs images from fixed-raw-imgs easily by grub using loopback grub command. Lots of grub2 info/tutorials on this. (link1, link2). But then, the initramfs needs scripts to also setup a dm-device corresponding to the image-file-on-disk and pivot mount to the root-fs in it. Ventoy-grub dynamically automates all the configuration and subsequent root-fs pivoting during bootup. For this, all one needs to do is give a suffix .vtoy to the raw image file's filename in the VENTOY-partition. (link) 2) Once the boot-vdisk/native-boot-OS comes up, one can first mount other host-partitions that may contain the data-vdisk images. It is less hassle if the data-vdisks are on a different partition than that raw-boot-image containing VENTOY partition. If they are on the same partition as the raw-boot-image (the VENTOY partition), then one needs to look into the VTOY_LINUX_REMOUNT feature that insmods a kernel tainting dm-patch (link) or ANTI_VENTOY method link. 3) After mounting any required host partitions, all one needs to do is use tools like libvirt/guestmount/guestfish or qemu-storage-daemon/nbd to find and mount the data vdisks of any format. ( link1-guestmount, link2-qemu-storage-daemon, link3-qemu-nbd). 3) One may either choose to do the attaching/mounting manually as needed on each boot. Or one may automate it by creating a systemd service script to attach/mount on systemctl service-start, unmount/detach on systemctl service-shutdown, or by using shell scripts.

lz-lunzi commented 1 year ago

@hgkamath Thank you very much for sharing But I didn't try to succeed Can you share your script and configuration or demo

lz-lunzi commented 1 year ago

@hgkamath I found a piece of information that may solve my problem https://github.com/MobtgZhang/VHD-Boot But I couldn't use the vdfuse in the tutorial, so I replaced it with qemu ndb But I didn't succeed in guiding If you have time, you can try and share your successful experiences

hgkamath commented 1 year ago

Presently, this is my setup, if I wanted to native-boot into a ventoy image

The boot image with raw image suffix raw.img.vtoy is in a different HDD in D: , in the first partition, also known as the 'ventoy partition'. The second partition is the ventoy-EFI. The nativeboot images must be fixed size type and not dynamic, So, one might as well use a raw fixed image instead of vhdx-fixed. This bootable image is prepared in a qemu VM on a regular boot host-OS. I have not explored using the other VHD boot ventoy plugins. So don't know anything regarding that avenue.
I reboot laptop, and navigate through my machine's system of boot menus in order to get to ventoy-grub. Then from ventoy grub, I find and boot into the raw vtoy image. Linux in the vtoy-image boots with no issues.
Inside the ventoy-native-booted linux,
- Mounting a DIFFERENT host-volume which could also be on a DIFFERENT SSD/HDD like /dev/sdb3 is trivial, nothing special has to be done. The host volume can be mounted with regular mount command. By 'different', I mean not the 'ventoy partition'
- After mounting any such "different" host-volume, it is also trivial to mount any qcow2/vhdx image located inside that host-volume using qemu-storage-daemon.
- On the other hand, accessing images that are on the SAME host-volume containing the boot image (the 'ventoy partition') is harder because mounting the host-volume containing the boot image is NOT trivial. Without intervention, ventoy booted image cannot mount the same host volume on which it resides (D:\=/dev/sdc1). This is because the blocks that constitute the mounted-boot-raw-image and that of the ventoy-partition overlap (one is obviously contained inside the other). In order to get passed the problem, ventoy's official solution is to use VTOY_LINUX_REMOUNT feature in conjunction with the vtoyboot dmpatch kernel code injection plugin. the vtoyboot boot scripts brute-forces creation of a device-mapper device /dev/mapper/ventoy despite the block overlap. The device-mapper volume is then mountable and the files-within confirm that it is the host-volume. But, I use the VTOY_LINUX_REMOUNT feature and the ANTI_VENTOY idea, which you should read about in link given. Be warned that both approaches have their own different dangers if you don't know what you are doing.
- Once the devce-mapper volume corresponding to the ventoy-partition containing the raw-image-file is mounted, then as was done in the previous case, it is trivial to mount any other qcow2/vhdx image located inside the device-mapper volume using qemu-storage-daemon. Be warned to not touch/modify/access/mount the native-booted raw-image-file itself.
```
PS C:\vol\scoop_01\scoopg\apps> dir D:\bootable\m02_lnx.raw.img.vtoy
Directory: D:\bootable
Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         6/23/2023  10:08 AM    49861885952 m02_lnx.raw.img.vtoy  
```

PS C:\vol\scoop_01\scoopg\apps> dir .\ventoy\current\VentoyPlugson.exe Directory: C:\vol\scoop_01\scoopg\apps\ventoy\current Mode LastWriteTime Length Name

------ 6/23/2023 8:23 PM 372736 VentoyPlugson.exe

PS C:\vol\scoop_01\scoopg\apps> cat D:\ventoy\ventoy.json { "control":[ { "VTOY_LINUX_REMOUNT": "1" } ] }

ventoy / vtoyboot

[Feature Request] qcow2 support #47