vrothberg / bootc-playground


Making a bootable image #1

Closed rwmjones closed 7 months ago

rwmjones commented 7 months ago

Vivek asked me to look this one over and I have a few suggestions which I'll attach here.

First of all, nbdkit could really do with a plugin that automates everything below. The hard part (for me) is finding and downloading the container data. Surely there are APIs for that, but I don't know what they are, so we could collaborate on that.

Secondly, you can make a bootable disk image directly in various ways. You need to choose your poison^Wboot method. Possibilities include extlinux or direct kernel booting (qemu -kernel). But here I'm going to use UEFI -> shim -> kernel, which has the advantage that it is compatible with Secure Boot. Note that this requires a newish feature in Fedora 38 called Unified Kernel Image (UKI).

So say you've got a *.tar.gz containing the files in your operating system. The following script will create a bootable qcow2 file from it. Note the script does not need to be run as root (and ideally should not be run as root).

#!/usr/bin/python3
import guestfs
import re
import codecs

# This must contain shim-x64 and kernel-uki-virt:
input="/var/tmp/fedora-38.tar.gz"
output="/var/tmp/fedora.qcow2"
format="qcow2"
MB=1024*1024
GB=1024*1024*1024
disk_size=6*GB
efi_part_size=512*MB
boot_part_size=512*MB

# https://uapi-group.org/specifications/specs/discoverable_partitions_specification/
root_guid="4f68bce3-e8cd-4db1-96e7-fbcaf984b709"
# Standard UEFI ESP GUID
esp_guid="C12A7328-F81F-11D2-BA4B-00A0C93EC93B"

g = guestfs.GuestFS(python_return_dict=True)
g.set_trace(1)
g.disk_create(output, format, disk_size)
g.add_drive_opts(output, format=format, readonly=0)
g.launch()

# Make the partition table for UEFI.
g.part_init("/dev/sda", "gpt")
n=128; e=n + efi_part_size // 512
g.part_add("/dev/sda", "p", n, e - 1)
g.part_set_gpt_type("/dev/sda", 1, esp_guid)
n=e; e=n + boot_part_size // 512
g.part_add("/dev/sda", "p", n, e - 1)
n=e
g.part_add("/dev/sda", "p", n, -128)
g.part_set_gpt_type("/dev/sda", 3, root_guid)

# Create the filesystems and mount them up.
g.mkfs("vfat", "/dev/sda1")
g.mkfs("ext4", "/dev/sda2")
g.mkfs("xfs", "/dev/sda3")
g.mount("/dev/sda3", "/")
g.mkdir("/boot")
g.mount("/dev/sda2", "/boot")
g.mkdir("/boot/efi")
g.mount("/dev/sda1", "/boot/efi")

# Unpack the filesystem from the tarball.
g.tar_in(input, "/", xattrs=True, selinux=True, compress="gzip")

# Find the kernel UKI and copy to the ESP.
uki=next(x for x in g.ls("/boot") if re.match(r"vmlinuz-virt\.efi.*", x))
g.cp("/boot/%s" % uki, "/boot/efi/EFI/KERNEL.EFI")

# Update BOOTX64.CSV so shim will boot the kernel directly,
# without grub.
cmdline=g.cat("/etc/kernel/cmdline")
csv="shimx64.efi,KERNEL,\\EFI\\KERNEL.EFI %s" % cmdline
g.write("/boot/efi/EFI/fedora/BOOTX64.CSV",
        codecs.BOM_UTF16_LE + csv.encode('utf-16-le'))

# Fix up /etc/fstab (may not be necessary with autodetect?)
root_fs_uuid=g.vfs_uuid("/dev/sda3")
boot_fs_uuid=g.vfs_uuid("/dev/sda2")
efi_fs_uuid=g.vfs_uuid("/dev/sda1")
fstab = """
UUID=%s / xfs defaults 0 0
UUID=%s /boot ext4 defaults 0 0
UUID=%s /boot/efi vfat defaults 0 0
""" % (root_fs_uuid, boot_fs_uuid, efi_fs_uuid)
g.write("/etc/fstab", fstab)

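# Unmount everything, then shut down the appliance so that the image
# is fully written out before the script exits.
g.umount_all()
g.shutdown()
g.close()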
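The sector arithmetic in the partitioning step can be checked on its own, without libguestfs. The following is only a sketch: the helper name `partition_layout` is made up for illustration, and it assumes 512-byte sectors and the same 128-sector margin at each end of the disk that the script uses.

```python
MB = 1024 * 1024
GB = 1024 * MB
SECTOR = 512  # assumed sector size, matching the "/ 512" in the script


def partition_layout(disk_size, efi_part_size, boot_part_size, margin=128):
    """Return inclusive (start, end) sector ranges for ESP, /boot and root."""
    total_sectors = disk_size // SECTOR
    esp_start = margin
    esp_end = esp_start + efi_part_size // SECTOR - 1
    boot_start = esp_end + 1
    boot_end = boot_start + boot_part_size // SECTOR - 1
    root_start = boot_end + 1
    # A negative end sector passed to part_add() counts back from the end
    # of the disk, so -128 corresponds to total_sectors - 128.
    root_end = total_sectors - margin
    return [(esp_start, esp_end), (boot_start, boot_end), (root_start, root_end)]


if __name__ == "__main__":
    names = ("esp", "boot", "root")
    for name, (start, end) in zip(names, partition_layout(6 * GB, 512 * MB, 512 * MB)):
        print(name, start, end)
```

Each partition starts on the sector immediately after the previous one ends, which is why the script passes `e - 1` as the end sector and then reuses `e` as the next start.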

To boot this as a transient VM:

$ rm -f ~/.config/libvirt/qemu/nvram/fedora38_VARS.fd
$ virt-install --transient --import --disk path=/var/tmp/fedora.qcow2 --boot uefi --os-variant fedora38

(The rm command works around a libvirt bug that leaves the _VARS.fd file of a transient domain behind.)

To create the *.tar.gz file for testing purposes I used this script, but I guess in your case you'd somehow get podman to give you the tarball. (See my nbdkit request above).

#!/bin/bash -
set -e
set -x
virt-builder fedora-38 \
             --install shim-x64,kernel-uki-virt \
             --root-password password:123456
guestfish --ro -a fedora-38.img -i \
          tar-out / - xattrs:true selinux:true compress:gzip > fedora-38.tar.gz
rm fedora-38.img
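
For the podman route, one possible approach is `podman create` followed by `podman export`, with the stream compressed into the `*.tar.gz` that the Python script expects. This is an untested sketch under several assumptions: the function names and the container name are invented here, the image needs a default command for `podman create` to accept it, and it is not confirmed that `podman export` keeps the xattrs and SELinux labels that the guestfish command preserves.

```python
import gzip
import shutil
import subprocess


def export_commands(image, container_name="bootable-export"):
    """Build the podman command lines (container_name is a made-up label)."""
    create = ["podman", "create", "--name", container_name, image]
    export = ["podman", "export", container_name]  # tar stream on stdout
    remove = ["podman", "rm", container_name]
    return create, export, remove


def export_image(image, tarball):
    """Write the container's root filesystem to a gzip-compressed tarball."""
    create, export, remove = export_commands(image)
    subprocess.run(create, check=True, stdout=subprocess.DEVNULL)
    try:
        proc = subprocess.Popen(export, stdout=subprocess.PIPE)
        with gzip.open(tarball, "wb") as out:
            shutil.copyfileobj(proc.stdout, out)
        if proc.wait() != 0:
            raise subprocess.CalledProcessError(proc.returncode, export)
    finally:
        # Always clean up the temporary container, even on failure.
        subprocess.run(remove, check=False)
```

Calling `export_image(image, "/var/tmp/fedora-38.tar.gz")` with a suitable image reference would then produce the input file for the Python script. If the labels do not survive the export, the resulting tarball is unlabelled and needs relabelling or `selinux=0` at boot.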

rwmjones commented 7 months ago

Note that my tarball was prepared with SELinux labels. systemd will flatly refuse to boot if it finds an unlabelled filesystem, so if yours is unlabelled you'll need to either boot with selinux=0 on the kernel command line or add g.touch("/.autorelabel") to the Python script.
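
A sketch of the `/.autorelabel` option, written as a helper that takes the already-launched handle from the script. The name `request_relabel` is hypothetical; the call would have to run after `g.tar_in(...)` and before `g.umount_all()`, while the root filesystem is still mounted.

```python
def request_relabel(g):
    """Ask the guest to relabel its files on the next boot.

    `g` is the launched guestfs handle with the root filesystem mounted at "/".
    """
    if not g.exists("/.autorelabel"):
        # An empty marker file is enough; its presence is what matters.
        g.touch("/.autorelabel")
```

Expect the first boot to take noticeably longer, since the whole filesystem is relabelled before the system comes up normally.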

rwmjones commented 7 months ago

The domain should go away on its own at shutdown (since it is --transient), but you could also kill it with virsh destroy <N>. However, the _VARS.fd file is left over afterwards (libvirt bug), so you have to delete it manually, or else it will interfere with future boots of any VM with the same name.
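
The manual cleanup could be scripted along these lines. This sketch assumes the session-libvirt NVRAM directory used in the rm command earlier in the thread, and that the leftover file follows the `<domain name>_VARS.fd` pattern; the function name is hypothetical.

```python
from pathlib import Path


def remove_stale_nvram(domain, nvram_dir=None):
    """Delete a leftover <domain>_VARS.fd file; return True if one was removed."""
    if nvram_dir is None:
        # Directory taken from the rm command (unprivileged libvirt session).
        nvram_dir = Path.home() / ".config/libvirt/qemu/nvram"
    vars_file = Path(nvram_dir) / f"{domain}_VARS.fd"
    try:
        vars_file.unlink()
        return True
    except FileNotFoundError:
        # Nothing left over from a previous run.
        return False
```

Running `remove_stale_nvram("fedora38")` before each virt-install invocation would have the same effect as the rm command, without failing when the file does not exist.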

berrange commented 7 months ago

Functionally, what the Python script above does is fine, but FWIW the /boot partition is redundant in this particular setup. The image boots entirely from the ESP, so the /boot partition adds no value. IOW, the whole thing would work fine with just the ESP and root FS partitions.
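
As a sketch of what that simplification might look like, the partitioning and mounting steps of the script could shrink to the following. It uses only libguestfs calls that the script already makes and the same GUIDs; the function wrapper and its name are added here for illustration, and it has not been tested against a real image.

```python
ROOT_GUID = "4f68bce3-e8cd-4db1-96e7-fbcaf984b709"
ESP_GUID = "C12A7328-F81F-11D2-BA4B-00A0C93EC93B"


def make_two_partition_layout(g, efi_part_size, dev="/dev/sda"):
    """Partition `dev` into ESP + root only, then format and mount both."""
    g.part_init(dev, "gpt")
    start = 128
    end = start + efi_part_size // 512
    g.part_add(dev, "p", start, end - 1)
    g.part_set_gpt_type(dev, 1, ESP_GUID)
    g.part_add(dev, "p", end, -128)
    g.part_set_gpt_type(dev, 2, ROOT_GUID)  # root is now partition 2

    g.mkfs("vfat", dev + "1")
    g.mkfs("xfs", dev + "2")
    g.mount(dev + "2", "/")
    g.mkdir("/boot")  # now a plain directory on the root filesystem
    g.mkdir("/boot/efi")
    g.mount(dev + "1", "/boot/efi")
```

The rest of the script would need matching changes: the root UUID would come from `/dev/sda2`, and the `/boot` line would drop out of the generated /etc/fstab.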

vrothberg commented 7 months ago

Thanks for reaching out, @rwmjones.

Are you sure this is the right repo to reach out to? I created github.com/vrothberg/bootc-playground just to make it easier to play with the various disk images and possibilities of using bootc. The Makefile was mostly intended to write the command lines once. In other words, it's just a pet project to make my life easier and to share it with others if they want to play with bootc and don't have much experience fiddling with virt-install.

Shall we move this issue over to https://github.com/containers/bootc? I think we'll reach the right audience there.

rwmjones commented 7 months ago

Sure, Vivek pointed me to this one. Can we copy the issue over somehow?

vrothberg commented 7 months ago

> Sure, Vivek pointed me to this one. Can we copy the issue over somehow?

I thought it would be possible to move issues across projects (i.e., from github.com/vrothberg to github.com/containers) but it doesn't seem to work actually, sorry.

Can you recreate the issue over at https://github.com/containers/bootc?