NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

Towards unified image builders #324817

Open msanft opened 4 months ago

msanft commented 4 months ago

tl;dr

NixOS has many image builders that build an image file from a NixOS configuration, such as image.repart, system.build.isoImage, and system.build.sdImage. Not only do these inconsistent interfaces create a bad UX (options named differently but doing the same thing, etc.), they also make it hard to introduce new features for all of those builders due to duplication. I therefore propose shifting towards a unified image builder based on systemd-repart, i.e. the existing image.repart builder, with abstractions and default options that enable convenient builds of commonly built image types, such as the existing system.build.isoImage and system.build.sdImage. In addition to this proposal, I'd like this issue to be a place for discussion on what such an interface should look like, ideally leading to a decision.

The Status Quo

We currently have (at least) 3 different publicly exposed builders that will emit a NixOS image based on a NixOS configuration:

Why is this an Issue?

Maintaining multiple builders that do similar things is tedious, especially if their work can also be done by other builders. Furthermore, all builders have wildly different configuration interfaces, making it hard for users to find which knobs they need to turn to get what they want; even finding the correct builder for the use-case in the first place can be a hassle. Finally, there are features that would be great to implement for all NixOS image builders, such as build-time execution of activation scripts, which would make image-based appliances more immutable by not requiring a writable root filesystem.

What is desirable?

Part of this should be discussed within this issue. Personally, I propose a unified interface for building images (e.g. image) that should cover all current use-cases. The builder should be based on the current image.repart, as it provides the most configurability as well as declarative partitioning. It also blends in very well with other systemd components, such as systemd-cryptsetup. There should still be convenience targets (think a system.build.isoImage, but based on the unified builder interface), which should be compatible with the current builder interface. To allow for continuous development of such a unified interface, a first step could be to port a basic builder; the existing builders (e.g. system.build.isoImage) could then be ported from their existing build scripts to the unified builder, maintaining compatibility. This would be a huge step for both installation-based and image-based NixOS, increasing nixpkgs-internal development velocity on these topics, many of which are very desirable for NixOS in enterprise use-cases.
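For readers unfamiliar with the builder being proposed as the base, the following is a minimal sketch of what a configuration using the existing image.repart module can look like. It is illustrative only: the partition names ("esp", "root"), the sizes, and the label are arbitrary choices, and the boot loader files that a bootable ESP would need are deliberately left out.

```nix
{ config, modulesPath, ... }:
{
  # The repart-based builder lives in its own module and has to be imported.
  imports = [ "${modulesPath}/image/repart.nix" ];

  image.repart = {
    name = "example-image"; # arbitrary name chosen for this sketch

    partitions = {
      # EFI system partition. A real image would also populate it with a
      # boot loader or UKI via `contents`; that part is omitted here.
      "esp" = {
        repartConfig = {
          Type = "esp";
          Format = "vfat";
          SizeMinBytes = "96M";
        };
      };

      # Root partition containing the closure of the system toplevel.
      "root" = {
        storePaths = [ config.system.build.toplevel ];
        repartConfig = {
          Type = "root";
          Format = "ext4";
          Label = "nixos";
          Minimize = "guess";
        };
      };
    };
  };
}
```

The keys under repartConfig are handed to systemd-repart as repart.d partition settings, which is where the declarative partitioning mentioned above comes from. The resulting image is exposed as system.build.image.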

eclairevoyant commented 4 months ago

Unifying 3+ interfaces that do the same thing sounds pretty uncontroversial, doubt it needs an rfc :) Thanks for raising this.

msanft commented 4 months ago

Unifying 3+ interfaces that do the same thing sounds pretty uncontroversial, doubt it needs an rfc :) Thanks for raising this.

Yes, I think this part is beyond question. The "RFC" part I was referring to rather meant a "discussion with an outcome" about what that new interface should look like.

phaer commented 4 months ago

Other than that, there also are private builders such as lib/make-disk-image.nix, which are used throughout nixpkgs, e.g. to build VM images. Most of them are very use-case specific, but also encode a lot of logic.

I think more unification between disk-builders is a good idea as well. One thing that comes to mind, is that unless I am mistaken systemd-repart doesn't really support ZFS and such (yet)? Ideally a generic solution would support that to be usable in vm tests and such, but for the same reason focusing on system.build.{isoImage,sdcard} first sounds good IMO.

(I added the significant label to give this more visibility. Please remove it if it's not appropriate here. I believe there's at least one older issue on this topic, but I wasn't able to find it.)

msanft commented 4 months ago

is that unless I am mistaken systemd-repart doesn't really support ZFS and such (yet)?

Correct. I should've probably mentioned that in the initial post, but didn't think of it. repart does, however, support btrfs, which should be able to handle most ZFS use-cases too. It would also not be impossible to add a custom "post-processor" that populates a blank repart partition with a ZFS. I think it should be quite easy actually.

My implementation plan is to create the general new interface (i.e. image) first, then migrate existing builders one-by-one, with enough of a waiting period in between to catch potential hiccups.

Thanks for adding the label!

arianvp commented 1 month ago

make-disk-image.nix seems to be in a little bit of rough shape. I'm running into flakiness due to its use of cptofs https://github.com/NixOS/nixpkgs/issues/345492

Given that all uses of make-disk-image.nix inside nixpkgs use ext4, I'm curious whether we can unify it with other stuff.

Ideally I'd switch to repart, but I'd need some features added on top of repart:

Support for GRUB in hybrid mode (needed for cloud images, as they often don't support UEFI boot). More generally: unified support for "installing" a bootloader.

I wonder if the quickest way for this would be to write a tiny runInLinuxVM wrapper that calls NIXOS_INSTALL_BOOTLOADER=1 switch-to-configuration boot.

It's not ideal (we can probably install bootloaders without needing a VM) but we know it works.

Electrostasy commented 1 month ago

I use repart for generating flashable images for SBCs, such as Raspberry Pi, instead of the sdImage builder, so I would like to share my thoughts on this usecase. I noticed some inherent shortcomings in repart images from systemd-repart:

On my second point, just brainstorming here, but could we use systemd-repart/repart/something based on them to add partitions to an image file with a pre-existing GPT? I can see that adding partitions is supported in systemd-repart, at least, but this looks like it could get complicated.

In general, on SBCs that support booting from GPT, like the Raspberry Pi 4 and Compute Module 4, repart works very well. I find the overall interface of the repart module to be very nice to use, and while there are some limitations inherent with systemd-repart that cannot fully replace sdImage for me yet, having pre-/post-process options could work around some SBC-specific requirements.

arianvp commented 1 month ago

No MBR/hybrid MBR support means we can't use repart generated images on older SBCs that do not support booting from GPT, like the Raspberry Pi 02w and 3B+, without additional post-processing. As far as I can tell it's on the TODO.

Hybrid support shouldn't be too hard I think? It's a matter of creating a grub partition

https://github.com/systemd/mkosi/blob/main/mkosi/__init__.py#L2993

And then we need to install grub in the MBR in the final image (Could e.g. be done using a runInLinuxVM)

arianvp commented 1 month ago

On my second point, just brainstorming here, but could we use systemd-repart/repart/something based on them to add partitions to an image file with a pre-existing GPT? I can see that adding partitions is supported in systemd-repart, at least, but this looks like it could get complicated.

Yes this was the primary purpose it was developed for in the first place! If you pass it an image with --image that already has a GPT; it will just do purely additive operations. We could perhaps add an option to add a base image to the module?

Electrostasy commented 1 month ago

Hybrid support shouldn't be too hard I think? It's a matter of creating a grub partition

https://github.com/systemd/mkosi/blob/main/mkosi/__init__.py#L2993

And then we need to install grub in the MBR in the final image (Could e.g. be done using a runInLinuxVM)

Interesting, I may need to experiment with this. But isn't the resulting image still GPT? For the purposes of getting SBCs that can't boot from GPT to boot from GPT, just creating a hybrid MBR is easy enough with a configurable post-process stage IMO. From my understanding, if you mess with the partition table afterwards (resizing/adding partitions?), it can break in unexpected ways if not done correctly (so it can be brittle). Ideally we wouldn't need hybrid MBRs in the first place.

Yes this was the primary purpose it was developed for in the first place! If you pass it an image with --image that already has a GPT; it will just do purely additive operations. We could perhaps add an option to add a base image to the module?

This would be awesome to have and would probably solve my gripes with hardcoded u-boot offsets mentioned earlier.

arianvp commented 1 month ago

Interesting, I may need to experiment with this. But isn't the resulting image still GPT?

It's specific to GRUB, I think. You put (part of) GRUB in the MBR; that code then loads the rest of the GRUB bootloader from the GRUB partition. The GRUB bootloader then reads the kernels and initrds from the ESP.

https://en.wikipedia.org/wiki/BIOS_boot_partition
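As a rough sketch of the partition-creation half of this, a BIOS boot partition could be declared through the existing image.repart module roughly as follows. The partition name "bios-boot" and the 1M size are assumptions made for illustration; the type UUID is the standard GPT BIOS boot partition type. Installing GRUB into the MBR of the final image is a separate step that this fragment does not cover.

```nix
{
  image.repart.partitions."bios-boot" = {
    repartConfig = {
      # GPT partition type UUID of a BIOS boot partition.
      Type = "21686148-6449-6e6f-744e-656564454649";
      # Assumed size for this sketch; GRUB's core image is written here
      # later by the bootloader installation step, not by repart.
      SizeMinBytes = "1M";
      SizeMaxBytes = "1M";
    };
  };
}
```

No Format is set, so repart would leave the partition without a file system, which matches how GRUB uses it as raw space.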

DaanDeMeyer commented 1 month ago

@msanft If you're actually able to figure out El Torito, we wouldn't mind supporting that in repart itself (directly, no xorriso or similar tools allowed).