coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

Fedora CoreOS images should support discoverable partitions #1038

Open Richterrettich opened 2 years ago

Richterrettich commented 2 years ago

Describe the enhancement

Fedora CoreOS images should follow the Discoverable Partitions Specification (https://systemd.io/DISCOVERABLE_PARTITIONS/). At the moment, when I mount a bare metal image locally, lsblk prints the following partition types:

lsblk -o NAME,LABEL,PTTYPE,PARTTYPE,PARTTYPENAME
NAME      LABEL                 PTTYPE PARTTYPE                             PARTTYPENAME
loop0                           gpt                                         
├─loop0p1                       gpt    8da63339-0007-60c0-c436-083ac8230908 Linux reserved
├─loop0p2 EFI-SYSTEM            gpt    c12a7328-f81f-11d2-ba4b-00a0c93ec93b EFI System
├─loop0p3 boot                  gpt    0fc63daf-8483-4772-8e79-3d69d8477de4 Linux filesystem
└─loop0p4 root                  gpt    0fc63daf-8483-4772-8e79-3d69d8477de4 Linux filesystem

The PARTTYPE of loop0p4 should be b921b045-1df0-41c3-af44-4c6f280d3fae (since this is an aarch64 system) instead of the generic Linux filesystem type it currently has. Likewise, loop0p3 should have bc13c2ff-59e6-4262-a352-b275fd6f7172 as its PARTTYPE. This would make it easier for tools to mount these partitions correctly without resorting to label parsing.

System details

Additional Information

The image I used was fedora-coreos-35.20211029.3.0-metal.aarch64.raw

bgilbert commented 2 years ago

Could you give an example of a tool that would benefit from this?

Richterrettich commented 2 years ago

I am currently working on a tool that mounts disk images and optionally opens a local shell with systemd-nspawn. One of the challenges is detecting system-relevant partitions like /boot, / and /boot/efi in a distribution- and vendor-agnostic way. The PARTTYPE ID would be really helpful, since labels differ between vendors.
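To illustrate the kind of detection described here, the following is a minimal sketch (not the actual tool) that classifies partitions purely by their type GUID. It assumes input in the shape produced by `lsblk --json -o NAME,LABEL,PARTTYPE`; the function name `classify_partitions` and the role names are made up for this example, and the sample data mirrors the listing from the issue description.

```python
import json

# Type GUIDs defined by the Discoverable Partitions Specification.
# The role names on the right are illustrative, not taken from any tool.
DISCOVERABLE_TYPES = {
    "c12a7328-f81f-11d2-ba4b-00a0c93ec93b": "esp",
    "bc13c2ff-59e6-4262-a352-b275fd6f7172": "xbootldr",
    "b921b045-1df0-41c3-af44-4c6f280d3fae": "root-aarch64",
    "4f68bce3-e8cd-4db1-96e7-fbcaf984b709": "root-x86-64",
}


def classify_partitions(lsblk_json):
    """Return {partition name: role} for partitions with a known type GUID."""
    roles = {}

    def walk(devices):
        for dev in devices:
            guid = (dev.get("parttype") or "").lower()
            if guid in DISCOVERABLE_TYPES:
                roles[dev["name"]] = DISCOVERABLE_TYPES[guid]
            # lsblk nests partitions under their parent device.
            walk(dev.get("children") or [])

    walk(json.loads(lsblk_json)["blockdevices"])
    return roles


# Sample input reproducing the partition types reported in this issue.
SAMPLE = """
{"blockdevices": [
  {"name": "loop0", "label": null, "parttype": null, "children": [
    {"name": "loop0p1", "label": null,
     "parttype": "8da63339-0007-60c0-c436-083ac8230908"},
    {"name": "loop0p2", "label": "EFI-SYSTEM",
     "parttype": "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"},
    {"name": "loop0p3", "label": "boot",
     "parttype": "0fc63daf-8483-4772-8e79-3d69d8477de4"},
    {"name": "loop0p4", "label": "root",
     "parttype": "0fc63daf-8483-4772-8e79-3d69d8477de4"}
  ]}
]}
"""

if __name__ == "__main__":
    print(classify_partitions(SAMPLE))
```

With the image as it is today, only the EFI system partition is recognized; boot and root carry the generic Linux filesystem GUID, so a tool like this would have to fall back to labels for them.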

travier commented 2 years ago

If we're going to do this, we probably need Ignition to update partition types when re-partitioning the disk with LUKS/RAID/etc.

Richterrettich commented 2 years ago

I am able to help if you want me to. In case you do, can you please compile me a list of related projects?

dustymabe commented 2 years ago

We discussed this during the community meeting yesterday.

12:12:15*        dustymabe | #agreed While we don't see a lot of immediate value
                           | in changing the partition type UUIDs we don't
                           | currently know of anything consuming the ones we do
                           | set. Switching them to match the Discoverable
                           | Partitions Specification is not something we're
                           | opposed to and may be worth it if other tools start
                           | to use this information when inspecting disk images.
                           | We need to consult for more expertise on
12:12:15         dustymabe | the implications for Ignition/Butane, but barring
                           | complications there this seems like a reasonable
                           | change to make.

@bgilbert - given your expertise with Ignition/Butane, can you weigh in on the implications for that tooling and whether there are any major concerns with moving forward on a plan like this?

bgilbert commented 2 years ago

I concur that this is harmless but not especially useful. Of the benefits listed at the top of the spec, 1 and 2 are not relevant, 4 isn't especially interesting, and 3 is dubious. If a container manager expects to mount an FCOS rootfs, bypass the initrd, and directly start pid 1, startup will fail since we won't have pivoted into the ostree deployment. And to the extent discoverability would benefit tooling that modifies disk images behind the back of the OS provisioning system, users have the freedom to do that of course, but I'm not excited about encouraging it.

As to practicalities: It appears that components of a multiple-device volume (RAID, LVM, etc.) aren't affected by the spec, even when such a volume is used for the root filesystem. The spec only mentions RAID once, to say that it's out of scope. This BZ comment explicitly says that some other type GUID should be used for such partitions.

For LUKS-encrypted root, the spec and the BZ comment both say that we should use the discoverable GUID and then name the LUKS volume root. The Butane sugar already does the latter.

So AFAICS the logistics would be:

  1. The OS itself would continue to ignore these GUIDs, e.g. by disabling systemd-gpt-auto-generator.
  2. create_disk.sh would be updated to use the discoverable GUIDs.
  3. The Butane boot_device templates would not actually need any changes. In the RAID and RAID+LUKS cases, we should not use the discoverable GUIDs. In the LUKS-only case, we reuse the existing partition without modifying its metadata, and we're already using the correct LUKS device name.
  4. Users can also bypass the Butane sugar and specify replacement root/boot partitions via the Ignition/Butane storage section. Users who do this, and who want the partitions to support the discoverable spec, would need to set the correct type_guid for their partitions. Since the OS ignores the GUIDs, there are no other consequences for failing to do so.
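The rules in items 3 and 4 can be summarized as a small decision function. This is only a sketch of the reasoning in this comment, not code from Butane or Ignition; the name `root_partition_plan` and its return shape are invented for illustration, and `None` stands for "some other type GUID", since the comment does not say which one to use.

```python
# Root partition type GUIDs from the Discoverable Partitions Specification.
ROOT_TYPE_GUIDS = {
    "x86_64": "4f68bce3-e8cd-4db1-96e7-fbcaf984b709",
    "aarch64": "b921b045-1df0-41c3-af44-4c6f280d3fae",
}


def root_partition_plan(arch, raid=False, luks=False):
    """Return (type_guid, luks_name) for the partition(s) backing the root fs.

    type_guid is None when the discoverable root GUID should NOT be used,
    i.e. when the partition is a component of a multi-device volume.
    luks_name is the LUKS volume name, or None when root is not encrypted.
    """
    # RAID and RAID+LUKS: components are out of scope for the spec.
    type_guid = None if raid else ROOT_TYPE_GUIDS[arch]
    # LUKS root: the volume should be named "root" (Butane sugar already does).
    luks_name = "root" if luks else None
    return type_guid, luks_name


if __name__ == "__main__":
    for raid, luks in [(False, False), (False, True), (True, False), (True, True)]:
        print(f"raid={raid} luks={luks} ->", root_partition_plan("aarch64", raid, luks))
```

Users who bypass the Butane sugar would apply the same rule by hand, setting type_guid on their replacement partitions only in the non-RAID cases.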

@Richterrettich, if you're up for it, I think this just needs a coreos-assembler PR for create_disk.sh and a fedora-coreos-config PR to add tests. We'd probably want a non-exclusive kola test verifying that the relevant GUIDs are used, plus additional GUID checks in the luks and raid1 root-reprovision tests.
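As a rough idea of what the GUID check in such a test would assert, here is a sketch of the comparison logic only. It is written in Python for illustration; an actual kola test in fedora-coreos-config may look quite different. The helper name `guid_mismatches` is hypothetical, and the labels `boot` and `root` are the ones shown in the lsblk output at the top of this issue.

```python
# Generic "Linux filesystem" type currently set on the boot and root partitions.
LINUX_FILESYSTEM = "0fc63daf-8483-4772-8e79-3d69d8477de4"

# Type GUIDs from the Discoverable Partitions Specification.
XBOOTLDR = "bc13c2ff-59e6-4262-a352-b275fd6f7172"
ROOT_BY_ARCH = {
    "x86_64": "4f68bce3-e8cd-4db1-96e7-fbcaf984b709",
    "aarch64": "b921b045-1df0-41c3-af44-4c6f280d3fae",
}


def guid_mismatches(partitions, arch):
    """Return (label, actual, expected) for each partition with a wrong type.

    `partitions` is a list of (label, parttype) pairs, e.g. parsed from lsblk.
    Partitions with labels other than "boot" and "root" are ignored.
    """
    expected = {"boot": XBOOTLDR, "root": ROOT_BY_ARCH[arch]}
    return [
        (label, actual, expected[label])
        for label, actual in partitions
        if label in expected and actual.lower() != expected[label]
    ]


# The aarch64 image from this issue, before any change to create_disk.sh.
CURRENT_IMAGE = [
    ("EFI-SYSTEM", "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"),
    ("boot", LINUX_FILESYSTEM),
    ("root", LINUX_FILESYSTEM),
]

if __name__ == "__main__":
    for label, actual, wanted in guid_mismatches(CURRENT_IMAGE, "aarch64"):
        print(f"{label}: has {actual}, expected {wanted}")
```

For the RAID-based root-reprovision tests, the expectation would be inverted: the discoverable root GUID should not appear on the member partitions.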

jlebon commented 2 years ago

Some related discussions in https://github.com/coreos/fedora-coreos-tracker/issues/976.

cmurf commented 2 years ago

Mentioned in #976, asking upstream about GRUB support for discoverable partitions: https://lists.gnu.org/archive/html/grub-devel/2022-01/msg00171.html

There is also a discussion on systemd-devel@ about the idea of a "discoverable subvolumes spec" that mimics discoverable partitions but for storage supporting multiple trees. While the lingo and impetus are Btrfs, the idea is to include plain directories so that ostree could take advantage of this naming scheme, as well as LVM and ZFS if they wanted to. https://lists.freedesktop.org/archives/systemd-devel/2021-November/047059.html