osbuild / images

Image builder image definition library

edge-raw-image with custom partition mount points does not boot after upgrade #352

Open mcattamoredhat opened 10 months ago

mcattamoredhat commented 10 months ago

We have been testing "RHEL 9: Filesystem customizations for edge-raw-image" (#255) locally on RHEL-9.4.

osbuild-104-1.20240108gitc62e555.el9.noarch
osbuild-composer-98-1.20240108git0169b7b.el9.x86_64
weldr-client-35.9-1.el9.x86_64

With this blueprint:

[[customizations.filesystem]]
mountpoint = "/foo/bar"
size = 2147483648

[[customizations.filesystem]]
mountpoint = "/foo"
size = 8589934592

[[customizations.filesystem]]
mountpoint = "/var/myfiles"
size = "1 GiB"

After booting the image both in BIOS and UEFI mode, we see the corresponding logical volumes created.

changed: [192.168.100.51] => {"changed": true, "cmd": ["df", "-h"], "delta": "0:00:00.010319", "end": "2024-01-05 07:41:20.280108", "msg": "", "rc": 0, "start": "2024-01-05 07:41:20.269789", "stderr": "", "stderr_lines": [], "stdout": "Filesystem                        Size  Used Avail Use% Mounted on
devtmpfs                          4.0M     0  4.0M   0% /dev
tmpfs                             1.5G     0  1.5G   0% /dev/shm
tmpfs                             578M  936K  577M   1% /run
/dev/mapper/rootvg-rootlv         9.0G  1.7G  7.3G  19% /sysroot
/dev/mapper/rootvg-foolv          8.0G   89M  7.9G   2% /foo
/dev/mapper/rootvg-var_myfileslv  960M   39M  922M   5% /var/myfiles
/dev/mapper/rootvg-foo_barlv      2.0G   47M  1.9G   3% /foo/bar
/dev/vda3                         320M  126M  195M  40% /boot
/dev/vda2                         127M  7.0M  120M   6% /boot/efi
tmpfs                             289M     0  289M   0% /run/user/1000"

Nevertheless, the system is not able to boot into deployment ostree:0 after the upgrade.


BdsDxe: loading Boot0002 "redhat" from HD(2,GPT,68B2905B-DF3E-4FB3-80FA-49D1E773AA33,0x1000,0x3F800)/\EFI\redhat\shimx64.efi
BdsDxe: starting Boot0002 "redhat" from HD(2,GPT,68B2905B-DF3E-4FB3-80FA-49D1E773AA33,0x1000,0x3F800)/\EFI\redhat\shimx64.efi

                               GRUB version 2.06

 ┌────────────────────────────────────────────────────────────────────────────┐
 │*Red Hat Enterprise Linux 9.4 Beta (Plow) (ostree:0)                        │ 
 │ Red Hat Enterprise Linux 9.4 Beta (Plow) (ostree:1)                        │
 │                                                                            │
[2024-01-05T07:53:10-05:00] 🗳 Upgrade ostree image/commit
+ sudo ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=5 -i /tmp/tmp.ANADTsJY96/id_rsa admin@192.168.100.51 'echo '\''foobar'\'' |sudo -S rpm-ostree upgrade'
Warning: Permanently added '192.168.100.51' (ED25519) to the list of known hosts.
[sudo] password for admin: 1 delta parts, 3 loose fetched; 21197 KiB transferred in 1 seconds; 79.9 MB content written
Staging deployment...done
Added:
  wget-1.21.1-7.el9.x86_64
Run "systemctl reboot" to start a reboot
+ sudo ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=5 -i /tmp/tmp.ANADTsJY96/id_rsa admin@192.168.100.51 'echo '\''foobar'\'' |nohup sudo -S systemctl reboot &>/dev/null & exit'
Warning: Permanently added '192.168.100.51' (ED25519) to the list of known hosts.
+ sleep 10
+ greenprint '🛃 Checking for SSH is ready to go'
++ date -Isecond
+ echo -e '\033[1;32m[2024-01-05T07:53:34-05:00] 🛃 Checking for SSH is ready to go\033[0m'
[2024-01-05T07:53:34-05:00] 🛃 Checking for SSH is ready to go
++ seq 0 30
+ for _ in $(seq 0 30)
++ wait_for_ssh_up 192.168.100.51
+++ sudo ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=5 -i /tmp/tmp.ANADTsJY96/id_rsa admin@192.168.100.51 '/bin/bash -c "echo -n READY"'
ssh: connect to host 192.168.100.51 port 22: Connection timed out
++ SSH_STATUS=
++ [[ '' == READY ]]
++ echo 0
+ RESULTS=0
+ [[ 0 == 1 ]]
+ sleep 10
+ for _ in $(seq 0 30)
++ wait_for_ssh_up 192.168.100.51
+++ sudo ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=5 -i /tmp/tmp.ANADTsJY96/id_rsa admin@192.168.100.51 '/bin/bash -c "echo -n READY"'
ssh: connect to host 192.168.100.51 port 22: Connection timed out

On the other hand, deployment ostree:1 boots with no problem; in fact, the custom LVs are visible there.

[admin@mcattamo-rhel-9-4-240105 cases]$ sudo ssh -i /tmp/tmp.ANADTsJY96/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null admin@192.168.100.51
Warning: Permanently added '192.168.100.51' (ED25519) to the list of known hosts.
Script '01_update_platforms_check.sh' FAILURE (exit code '1'). Continuing...
Boot Status is GREEN - Health Check SUCCESS
Last login: Fri Jan  5 07:43:39 2024 from 192.168.100.1
[admin@vm-uefi ~]$ lsblk
NAME                                          MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
vda                                           252:0    0 20.5G  0 disk  
├─vda1                                        252:1    0    1M  0 part  
├─vda2                                        252:2    0  127M  0 part  /boot/efi
├─vda3                                        252:3    0  384M  0 part  /boot
└─vda4                                        252:4    0   20G  0 part  
  └─luks-608f702f-1e37-409f-bdd0-6f67b0e8b6e6 253:0    0   20G  0 crypt 
    ├─rootvg-rootlv                           253:1    0    9G  0 lvm   /var
    │                                                                   /sysroot/ostree/deploy/redhat/var
    │                                                                   /usr
    │                                                                   /etc
    │                                                                   /
    │                                                                   /sysroot
    ├─rootvg-foo_barlv                        253:2    0    2G  0 lvm   /foo/bar
    ├─rootvg-foolv                            253:3    0    8G  0 lvm   /foo
    └─rootvg-var_myfileslv                    253:4    0    1G  0 lvm   /var/myfiles

Because the virsh console does not show any useful output, it has been difficult to track down the root cause of the issue. Could you please help look into it when you have time?

miabbott commented 10 months ago

cc: @say-paul @runcom @7flying

say-paul commented 10 months ago

@mcattamoredhat what upgrade was done on the new deployment (ostree:0)?

say-paul commented 10 months ago

It seems the custom LVM mount points /foo and /foo/bar failed to mount.

Jan 09 07:54:40 localhost systemd[1]: Reached target Preparation for Local File Systems.
Jan 09 07:54:40 localhost systemd[1]: foo.mount: Failed to check directory /foo: No such file or directory
Jan 09 07:54:40 localhost systemd[1]: Mounting /foo...
Jan 09 07:54:40 localhost systemd[1]: var.mount: Directory /var to mount over is not empty, mounting anyway.
Jan 09 07:54:40 localhost systemd[1]: Mounting /var...
Jan 09 07:54:40 localhost mount[734]: mount: /foo: mount point does not exist.
Jan 09 07:54:40 localhost systemd[1]: Starting Rule-based Manager for Device Events and Files...
Jan 09 07:54:40 localhost systemd[1]: foo.mount: Mount process exited, code=exited, status=32/n/a
Jan 09 07:54:40 localhost systemd[1]: foo.mount: Failed with result 'exit-code'.
Jan 09 07:54:40 localhost systemd[1]: Failed to mount /foo.
Jan 09 07:54:40 localhost systemd[1]: Dependency failed for /foo/bar.
Jan 09 07:54:40 localhost systemd[1]: Dependency failed for Local File Systems.
Jan 09 07:54:40 localhost systemd[1]: Dependency failed for Mark the need to relabel after reboot.
Jan 09 07:54:40 localhost systemd[1]: selinux-autorelabel-mark.service: Job selinux-autorelabel-mark.service/start failed with result 'dependency'.
Jan 09 07:54:40 localhost systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
Jan 09 07:54:40 localhost systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
Jan 09 07:54:40 localhost systemd[1]: foo-bar.mount: Job foo-bar.mount/start failed with result 'dependency'.
say-paul commented 10 months ago

The workaround mentioned in 337 works: adding the following customization to the upgrade blueprint.

[[customizations.files]]
path = "/etc/systemd/system/remount-lvm.service"
data = "[Unit]\nDescription=remount lvm\nDefaultDependencies=no\n[Service]\nType=oneshot\nRemainAfterExit=yes\nExecStartPre=chattr -i /\nExecStart=mkdir -p /foo/bar\nExecStopPost=chattr +i /\n[Install]\nWantedBy=remote-fs.target\n"

[customizations.services]
enabled = ["remount-lvm.service"]
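
For readability, the escaped data string above corresponds to this unit file (blank lines added between sections):

[Unit]
Description=remount lvm
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=chattr -i /
ExecStart=mkdir -p /foo/bar
ExecStopPost=chattr +i /

[Install]
WantedBy=remote-fs.target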

It needs to be embedded inside osbuild-composer to ensure the LVs are mounted correctly.

7flying commented 10 months ago

The workaround mentioned in 337 works: adding the following customization to the upgrade blueprint.

[[customizations.files]]
path = "/etc/systemd/system/remount-lvm.service"
data = "[Unit]\nDescription=remount lvm\nDefaultDependencies=no\n[Service]\nType=oneshot\nRemainAfterExit=yes\nExecStartPre=chattr -i /\nExecStart=mkdir -p /foo/bar\nExecStopPost=chattr +i /\n[Install]\nWantedBy=remote-fs.target\n"

[customizations.services]
enabled = ["remount-lvm.service"]

It needs to be embedded inside osbuild to ensure the LVs are mounted correctly.

I wonder if we should add it transparently ourselves when filesystem customizations are needed so that the user doesn't need to remember that the service is needed.

cgwalters commented 10 months ago

I bet what's going on here is that the osbuild pipeline is only making these directories on top of the deployed disk image (i.e. the equivalent of anaconda %post) - and actually, because we're not using https://github.com/ostreedev/ostree/pull/3094, I bet we're losing the immutable bit on the deployment root /, which would have otherwise stopped this incorrect behavior.

The osbuild pipelines need to change to create this directory as part of the ostree commit instead.

say-paul commented 10 months ago

Okay. We have enabled filesystem customization for deployments (raw image, ISO) only. Image Builder actually consumes the base commit (which does not have any data about the LVM layout) to build the deployments with the LVM data. So, if osbuild's edge-commit and edge-container work the way @cgwalters suggested, then it would be a matter of enabling fs customization for commits as well. cc @achilleas-k The caveat is that it would no longer be possible to keep the same base image for various applications that require different LVs. I also suspect it will add complexity in terms of deployment and upgrade. @nullr0ute @runcom @7flying

nullr0ute commented 10 months ago

Okay. We have enabled filesystem customization for deployments (raw image, ISO) only. Image Builder actually consumes the base commit (which does not have any data about the LVM layout) to build the deployments with the LVM data.

If I understand this problem correctly, this is about creating the mount points in the filesystem for custom mounts (e.g. /data)?

say-paul commented 10 months ago

@nullr0ute It fails on upgrade: the directories get deleted and need to be created explicitly to mount the volume, otherwise the system does not boot and drops into the emergency shell.

nullr0ute commented 10 months ago

So based on the :+1:, whether it's an LVM partition (or NFS or some other storage) is irrelevant here. The user has to know where they want to mount things, and the mount point should be created as part of the ostree stage and not as part of the raw/ISO stage etc., otherwise, as @cgwalters says, it's going to be a problem.

How we fix that I don't know; maybe we need to interpret the blueprint for both stages, or maybe we need a "custom mount points" blueprint to ensure the directories are created (including the correct permissions).

Any other solution, such as custom systemd services, is just working around the problem and will no doubt cause other issues.

say-paul commented 10 months ago

So there are two things that are biting me:

  • Will the deployment (raw, ISO) blueprint need the filesystem customization again if it is already applied in the base commit itself?
  • Does the filesystem customization in the upgrade commit need to match that of the base commit?

I guess it can be tested out, but if @achilleas-k or @cgwalters you are already aware of the process, please share your thoughts.

7flying commented 10 months ago

So there are two things that are biting me:

  • Will the deployment (raw, ISO) blueprint need the filesystem customization again if it is already applied in the base commit itself?

I wouldn't think so in this case.

  • Does the filesystem customization in the upgrade commit need to match that of the base commit?

That is a good question

I guess it can be tested out but if @achilleas-k or @cgwalters you are already aware of the process, please share your thoughts.

Either way, we need to test this to decide how we are going to tackle the rest.

say-paul commented 10 months ago

So I tested it by bypassing the "fs customization not allowed" check for commit and container, but it didn't work out with the initial deployment nor with the upgrade commit. There are a few more things I need to test by tweaking the filesystem creation to see if I can make it work.

say-paul commented 10 months ago

I bet what's going on here is that the osbuild pipeline is only making these directories on top of the deployed disk image (i.e. the equivalent of anaconda %post) - and actually, because we're not using https://github.com/ostreedev/ostree/pull/3094, I bet we're losing the immutable bit on the deployment root /, which would have otherwise stopped this incorrect behavior.

Even if we apply them to the commit pipelines, I am a bit unsure whether it would be of much use, and a user may want different mount points from the same base commit.

How we fix that I don't know, maybe we need to interpret the blueprint for both stages, maybe we need a "custom mount points" blueprint to ensure the directories are created (including the correct permissions).

It won't be possible to keep both stages in sync, and a mismatch will render the system unbootable.

Possible solution :thinking: modify the osbuild deployment pipeline: layer the fs customization as a new commit, then build the raw/ISO image.

achilleas-k commented 10 months ago

So I tested it by bypassing the "fs customization not allowed" check for commit and container, but it didn't work out with the initial deployment nor with the upgrade commit.

Overriding the customization checker to pass these through doesn't help because the customizations have no effect on ostree commits. IB doesn't do anything with that customization when it's building a commit.

Which brings me to

Then it would be a matter of enabling fs.customization for commits also. cc @achilleas-k

What would that do? Is it just a matter of creating the directories? The user would have to know that they will have to add the partitions/mountpoints to the image blueprint as well, so we'd have to document this at least. Also the same fs customizations would have to be added to any upgrade blueprints for the same reason. Not an issue, just making sure I understand all the implications.

achilleas-k commented 10 months ago

Does running post-copy on the initial deployment solve the issue?

cgwalters commented 10 months ago

I haven't dug into this deeply but AIUI (from previous conversations) osbuild has an architecture where it wants to create a full final filesystem tree, then copy that to disk, and then it's at that point we'd run post-copy.

This means that we'd still have the problem that we allow uncontrolled mutation of the toplevel filesystem root for disk images.

The most robust way to fix that would be to try to better honor ostree's rules in pipeline builds. Specifically when generating disk images, in the default configuration we want to ensure that only /etc and /var are writable. Source of truth for things should be the ostree commit (future: container image).

achilleas-k commented 10 months ago

This means that we'd still have the problem that we allow uncontrolled mutation of the toplevel filesystem root for disk images.

Do we want to disallow this? Adding partitions and mountpoints to the root of ostree-based disk images was enabled somewhat recently as a feature requirement for edge. If this is in conflict with ostree's rules, or if it's "more correct" to move that configuration to the commit/container and block filesystem customizations on disk builds, then we should do that.

My main question is: What does a filesystem customization look like when applied to a commit? Is it just about creating directories? We've talked about some form of metadata describing a partition table for containers, but afaik there's nothing like that for the base ostree case that we're using now for R4E and Fedora IoT.

cgwalters commented 10 months ago

Do we want to disallow this? Adding partitions and mountpoints to the root of ostree-based disk images was enabled somewhat recently as a feature requirement for edge. If this is in conflict with ostree's rules

The rule basically is "the commit should be source of truth", with local state in /etc and /var.

Now, granted, we just added a giant ability to relax this in https://github.com/ostreedev/ostree/pull/3114 ...but that still has the semantic that content placed there is dropped on upgrades.

or if it's "more correct" to move that configuration to the commit/container and block filesystem customizations on disk builds, then we should do that.

I wouldn't say block all filesystem customizations on disk builds. Today for example, we (should) support configuring e.g. subdirectories of /var like /var/home just in a disk image, without requiring changes to the commit/image.

A simple way to look at this: for cases like Fedora CoreOS where the commit/image is shipped from upstream, we still allow users to choose their backing filesystem type and create sub-mounts of /var.

The problem case comes more about toplevel mounts.

I should also clarify that we enabled root.transient in current centos-bootc, which again basically avoids this issue as it allows uncontrolled mutation of / by default, so custom systemd units which mount other devices in a toplevel mount will generally Just Work.

Since this is about toplevel mounts, I'd today disallow them in disk image builds, unless root.transient is enabled. The subtlety in all this really wants us to use the same technology at build time that we do at upgrade time. (And aligning the container/runtime state was the rationale behind root.transient).
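
For context, a minimal sketch of what enabling root.transient looks like, assuming the usual ostree prepare-root configuration location:

# /usr/lib/ostree/prepare-root.conf (assumed path) - mounts / as a transient
# overlay, so runtime mutations of the root are allowed but discarded on reboot
[root]
transient = true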

say-paul commented 10 months ago

Does running post-copy on the initial deployment solve the issue?

No. I talked to alexlarsson regarding this, but it does not help our cause.

say-paul commented 10 months ago

So even if we add a directory in the commit, it will be lost on upgrade if the upgrade blueprint doesn't have the same fs/directory customization.

So unless we have a fix for 337, I think we can:

  1. add the above service automatically to the blueprint when fs customization is added (the user can see this using composer-cli blueprints show blueprint.toml), or
  2. use something like greenboot to create the directories

7flying commented 10 months ago

So even if we add a directory in the commit, it will be lost on upgrade if the upgrade blueprint doesn't have the same fs/directory customization.

So unless we have a fix for 337, I think we can: 1. add the above service automatically to the blueprint when fs customization is added (the user can see this using composer-cli blueprints show blueprint.toml), or 2. use something like greenboot to create the directories.

I think that we have already internally discussed with the team that greenboot has a defined use case and that it mustn't deviate from that role.

runcom commented 10 months ago

I think that we have already internally discussed with the team that greenboot has a defined use case and that it mustn't deviate from that role.

agreed, greenboot's scope doesn't include things like this and shouldn't be leveraged

say-paul commented 10 months ago

Thoughts regarding option 1?

File and service customizations are not allowed in the deployment stages, so the effective solution would be to embed a generic service (a template unit file that creates the directories for the filesystem mount points) in the commit.
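
A minimal sketch of what such a template unit could look like (the unit name, escaping scheme, and ordering below are assumptions, not an agreed design); each required mount point would be enabled as an instance, e.g. ensure-mountpoint@foo-bar.service for /foo/bar:

# hypothetical /usr/lib/systemd/system/ensure-mountpoint@.service
[Unit]
Description=Create mount point %f
DefaultDependencies=no
Before=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
# the deployment root is immutable by default, so drop the bit, create the
# directory, then restore it (the same trick as the workaround above)
ExecStartPre=chattr -i /
ExecStart=mkdir -p %f
ExecStopPost=chattr +i /

[Install]
WantedBy=local-fs.target

Ordering relative to the individual .mount units would still need to be worked out; this only shows the shape of the idea.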

7flying commented 10 months ago

Thoughts regarding option 1?

File and service customizations are not allowed in the deployment stages, so the effective solution would be to embed a generic service (a template unit file that creates the directories for the filesystem mount points) in the commit.

I don't consider adding a service file fragile at all, so I don't mind taking that path, but I would add it in the deployment. Also, as I said above, we wouldn't be adding it with the blueprint (the user shouldn't be required to know that the service file is needed), but rather detect that the user is requesting filesystem customizations in the deployment blueprint and add the required service file for those to work internally, in a transparent way.

cgwalters commented 10 months ago

My main question is: What does a filesystem customization look like when applied to a commit? Is it just about creating directories?

Yes. Except...there's also the approach of moving the partition creation to firstboot, and injecting logic to do that into the commit. This is the approach taken by Ignition and systemd-repart. In general I prefer that approach because it keeps the disk image simpler - or stated another way, it more strongly enforces the decoupling of the commit (container image) and disk image.

If there's a way to enforce that the corresponding toplevel directories exist in the commit when specifying disk customizations, that seems like a decent fix. In a container flow that's basically just podman run <input> test -d /mntpoint; with ostree commits as input there's ostree ls etc. that works on a fetched commit.
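
A rough sketch of such a check (the image reference, repo path, and commit ref below are placeholders):

# container flow: fail early if the mount point directory is missing from the image
podman run --rm <input-image> test -d /foo/bar

# ostree flow: list the path in a fetched commit; a non-zero exit means it is absent
ostree --repo=/path/to/repo ls <commit-ref> /foo/bar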

say-paul commented 10 months ago

but rather detect that the user is requesting filesystem customizations in the deployment blueprint and add the required service file for those to work internally in a transparent way.

The transparent way, I imagine, is to magically make it appear in blueprint show when the user adds the fs customization. Or it can be added to the logs when creating the image - somewhat translucent.

...there's also the approach of moving the partition creation to firstboot,

That's the idea: to be implemented by a unit file that creates the mount point.

7flying commented 10 months ago

but rather detect that the user is requesting filesystem customizations in the deployment blueprint and add the required service file for those to work internally in a transparent way.

The transparent way, I imagine, is to magically make it appear in blueprint show when the user adds the fs customization. Or it can be added to the logs when creating the image - somewhat translucent.

No, just adding it internally, programmatically; if we show a blueprint that is different from what the user has requested, that is going to cause problems too. When we see in the blueprint that filesystem customizations are added, we also include the service file internally.

runcom commented 10 months ago

@achilleas-k @ondrejbudai just FYI - do you have any input/recommendations, given the way we build the commit/disk in osbuild and the suggestions above?

say-paul commented 10 months ago

So the approach we decided on is:

When the user adds an fs customization in the deployment blueprint, osbuild will add and enable a unit file, something like the following, in the background.

The workaround mentioned in 337 works: adding the following customization to the upgrade blueprint.

[[customizations.files]]
path = "/etc/systemd/system/remount-lvm.service"
data = "[Unit]\nDescription=remount lvm\nDefaultDependencies=no\n[Service]\nType=oneshot\nRemainAfterExit=yes\nExecStartPre=chattr -i /\nExecStart=mkdir -p /foo/bar\nExecStopPost=chattr +i /\n[Install]\nWantedBy=remote-fs.target\n"

[customizations.services]
enabled = ["remount-lvm.service"]

It needs to be embedded inside osbuild-composer to ensure the LVs are mounted correctly.

The question is how we enable it:

  1. keep an [Install] section - it can be easily disabled via systemd
  2. remove the [Install] section and create a symlink in remote-fs.target.wants/ - seems a bit more robust (see the sketch below)
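
For illustration, option 2 would amount to the build pipeline creating the symlink itself instead of relying on systemctl enable (unit name and paths are the ones from the workaround above):

mkdir -p /etc/systemd/system/remote-fs.target.wants
ln -s /etc/systemd/system/remount-lvm.service \
      /etc/systemd/system/remote-fs.target.wants/remount-lvm.service
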
achilleas-k commented 10 months ago

I would say doing it "properly" (with an [Install] section) is preferable. I agree, it means it can be disabled and break the system, but it also means it can be disabled the right way if necessary (like if the user decides they don't want/need the partition anymore). We would have to document this of course, but I'd rather it behaved like any other unit.

On the topic of creating the unit itself: I would be in favour of making an osbuild stage for this. A very specific stage that creates this service file for a set of mountpoints. But we can do it with file customizations in this repository first, for development, testing, and to get it out quickly, and move it down to osbuild later.

achilleas-k commented 10 months ago

there's also the approach of moving the partition creation to firstboot, and injecting logic to do that into the commit. This is the approach taken by Ignition and systemd-repart. In general I prefer that approach because it keeps the disk image simpler - or stated another way, it more strongly enforces the decoupling of the commit (container image) and disk image.

I think we should do this. On the composer level it would mean enabling filesystem customizations for ostree commit types. But we would have to consider what to do with the same customizations for deployment types (disks). Do we keep them? Do we handle conflicts? Or should we deprecate them and inform users that they should be putting their partitioning info in the commit?

runcom commented 10 months ago

Or should we deprecate them and inform users that they should be putting their partitioning info in the commit?

This, to me, is really counter-intuitive :/ I understand it could be the way to go, but I kind of expect partitioning decisions to be made in the deployment, right? Anyway, I don't dislike it completely; if this is something the osbuild team is in favor of doing, we can adapt, I think. cc @mrguitar

achilleas-k commented 10 months ago

I understand it could be the way to go, but I kind of expect partitioning decisions to be made in deployment right?

Don't get me wrong, I'm more on this side too. It makes more sense to me as well that given a base ostree commit or container, you can deploy it in any number of scenarios with different partitioning layouts.

I'm trying to reconcile the user experience with the technology decisions we're making here (so I too would love to hear what Ben has to say). I admit I'm not completely aware of what expectations we've created for users and how we can change that (or even if we should). But I do think we should be thinking ahead about what it means when we make these changes.

So, more concretely: If the decision here is "partitioning decisions should be made at ostree base image creation time", what does that mean for the existing image building process of:

  1. Create ostree commit
  2. Create disk image
  3. Create upgrade commit
  4. Upgrade system

Questions:

  1. Is partitioning at step 2 still supported? Will it be something like "Choose mountpoints in step 1, but create partition sizes and filesystems in step 2"? How do we reconcile those two steps to make sure builds don't fail when the two configurations are incompatible?
  2. Should the mountpoints be preserved in the step 3 build configuration? Do mountpoints get removed if they're not included? (Similar to what happens with users)
  3. Do we need to figure out a mechanism that preserves mountpoints in step 3 given the original commit from step 1 as a "parent" (again, like we do with uids and gids)?

I could probably come up with more qs but that's a good starter for now I think :laughing:

cgwalters commented 10 months ago

The thing is: We can't change user partitions across upgrades by default no matter what. So remember, even if partitioning is specified in a disk image today, in-place upgrades won't get any changes made there.

This architecture is pretty clear with e.g. Anaconda and kickstart - Anaconda is only used once, and not thereafter. Ignition is also designed to run just once partly for this reason, but Ignition also does support "initialize idempotently" - i.e. if the partition already exists in the expected format, it is reused (and hence data is preserved).

The x-systemd.makefs option has existed for a really long time in systemd .mount units and has a similar semantic. What exists only much more recently is systemd-repart, which pairs with that to handle partition creation dynamically. Now, systemd-repart is its own universe. In theory, both could be used today without osbuild doing anything, just by having the user inject the configuration to do so into the filesystem tree.

I don't have a strong opinion about this; if we were to take a dependency on repart it'd need some analysis. It could also be done by injecting custom systemd units to create the partitions, and mount units using x-systemd.makefs.
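
As a rough illustration of those two mechanisms (device names, sizes, and paths below are illustrative, not taken from this issue):

# systemd-repart drop-in, e.g. /usr/lib/repart.d/50-foo.conf: creates the
# partition on first boot if it does not exist yet
[Partition]
Type=linux-generic
Label=foo
Format=xfs
SizeMinBytes=8G

# fstab entry (or the equivalent .mount unit option) that creates the filesystem
# on first mount via x-systemd.makefs
/dev/disk/by-partlabel/foo  /foo  xfs  defaults,x-systemd.makefs  0 0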

Is partitioning at step 2 still supported?

It will probably have to continue to be if it was before, right?

Will it be something like "Choose mountpoints in step 1, but create partition sizes and filesystems in step2"?

No, all that state would best be computed on firstboot (as is done with ignition and repart).

How do we reconcile those two steps to make sure builds don't fail when the two configurations are incompatible?

In the end, moving the logic to firstboot necessarily implies some difficulty in trying to make that a "build" time check. Honestly I would focus more on making it convenient for people to have an edit-compile-debug cycle (where "debug" means booting for real).

Should the mountpoints be preserved in the step 3 build configuration? Do we need to figure out a mechanism that preserves mountpoints in step 3 given the original commit from step 1 as a "parent" (again, like we do with uids and gids)?

You're asking what happens if a user specifies a mountpoint in a blueprint for creating a commit at one point, and then removes it later? I'd expect the mountpoint to drop out, yes.

say-paul commented 10 months ago

Having the filesystem customization in the commit comes with difficulties:

  1. It cannot be used for multiple applications requiring different filesystems.

you can deploy it in any number of scenarios with different partitioning layouts.

won't be possible

  2. Decoupling the initial commit and upgrade commit blueprints won't be possible, as any mismatch in mount points will cause the system not to boot.

I don't have a strong opinion about this; if we were to take a dependency on repart it'd need some analysis. It could also be done by injecting custom systemd units to create the partitions, and mount units using x-systemd.makefs.

I would prefer repart or x-systemd.makefs over a custom systemd unit; they are probably more tried and tested. repart's implementation seems straightforward, but the analysis will be to check whether the other mount points remain intact. Modifying /etc/fstab can be done too, but I am not sure about all the params - that needs some more learning on my part.

say-paul commented 10 months ago

399 should fix the issue.