osbuild / osbuild

Build-Pipelines for Operating System Artifacts
https://www.osbuild.org
Apache License 2.0

Creating ostree images with Unified Core #1457

Open achilleas-k opened 12 months ago

achilleas-k commented 12 months ago

While adding --unified-core support to osbuild, I ran into SELinux issues when running rpm-ostree compose postprocess with --unified-core enabled. I suspect it's related to how we run stages in containers.

 org.osbuild.ostree.preptree: 58f81d8fea42b90480c4eaa80a401d57428732980802b45c79603ba136d1da05 {
   "etc_group_members": [
     "wheel",
     "docker"
   ],
   "unified-core": true
 }
 /usr/lib/tmpfiles.d/journal-nocow.conf:26: Failed to resolve specifier: uninitialized /etc/ detected, skipping.
 All rules containing unresolvable specifiers will be skipped.
 Failed to open file "/sys/fs/selinux/checkreqprot": Read-only file system
 ostree: machineid-compat: True
 Moving tree to temporary root
 Initializing new root filesystem
 Moving data back from temporary root
 No embedded whiteouts found
 Recompiling policy
 libsemanage.semanage_make_sandbox: Could not copy files to sandbox /var/lib/selinux/targeted/tmp. (Read-only file system).
 semodule:  Could not begin transaction:  Read-only file system
 error: Finalizing rootfs: bwrap(semodule): Child process killed by signal 1
 Traceback (most recent call last):
   File "/run/osbuild/bin/org.osbuild.ostree.preptree", line 186, in <module>
     r = main(args["tree"], args["options"])
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/run/osbuild/bin/org.osbuild.ostree.preptree", line 179, in main
     subprocess.run(["rpm-ostree", "compose", "postprocess", *args,
   File "/usr/lib64/python3.12/subprocess.py", line 571, in run
     raise CalledProcessError(retcode, process.args,
 subprocess.CalledProcessError: Command '['rpm-ostree', 'compose', 'postprocess', '--unified-core', '/run/osbuild/tree', '/tmp/tmp8phicw60.json']' returned non-zero exit status 1.

The relevant bits are:

 Recompiling policy
 libsemanage.semanage_make_sandbox: Could not copy files to sandbox /var/lib/selinux/targeted/tmp. (Read-only file system).
 semodule:  Could not begin transaction:  Read-only file system
 error: Finalizing rootfs: bwrap(semodule): Child process killed by signal 1

/var/lib/... being read-only is clearly the problem, but what I don't know is whether this is a /var that's created and managed by rpm-ostree or our own stage's /var.
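One way to narrow this down would be to check, from inside the stage, which paths actually sit on a read-only mount. The following is only a minimal sketch: the helper name `is_read_only` is hypothetical and not part of osbuild, and it relies on the standard `os.statvfs` call and the `os.ST_RDONLY` flag available on Linux.

```python
import os


def is_read_only(path: str) -> bool:
    """Return True if the filesystem containing `path` is mounted read-only."""
    # f_flag holds the mount flags of the filesystem that contains `path`.
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)
```

Calling it on the tree's /var/lib/selinux both inside and outside the bwrap sandbox might show which layer introduces the read-only mount.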

From what I understand, one of the things that unified core does is run certain tasks in containers, so there's clearly an overlap and source of potential conflicts there.

achilleas-k commented 11 months ago

Looking more into Unified Core and I realise now (which I should have realised sooner) that this option doesn't create different system trees or artifacts, but it does things differently, more safely and sanely. A big part of that (as I understand it) is containerising some of the actions performed when composing the ostree commit, which we already do and is the source of the issue. It makes me wonder if this is worth the effort if we're already doing things correctly in osbuild.

I understand that there's a desire to drop the old code paths from rpm-ostree and make UC the only way to compose commits, but is it worth locking out (or at least making it difficult to use) the cases where the environment is already sane?

I'm going to keep looking into this, maybe I misunderstood some things and this is in fact not hard to get working with the way we do things in osbuild. But given the above, I don't think this should be a high priority unless the non-UC way of composing is going away very soon.

@cgwalters @runcom wdyt?

runcom commented 11 months ago

There's a now-old issue somewhere about unified-core, and I thought the culprit was around RPMs; perhaps it's related to https://github.com/osbuild/osbuild/issues/1134 (?). I may be completely mistaken. Also, @paulwhalen experienced some SELinux issues too while switching Fedora IoT to unified core.

also cc @travier

miabbott commented 11 months ago

> ...I don't think this should be a high priority unless the non-UC way of composing is going away very soon.

My fear is exactly that; the non-UC mode of composing ostree commits is nearly deprecated and isn't being fully tested (or at all?) as the ostree stack changes. Based on discussions with the CoreOS folks, they aren't interested in continuing to support the non-UC path and have a desire to unify the ostree-based operating systems around the UC path. (Using the UC path also allows ostree-based operating systems to adopt the use of bootupd for bootloader updates, which is a large gap in Fedora IoT/R4E right now.)

The right folks from the CoreOS team have been tagged here, so I'll let them weigh in with any corrections or additional context.

travier commented 11 months ago

The unified core mode goes from RPMs to ostree commit without any steps in the middle, so you cannot install the RPMs, do some postprocessing, etc., as the current ostree mode does. It needs to go directly from the rpm-ostree manifest + RPMs to the ostree commit.

Ideally, we would directly support only rpm-ostree compose image instead of doing this in multiple steps.

cgwalters commented 11 months ago

> Looking more into Unified Core and I realise now (which I should have realised sooner) that this option doesn't create different system trees or artifacts, but it does things differently, more safely

Right, like running each %post script in its own container. Now honestly, it's a lot of reinvention of stuff. It's had some benefits but clearly is a huge maintenance burden and code not shared with dnf or osbuild.

> (Using the UC path also allows ostree-based operating system to adopt the use of bootupd for bootloader updates,

There's not some proprietary secret sauce though in rpm-ostree; I think ultimately the only load-bearing thing unified core mode does differently here (related to https://github.com/coreos/bootupd/issues/441 ) is basically that in unified core mode rpm-ostree ensures that the shim/grub RPM files that install into /boot are moved into /usr/lib/ostree-boot instead. Adding a bootupd stage into the existing osbuild logic would likely literally just be mkdir -p /usr/lib/ostree-boot && mv /boot/* /usr/lib/ostree-boot or so after the RPM installs are done.

Or I guess it'd likely make even more sense to just stick that code into bootupd too.
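The relocation described above could be sketched in Python roughly as follows. This is an illustration only, not an existing osbuild stage: the function name `move_boot_to_ostree_boot` and the `tree` argument are hypothetical, and unlike the shell glob `/boot/*` this version also moves hidden entries.

```python
import shutil
from pathlib import Path


def move_boot_to_ostree_boot(tree: str) -> list[str]:
    """Move everything under <tree>/boot into <tree>/usr/lib/ostree-boot."""
    boot = Path(tree) / "boot"
    dest = Path(tree) / "usr" / "lib" / "ostree-boot"
    # Equivalent of `mkdir -p /usr/lib/ostree-boot`, relative to the tree.
    dest.mkdir(parents=True, exist_ok=True)

    moved: list[str] = []
    if not boot.is_dir():
        return moved  # nothing was installed into /boot

    # Equivalent of `mv /boot/* /usr/lib/ostree-boot`, one entry at a time.
    for entry in sorted(boot.iterdir()):
        shutil.move(str(entry), str(dest / entry.name))
        moved.append(entry.name)
    return moved
```

It would have to run after the RPM installs are done and before the commit is made, as suggested in the comment.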

achilleas-k commented 11 months ago

> > Looking more into Unified Core and I realise now (which I should have realised sooner) that this option doesn't create different system trees or artifacts, but it does things differently, more safely
>
> Right, like running each %post script in its own container. Now honestly, it's a lot of reinvention of stuff. It's had some benefits but clearly is a huge maintenance burden and code not shared with dnf or osbuild.

And that is essentially what my question/comment hinges on. It sounds like in osbuild's case there's no technical need for unified core, and it seems bootupd doesn't strictly depend on it either. So the main issue remains that the non-unified-core paths are deprecated, untested, and soon to be removed (?), which brings me to a follow-up question: Is there any potential for a more fine-grained/separated workflow that would make composing with rpm-ostree with unified core easier to work with in environments that don't need (or are incompatible with) some of the steps in the whole process?

Or maybe, a more general question: does this ability already exist when running things using the "Granular tree compose" (https://github.com/coreos/rpm-ostree/blob/main/docs/compose-server.md#granular-tree-compose-with-installpostprocesscommit) like we do in osbuild? I'd love to get an overview of what unified core does differently so we can figure out where the issue I'm seeing here is coming from, but also what's happening differently at every stage. For reference, when we're composing ostree commits in osbuild we do (very roughly):

  1. rpm install into a tree.
  2. rpm-ostree compose postprocess on the tree.
  3. rpm-ostree compose commit

The postprocess stage fails with --unified-core, but the commit stage doesn't (if postprocess is first run without it).
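For step 2, the command line the stage ends up running can be reconstructed from the traceback in the first comment. The sketch below only builds the argument list and does not execute anything; the helper name `postprocess_cmd` is hypothetical, and the argument order is taken from that traceback rather than from the stage's source.

```python
def postprocess_cmd(tree: str, treefile: str, unified_core: bool = False) -> list[str]:
    """Build the argv for `rpm-ostree compose postprocess` as seen in the traceback."""
    cmd = ["rpm-ostree", "compose", "postprocess"]
    if unified_core:
        # This is the flag that triggers the semodule failure reported above.
        cmd.append("--unified-core")
    cmd += [tree, treefile]
    return cmd
```

With `unified_core=False` the flag is simply omitted, which matches the workaround of running postprocess without it before the commit step.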

I feel like we're having a similar conversation to what was discussed here (since Antonio linked it earlier): https://github.com/osbuild/osbuild/issues/1134#issuecomment-1281459308

cgwalters commented 11 months ago

> So the main issue remains that the non-unified-core paths are deprecated, untested,

We can also just choose to start testing the osbuild path more in upstream rpm-ostree, it's just work but that'd probably close off the main concerns.

> and soon to be removed (?)

We can't drop it in the near future I'd say. And ultimately the "unified core" path is hard to replicate in an existing container. What seems most sustainable long term is to drive functionality we need more directly into dnf/libdnf/rpm and the scripts, and have a common "postprocess the filesystem tree" tool.

achilleas-k commented 11 months ago

Given all that, and the current priorities, my understanding is that we can push Unified Core way down the priority list for osbuild. It also seems like UC really isn't a requirement for making images "properly" or making them compatible with bootupd, which was the source of all the major concerns for me.

> What seems most sustainable long term is to drive functionality we need more directly into dnf/libdnf/rpm and the scripts, and have a common "postprocess the filesystem tree" tool.

Now that would be great!