containers / bootc

Boot and upgrade via container images
https://containers.github.io/bootc/
Apache License 2.0

post-pull pre-stage stage #640

Open cgwalters opened 2 weeks ago

cgwalters commented 2 weeks ago

This relates to:

Basically...I think it'd be a powerful general feature if we supported a flow that did:

One simple way to do this would be to define a new bootc-pre-stage.target systemd unit that runs after the pull and before staging, i.e. in between the steps that currently happen when one types bootc upgrade.
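To make the intended ordering concrete, here is a minimal sketch of the flow in Python. It is not bootc's actual code: `upgrade`, `pull`, `pre_stage_hooks` and `stage` are hypothetical names, and the rule that a failing hook aborts before staging is an assumption, not something decided in this issue.

```python
from typing import Callable, Iterable, List


def upgrade(
    pull: Callable[[], str],
    pre_stage_hooks: Iterable[Callable[[str], None]],
    stage: Callable[[str], None],
) -> List[str]:
    """Run pull -> pre-stage hooks -> stage and return the phases that completed.

    Every name here is hypothetical; this only illustrates the ordering.
    """
    completed: List[str] = []

    image = pull()  # fetch the new image; nothing is staged yet
    completed.append("pull")

    # Stands in for the units that would be pulled in by bootc-pre-stage.target.
    for hook in pre_stage_hooks:
        hook(image)  # assumption: an exception here aborts before staging
    completed.append("pre-stage")

    stage(image)  # only now is the new deployment queued for the next boot
    completed.append("stage")
    return completed
```

A hook could, for example, pull the container images a base image declares, or run a compatibility check and raise to block the update.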

This would clearly be a very powerful general mechanism that would allow implementing things like "logically bound container images" outside of bootc core code itself. A base image (or user code) could define which container images to pull via whatever mechanisms and file format it wants.

One could also do arbitrary things such as compatibility checks (relates to #632 and #610).

The downside of course is that being so general, it'd be easy to use for things that would probably be best done elsewhere. I think we'd still eventually want higher level and more declarative/opinionated mechanisms for some of the problems here (especially the container binding one).

(This also tangentially relates to https://github.com/containers/bootc/issues/2 in that it'd probably be a bit more elegant if we internally split up bits of the bootc upgrade process into units.)

But...in ostree we already merged e.g. https://github.com/ostreedev/ostree/pull/2569 which is currently a very special case.

Actually, a notable detail here is that bootc-pre-stage.target as proposed would get mutable access to the current /etc and the global /var, i.e. it'd be ordered before ostree-finalize-staged.service.

ckyrouac commented 2 weeks ago

Interesting idea! It makes a lot of sense to me in general. One concern I have is around the shared state in /var and /etc. This might not be a significant issue in practice. In theory, the new image might have incompatibilities with the existing state in /var, e.g. a major version upgrade of podman changes how images are stored in /var/lib/containers. I can also envision a scenario where an upgrade is staged, the bootc-pre-stage code runs, creates some new state in /var, then the upgrade is cancelled. This would leave some possibly broken state in /var without some kind of a teardown/cleanup script.

cgwalters commented 2 weeks ago

> I can also envision a scenario where an upgrade is staged, the bootc-pre-stage code runs, creates some new state in /var, then the upgrade is cancelled. This would leave some possibly broken state in /var without some kind of a teardown/cleanup script.

I think anything executed here should be idempotent (executing it more than once is equivalent to executing it once) and crash safe (i.e. it can be killed at any point). That's how everything in the bootc/ostree stack is designed; i.e. if the kernel panics in the middle of an OS update, you only have image A or B, never a half-written mess. This can be a tricky property to achieve with arbitrary code. Actually, historically podman pull has had bugs in this respect, and we probably need to re-audit its code in light of the work on https://github.com/containers/bootc/pull/215
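As an illustration of what "idempotent and crash safe" can look like for a hook that writes state into /var, here is a minimal Python sketch of the usual write-to-temp-then-rename pattern. `write_atomically` is a hypothetical helper, not part of bootc or ostree, and it assumes a POSIX filesystem where rename within one directory is atomic.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or the new content, never a mix.

    Hypothetical helper for illustration; assumes POSIX rename semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # data reaches disk before the rename exposes it
        os.replace(tmp, path)  # atomic swap: being killed before this leaves `path` untouched
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise

    # fsync the directory so the rename itself survives a power loss.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling it twice with the same data leaves the same result as calling it once. A process killed mid-write can leave a stray `.tmp-*` file behind, so a real hook would also want to sweep those on its next run.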

vrothberg commented 1 week ago

I think we should consider using additional image stores.

All "pinned" images of one given bootc image can be pulled in a separate store. The store can then be made accessible to Podman by adding it to the storage.conf (which now supports drop-in files).

When bootc-updating to a new bootc image, the new pinned images can be pulled to yet another store. After a successful pull (or during boot), bootc can switch the additional image storage in storage.conf to the new one.

That should give us the desired idempotent behavior along with the benefit of the "pinned" images being read-only for Podman.

Removing old images would then boil down to rm -rf-ing the old store.
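A rough sketch of the switch-and-clean-up part of this idea, in Python. The function names, the drop-in file location, and the one-directory-per-store layout are all assumptions made for illustration; only the `additionalimagestores` key under `[storage.options]` comes from storage.conf itself. Pulling the pinned images into the new store is left out.

```python
import os
import shutil
import tempfile


def switch_image_store(dropin_path: str, new_store: str) -> None:
    """Atomically point the storage.conf drop-in at `new_store`.

    `dropin_path` is a hypothetical location chosen by the caller.
    """
    content = (
        "[storage.options]\n"
        f'additionalimagestores = ["{new_store}"]\n'
    )
    directory = os.path.dirname(os.path.abspath(dropin_path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    with os.fdopen(fd, "w") as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())
    # Rename over the old drop-in: Podman sees the old store or the new one.
    os.replace(tmp, dropin_path)


def remove_old_stores(stores_root: str, keep: str) -> None:
    """Delete every store directory under `stores_root` except `keep`.

    Assumes one sub-directory per store (an illustrative layout).
    Safe to re-run: stores that are already gone are simply not listed.
    """
    keep_abs = os.path.abspath(keep)
    for name in os.listdir(stores_root):
        path = os.path.join(stores_root, name)
        if os.path.isdir(path) and os.path.abspath(path) != keep_abs:
            shutil.rmtree(path)
```

Doing the switch before the removal means an interruption between the two steps leaves at worst an unused old store on disk, which the next run removes.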

cgwalters commented 1 week ago

(This issue is about the generic "run arbitrary code in new root" topic - WDYT about keeping https://github.com/containers/bootc/issues/128 focused on the "logically bound" container topic?)

I cleaned up #128 to clarify it's now just about "logically bound", as distinct from physically bound.

vrothberg commented 1 week ago

SGTM