podman machine alignment with bootc install

cgwalters commented 1 year ago

Feature request description

Basically switch podman machine to:

Download a minimal live ISO once
When we want to deploy a custom image, we boot the ISO, then it runs podman run --privileged <container> bootc install /dev/vda (after a locally invoked qemu-img create e.g.)

rhatdan commented 1 year ago

@baude @vrothberg @mheon @umohnani8 @n1hility Thoughts?

n1hility commented 1 year ago

@cgwalters interesting. are you guys thinking of dropping platform/hypervisor image builds in fcos long term, in favor of this containerized fetch model?

cgwalters commented 1 year ago

Well...perhaps more that this would be a new thing that isn't FCOS, actually.

rhatdan commented 1 year ago

We want to make it easy for users to be able to use different kinds of podman machines. RHIVOS for one, potentially RCOS, and many home grown ones. Currently it is difficult to build these and we want to make it simpler.

cgwalters commented 1 year ago

@baude also re having external OS updates https://groups.google.com/g/kubevirt-dev/c/K-jNJL_Y9bA/m/ZTH78OqFBAAJ

cgwalters commented 1 year ago

We had a realtime chat about a bunch of issues related to this. A few notes:

Short term I think podman needs to grow a download for e.g. Mac that includes both the podman binary and the qcow2 as a single unit; this would help with "offline mirroring" situations and improve user perception of latency.

Long term, I think it'd be better to switch to a more minimal (Sagano) derived image and the ISO per above.

rhatdan commented 1 year ago

We need to start a requirements doc for what Podman Machine needs. A primary requirement would be to make sure the podman client is using a machine with a matching podman service. podman 4.6 talks to podman-service 4.6 within the VM.

This means that as soon as podman 4.7 is released, podman-machine 4.7 is available.

Similar podman machine/podman desktop need a mechanism to warn users if they are using an out of date podman-machine. Currently users are only updating ad-hoc, when they destroy and re-add the machine. This means users could run the same podman-machine for years without updating.

rhatdan commented 1 year ago

@containers/podman-desktop-maintainers FYI @containers/podman-maintainers FYI

cgwalters commented 1 year ago

Right sorry, the other thing I meant to say is that right now, podman could/should switch to shipping quay.io/podman/fcos-podman:4.7 e.g. that is a server side derived image. Then we can drop the ExecStart=/usr/bin/rpm-ostree override remove moby-engine in Ignition for example, and switch to just doing a rebase on firstboot.

(We could continue to apply-live those changes...but there will very likely be more skew between the disk image and container image, and doing an apply-live flow on that may not always be reliable)

cgwalters commented 1 year ago

What's the infrastructure to build containers for the github.com/containers org today? Is it https://github.com/containers/automation_images ?

(Hmm...one thing definitely related to this of course is that that tooling could itself be switched over to build derived bootable images too for the server side builders...)

rhatdan commented 1 year ago

@cevich PTAL

cevich commented 1 year ago

What's the infrastructure to build containers for the github.com/containers org today?

What's in containers/automation_images today is 90% geared toward building CI VM images. I'm toying with some container building stuff there, but honestly it's not really a great place for it. We should have a dedicated repo w/ clean/fresh/un-muddied/simple .cirrus.yml otherwise it may quickly become too complex to maintain.

Anyway, deciding where to build should be driven largely by how the builds need to be triggered:

If it absolutely must happen directly as the result of podman branch-push, the least-worst option today is github-actions. However, it's ssslllloooowwwwww even w/o emulation. We're at Microsoft's mercy (and generosity) because I'm loathe to maintain custom (safe and secure) runners somewhere :vomiting_face:
If it can be triggered by webhook, Cirrus-API call, or periodically (5m granularity), we can build using Cirrus-CI on literally whatever infra. we throw at it. The caveat being, no actual build (for public-use) can happen from the podman/buildah/skopeo repos because (as I found :cry:) the push-secrets aren't secure there.
There are probably other CI systems that can handle either/both of the above, but they bring an added up-front setup & implementation effort. Not ruling them out, just saying the extra effort may not be compatible with timing needs (is it ever not ASAP?) :smile:

Bonus chapter:

WRT testing the image in PRs (does it build) and do podman-machine tests break with it. That probably has to happen in our existing Cirrus-CI setup. Though this is a somewhat easier problem to solve, since (hopefully/probably) the test-built images don't need to be pushed anywhere. Point is, how the images are built may be important if it needs to run in multiple contexts. i.e. we cannot easily re-use github-actions under Cirrus-CI. So bash-driven build scripts would be preferred.

cgwalters commented 1 year ago

What's in containers/automation_images today is 90% geared toward building CI VM images.

But it is the thing that builds https://quay.io/repository/buildah/stable - no?

cevich commented 1 year ago

But it is the thing that builds https://quay.io/repository/buildah/stable - no?

Maybe by the end of the week? Hopefully? I'm not thrilled with doing it there since it complicates and already complex .cirrus.yml.

So esp. if there are other image builds needed, I'd prefer to have a fresh and clean repo. Maybe with a nice/helpful README and some PR-triggered test-builds.

The only reason I can think of against this (as mentioned in my book above) is if there's a need to trigger exactly 1:1 based on merges to podman main.

cgwalters commented 1 year ago

The only reason I can think of against this (as mentioned in my book above) is if there's a need to trigger exactly 1:1 based on merges to podman main.

I can't canonically speak for podman but I am pretty sure there's no such requirement; the main goal would be tags for stable releases and a rolling main+latest pair.

What's in containers/automation_images today is 90% geared toward building CI VM images. I'm toying with some container building stuff there, but honestly it's not really a great place for it.

OK, right. I had that impression. On the coreos side we also have these Jenkins jobs of which there are two that build container images (but for multi-arch!) e.g. this one which is so much code to just build a basic container image. AFAICS the containers/automation_images uses full emulation, but the CoreOS team has set up dedicated multi-arch workers (for a few reasons; but basically the arch is a lot more relevant for the base OS).

As we (ideally) look to align the containers/ and coreos/ github organizations and teams a bit more, it probably makes sense to figure out how to share infrastructure and tooling here.

cevich commented 1 year ago

which is so much code to just build a basic container image

IKR! It's kind of absurd how much is needed.

AFAICS the containers/automation_images uses full emulation.

Yes, emulated builds are for-sure less than ideal for many reasons.

As we (ideally) look to align the containers/ and coreos/ github organizations and teams a bit more, it probably makes sense to figure out how to share infrastructure and tooling here.

Urvashi is finishing up a podman farm feature, I'm sure that would be super-duper helpful in doing distributed builds. Though it's pretty new and I'm not certain what automation pit-falls there might be, for example colliding base-image pulls.

Anyway, are your Jenkins jobs a good place for this image build, or were you also looking to offload that somewhere? Unless your builders are accessible from the internet, I'd be limited to x86 and arm builders. I'm slightly okay with setting up/maintaining a build-farm, but would prefer not to.

cgwalters commented 1 year ago

Anyway, are your Jenkins jobs a good place for this image build, or were you also looking to offload that somewhere?

Given the toplevel goal here is that container images are generated that are lifecycled to podman, under its own control and tooling, it'd be rather ironic if we said it should be built in the CoreOS jenkins :smile:

ISTM what we more really want is for the non-boot containers (e.g. coreos-assembler today) to be generated by a container build system that's shared with other teams. Then, the podman-boot image that derives from FCOS can be built in that place too.

n1hility commented 1 year ago

So esp. if there are other image builds needed, I'd prefer to have a fresh and clean repo. Maybe with a nice/helpful README and some PR-triggered test-builds.

+1 IMO new builds are much easier to advance rapidly on when you can isolate them with their own setup. You can always merge them back wherever if needed.

The only reason I can think of against this (as mentioned in my book above) is if there's a need to trigger exactly 1:1 based on merges to podman main.

I think there is a 1:1 need, but it can be distilled into ideally when testing runs you want the machine used for testing to include the CI main built version of podman. This doesnt necessarily equate though to requiring a full image, it could be appropriate to just apply a freshly built podman package on top of the image, or some other manual layer override for the purpose of testing (at the end of the day you are really just testing a podman linux binary + a podman host binary).

It sounds like a full stream update comes into play more on a less frequent basis (perhaps daily/hourly)

n1hility commented 1 year ago

Right sorry, the other thing I meant to say is that right now, podman could/should switch to shipping quay.io/podman/fcos-podman:4.7 e.g. that is a server side derived image. Then we can drop the ExecStart=/usr/bin/rpm-ostree override remove moby-engine in Ignition for example, and switch to just doing a rebase on firstboot.

(We could continue to apply-live those changes...but there will very likely be more skew between the disk image and container image, and doing an apply-live flow on that may not always be reliable)

BTW one slight tangent I should bring up is our WSL backend is package-based Fedora and not FCOS derived. The primary reason, at the time, is that WSL distributions have a lifecycle that is independent of kernel boot, and are not even in control of their own init. For all intents and purposes, you can view a WSL distribution as a privileged container (and in fact it's implemented using linux namespaces) The entry point is either a manual on-demand script (where we control the lifecycle via machine start, bootstrap systemd etc), or, in very recent versions, systemd units (they now finally have built-in systemd support - only really usable in preview builds but it is there now). At the time I did hack up an experimental in-place ostree bootstrap, but doing it properly would have meant requiring a formal WSL bootstrap and distribution to land in fcos, so in the interest of time we opted for just straight usage of package-based Fedora initially. It sounds like this bootc related work may be close to what would be required under WSL. The major difference being no initial OS, you just need a first boot container like image that does the image fetch install to the new ostree. wdyt @cgwalters ? (Edit: to be clear, the kernel replacement aspects would also be skipped as part of install since WSL is in control of the kernel)

cgwalters commented 1 year ago

(I'd s/fedora/package-based fedora/ there - fcos is Fedora too)

n1hility commented 1 year ago

(I'd s/fedora/package-based fedora/ there - fcos is Fedora too)

good point, I always found that awkward to say, changed it!

cgwalters commented 1 year ago

Bigger picture, the bootc/ostree style flow is most valuable when one wants transactional in-place OS updates - i.e. when the OS upgrades itself. In the podman machine case, that's not actually on by default even! And it sounds like the WSL case is much like that.

So yes, I don't see a big deal in using a package-based flow there and ignoring ostree/bootc.

The neat thing of course is at least now that FCOS is a container already you can just...run it that way and ignore the ostree stuff that doesn't run when it's executed as a container. IOW, still getting the benefits of the larger CI/integration story around it at least.

cevich commented 1 year ago

So if I'm following correctly. We'd have a periodic job building FCOS images from 'main'.

For PRs (potentially doing breaking podman-machine changes), we'd do basically the same as we do today for other CI testing. Take the "latest" main-build of FCOS image, then somehow/simply graft the freshly built podman-remote binary (named 'podman') into it. Adjusting tests to use the "correct" binaries/image as needed.

Have I got that right?

Also, I've been assuming but should confirm: Is this bootc FCOS image we're talking about different from the one Lokesh setup to build in a GHA workflow (on main-push)? IIRC that one is for podman-desktop testing, but maybe that's the same use-case here?

cgwalters commented 1 year ago

then somehow/simply graft the freshly built podman-remote binary (named 'podman') into it.

I was thinking we'd install an updated RPM in the container to avoid confusion (stale rpmdb).

Also, I've been assuming but should confirm: Is this bootc FCOS image we're talking about different from the one Lokesh setup to build in a GHA workflow (on main-push)? IIRC that one is for podman-desktop testing, but maybe that's the same use-case here?

Nope it's that exact use case. So I think this about either extending that Github Action, or migrating it to the same code that's pushing other podman images.

cevich commented 1 year ago

I was thinking we'd install an updated RPM in the container to avoid confusion (stale rpmdb).

We have some packit test-builds of RPMs but AFAIK, synchronizing that activity with Cirrus-CI could be difficult. I s'pose it's more complex than simply copying a binary, there's also some default configs and other dependencies could change in a PR.

Nope it's that exact use case.

Ahh okay, so that was done in GHA specifically because it needs to be 1:1 synchronized with what's happening on main so podman-desktop know immediately when we break something.

Maybe @lsm5 has an idea here: Is there a wait-for-packit tool you know of, or some similar way we could go into a wait-with-timeout loop until the packit build finishes?

I guess it could be something simple, like repeatedly try to curl it until it works or times out :thinking:

cevich commented 1 year ago

Also note: It's entirely possible something could get merged, and a new main-based FCOS image isn't built in time for an otherwise breaking CI run in a PR. This repo. DOES NOT require PR-rebasing before merge. So there's definitely a (maybe small?) risk a breaking change merging w/o being noticed: "/lgtm CI appears green".

lsm5 commented 1 year ago

Maybe @lsm5 has an idea here: Is there a wait-for-packit tool you know of, or some similar way we could go into a wait-with-timeout loop until the packit build finishes?

I guess it could be something simple, like repeatedly try to curl it until it works or times out 🤔

@cevich wait-for-copr has --max-tries and --interval to change the defaults. Not sure if I'm reading you right, but until the packit build finishes could sometimes mean many hours depending on builder availability, jobs being stuck for whatever reason. So, we shouldn't wait on them forever.

Nope it's that exact use case. So I think this about either extending that Github Action, or migrating it to the same code that's pushing other podman images.

@cevich remind me please, will you be switching our non-fcos podman image builds from Cirrus to GHA? If that's the case I think this part gets taken care of. /cc @cgwalters

lsm5 commented 1 year ago

don't know why my comments are getting posted twice. I only hit the button once. Second time it's happened today. Sorry about the repeat pings if any.

lsm5 commented 1 year ago

Also note: It's entirely possible something could get merged, and a new main-based FCOS image isn't built in time for an otherwise breaking CI run in a PR. This repo. DOES NOT require PR-rebasing before merge. So there's definitely a (maybe small?) risk a breaking change merging w/o being noticed: "/lgtm CI appears green".

we don't do merge queues yet, do we? Maybe we should?

cevich commented 1 year ago

until the packit build finishes could sometimes mean many hours depending on builder availability

Darn, that doesn't sound like something we should risk subjecting PR authors to.

@cevich remind me please, will you be switching our non-fcos podman image builds from Cirrus to GHA

I was thinking about it, and looked into it briefly. Then I realized it would be faster to get the old script working. Also as you've seen, GHA is simply a PITA to work with, esp. on branch-only jobs w/ lots of secrets.

we don't do merge queues yet, do we? Maybe we should?

IIRC the openshift-ci bot does this to a limited (short queue depth) extent.

So I think testing-wise, the proper thing is probably to move the per-PR FCOS build under Cirrus-CI, so that we can easily feed that image into podman-machine tests with matching bits. I think that's the simplest PR-level solution that will also avoid most surprises at the branch-level. Lokesh, would it be easy-ish to add a Cirrus-CI build task to produce an RPM (x86_64 only for now, possibly other arches later) using the same spec/scripts consumed by packit?

lsm5 commented 1 year ago

Lokesh, would it be easy-ish to add a Cirrus-CI build task to produce an RPM (x86_64 only for now, possibly other arches later) using the same spec/scripts consumed by packit?

make rpm should do it.

lsm5 commented 1 year ago

Lokesh, would it be easy-ish to add a Cirrus-CI build task to produce an RPM (x86_64 only for now, possibly other arches later) using the same spec/scripts consumed by packit?

make rpm should do it.

But the build NVR currently is not the same format as what we see from packit. Packit builds use changes from .packit.sh which uses packit's own envvars.

cevich commented 1 year ago

Thanks Lokesh. I don't think the NVR should make any difference at all, it's just for CI-use and should never leave that context.

cgwalters commented 1 year ago

To be clear, I think the medium/long term for podman machine should look like:

quay.io/podman/podman-machine-os:$version
Use e.g. osbuild to accept that container image as input and generate a disk image (e.g. qcow2)
The podman machine download offering should include a bundle of the native (e.g. MacOS) binary and the disk image
The binary can be updated independently without redownloading the disk image, allowing for some skew
The binary warns in the skew case, and offers to do in-place update (more efficient) of the disk image the exact tagged container corresponding to the client binary

cgwalters commented 1 year ago

There is an entirely different flow where we try to decouple podman and the base OS by default; treating podman as a floating "extension" that can be applied dynamically. It basically works this way today with e.g. Fedora Cloud precisely because podman isn't installed by default. We'd split into 3 things:

Native host podman binary (the "root")
Base OS disk image and container image that don't have versioned podman binaries
Tooling which actually just e.g. synchronizes RPMs on top of that base image dynamically, and e.g. decouples the base OS image updates from the layers

However as we know there's a lot of hooks/dependencies podman has into the base OS (e.g. selinux, kernel bug/feature exposure) and in practice I think there'd need to be something in podman-machine which manages tested pairs.

baude commented 1 year ago

Use e.g. osbuild to accept that container image as input and generate a disk image (e.g. qcow2)

i did some light reading. it is not obvious to me that osbuild would support vhdx and other image needs? if not, any idea on possible interest here?

cgwalters commented 1 year ago

The plan is to make osbuild effectively be "disk image builder" - if it doesn't support something today we'll make it do so. We will drain all disk image building logic out of coreos-assembler to use this.

github-actions[bot] commented 11 months ago

A friendly reminder that this issue had no activity for 30 days.

containers / podman

podman machine alignment with bootc install #19899

Feature request description