coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

container-native CoreOS release engineering #828

Open cgwalters opened 3 years ago

cgwalters commented 3 years ago

This builds on https://github.com/coreos/fedora-coreos-tracker/issues/812

Basically the idea here is we ship two container images:

Where fedora-coreos-images is a new container that would use https://github.com/cgwalters/coreos-diskimage-rehydrator

For the images, we would continue releasing stream metadata. However, the pipeline would be changed to invoke the coreos-diskimage-rehydrator build dehydrate after stream metadata was published, and push that as an updated fedora-coreos-images container.
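As a rough illustration of the "after stream metadata was published" step, the sketch below walks a stream metadata document and lists every disk artifact it references, which is the input a dehydrate step would need. The sample document is hypothetical and heavily trimmed (the `example.invalid` URLs, the release string, and the checksums are placeholders), and `list_artifacts` is an illustrative helper, not part of any CoreOS tool.

```python
import json

# Hypothetical, heavily trimmed stream metadata. Real documents carry many
# more platforms, formats, and fields; URLs and checksums are placeholders.
SAMPLE_STREAM = json.loads("""
{
  "stream": "stable",
  "architectures": {
    "x86_64": {
      "artifacts": {
        "qemu": {
          "release": "0.0.0-example",
          "formats": {
            "qcow2.xz": {
              "disk": {
                "location": "https://example.invalid/fcos-qemu.x86_64.qcow2.xz",
                "sha256": "aaaa"
              }
            }
          }
        },
        "metal": {
          "release": "0.0.0-example",
          "formats": {
            "iso": {
              "disk": {
                "location": "https://example.invalid/fcos-live.x86_64.iso",
                "sha256": "bbbb"
              }
            }
          }
        }
      }
    }
  }
}
""")


def list_artifacts(stream):
    """Return sorted (arch, platform, format, kind, location, sha256) tuples."""
    found = []
    for arch, arch_data in stream.get("architectures", {}).items():
        for platform, pdata in arch_data.get("artifacts", {}).items():
            for fmt, kinds in pdata.get("formats", {}).items():
                for kind, ref in kinds.items():
                    found.append(
                        (arch, platform, fmt, kind, ref["location"], ref["sha256"])
                    )
    return sorted(found)


if __name__ == "__main__":
    for row in list_artifacts(SAMPLE_STREAM):
        print(*row)
```

A mirroring or dehydrate job could iterate over these tuples to download and verify each image before packing them into the container.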

An example benefit to FCOS users is making it ergonomic for users to mirror the OS along with the rest of their container images. Anyone who wants to deploy on e.g. disconnected OpenStack can just pull container images.

Also, by encapsulating our disk images in a container we can start making use of e.g. container image signing which would address https://github.com/coreos/fedora-coreos-tracker/issues/774 for example.

More generally the benefit to us is that we consistently live and breathe container images. Our CI tooling would become more oriented around that, etc.

cgwalters commented 3 years ago

To elaborate a bit, there's just one issue left on the 0.1 milestone for the rehydrator: https://github.com/cgwalters/coreos-diskimage-rehydrator/issues/3

After that, I think we could consider basically polishing it and shipping it.

I'd like to work on https://github.com/cgwalters/coreos-diskimage-rehydrator/issues/4 which would help shrink the image down by probably 1GiB but that's a lot more involved.

bgilbert commented 3 years ago

From a user perspective, this wouldn't affect existing functionality, right? We'd still ship the existing artifacts and stream metadata; this would just be an alternative for those who want to obtain multiple artifacts efficiently.

darkmuggle commented 3 years ago

A couple of things that I really like about this plan:

So :100: let's do this

cgwalters commented 3 years ago

Currently, this proposal does not call for changes in e.g. coreos-assembler. We would continue to build e.g. the qemu and iso and AWS ami etc. images in the same way we do today. We would continue to run kola to test those images in the same way we do.

The dehydrator is glued on at the end of all that: using the stream metadata, it deduplicates the images. This is done in a mostly naive way, without e.g. mounting them or knowing anything about ostree/ignition or even much about coreos-assembler at all.

That said, there are definitely possibilities to do things along the lines that you're talking about - better splitting up the ostree build from "generate all the disk images" would indeed be very valuable I think. But it also seems orthogonal to this proposal as is, right?
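To make "naive deduplication without looking inside the images" concrete, here is a minimal sketch using fixed-size, content-addressed chunks. This is a swapped-in technique for illustration only; it is an assumption, not a description of how coreos-diskimage-rehydrator actually deduplicates. The names `dehydrate`, `rehydrate`, and `CHUNK_SIZE` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # Unrealistically small so the example is easy to follow.


def dehydrate(images, chunk_size=CHUNK_SIZE):
    """Split each image into fixed-size chunks and store each unique chunk once.

    images: dict mapping image name -> bytes.
    Returns (store, recipes): store maps chunk digest -> chunk bytes,
    recipes maps image name -> ordered list of chunk digests.
    """
    store = {}
    recipes = {}
    for name, data in images.items():
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # Keep only the first copy.
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes


def rehydrate(store, recipe):
    """Rebuild one image by concatenating its chunks in recipe order."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    images = {"qemu": b"AAAABBBBCCCC", "metal": b"AAAABBBBDDDD"}
    store, recipes = dehydrate(images)
    stored = sum(len(chunk) for chunk in store.values())
    original = sum(len(data) for data in images.values())
    print(f"stored {stored} of {original} bytes")
    assert rehydrate(store, recipes["qemu"]) == images["qemu"]
```

Because the two sample images share their first two chunks, only four unique chunks are stored instead of six. Real disk images rarely share data at aligned fixed offsets, so a production tool would likely need something smarter than this.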

That said...one thing we absolutely could implement is a flow that takes fedora-coreos-images:stable and does an offline update to the latest ostree commit. I think shipping that would be the key thing that would unlock https://github.com/openshift/enhancements/pull/201 without actually shipping all of coreos-assembler.

cgwalters commented 3 years ago

That also said, I think in order to make significant further improvements to the dehydrator it's likely that some changes to coreos-assembler and our build flow would be required. See https://github.com/cgwalters/coreos-diskimage-rehydrator/issues/4

cgwalters commented 3 years ago

No updates to this recently. I think it's not a bad idea, but will it prove valuable enough to maintain over time in parallel to stream metadata? Not sure.

In any case I think https://github.com/coreos/fedora-coreos-tracker/issues/812 is a lot more valuable and so I'm focusing on that first.

cgwalters commented 2 years ago

I had a chat with Alex Flom around this and https://discussion.fedoraproject.org/t/oci-based-host-provisioning-baremetal-virt/36034

One thing that became clear to me - OCI artifacts are the thing that should be replacing our custom "images stored in s3 with metadata" and stream metadata longer term.

It's possible today for an OCI artifact to reference other artifacts, as well as container images. So it seems clear to me that we could have, e.g., an OCI artifact quay.io/fedora/coreos-release:stable which points to the ISO, the qcow2, and the stream metadata JSON (for things like AMIs), and which also points to quay.io/fedora/coreos:stable, i.e. our runtime container image (not an OCI artifact).
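One way such a release object could be laid out, sketched under assumptions: an OCI artifact manifest carries the ISO, qcow2, and stream metadata as blobs, and an OCI image index ties that artifact manifest to the runtime container image's manifest. The `application/vnd.example.*` media types, the file titles, and the payload bytes are all hypothetical placeholders; only the `application/vnd.oci.*` media types and field names come from the OCI image specification.

```python
import hashlib
import json

OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
OCI_INDEX = "application/vnd.oci.image.index.v1+json"
OCI_EMPTY = "application/vnd.oci.empty.v1+json"
# Hypothetical artifact type; nothing in the thread defines one.
RELEASE_TYPE = "application/vnd.example.coreos.release.v1"


def descriptor(media_type, payload, **extra):
    """Build an OCI content descriptor for the given bytes."""
    desc = {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(payload).hexdigest(),
        "size": len(payload),
    }
    desc.update(extra)
    return desc


def artifact_manifest(blobs):
    """blobs: list of (media_type, payload, title) for each release file."""
    return {
        "schemaVersion": 2,
        "mediaType": OCI_MANIFEST,
        "artifactType": RELEASE_TYPE,
        # Artifacts without a real config use the empty JSON object "{}".
        "config": descriptor(OCI_EMPTY, b"{}"),
        "layers": [
            descriptor(mt, data,
                       annotations={"org.opencontainers.image.title": title})
            for mt, data, title in blobs
        ],
    }


def release_index(artifact, runtime_manifest_bytes):
    """Index pointing at both the artifact and the runtime container image."""
    artifact_bytes = json.dumps(artifact, sort_keys=True).encode()
    return {
        "schemaVersion": 2,
        "mediaType": OCI_INDEX,
        "manifests": [
            descriptor(OCI_MANIFEST, artifact_bytes,
                       artifactType=artifact["artifactType"]),
            descriptor(OCI_MANIFEST, runtime_manifest_bytes),
        ],
    }


if __name__ == "__main__":
    # Placeholder payloads standing in for the real files.
    artifact = artifact_manifest([
        ("application/vnd.example.coreos.iso", b"iso-bytes", "live.iso"),
        ("application/vnd.example.coreos.qcow2", b"qcow2-bytes", "qemu.qcow2"),
        ("application/json", b'{"stream": "stable"}', "stable.json"),
    ])
    index = release_index(artifact, b'{"placeholder": "runtime manifest"}')
    print(json.dumps(index, indent=2))
```

Whether a single index or a separate referrer relationship is the better fit is a design question this sketch does not settle.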