containers / bootc

Boot and upgrade via container images
https://containers.github.io/bootc/
Apache License 2.0

remote config via configmap and secrets #22

Open cgwalters opened 1 year ago

cgwalters commented 1 year ago

This overlaps with https://github.com/containers/bootc/issues/7 some.

Here, the basic idea is something like:

bootc config add [--root=/etc] https://examplecorp.com/config.yml

(Or, with support for OCI artifacts: bootc config add [--root=/etc] registry:quay.io/examplecorp/config-server-base:latest)

Where config.yml is a standard Kubernetes ConfigMap. By default, we "mount" the keys to /etc. Then, bootc upgrade looks for updates to all provided configmaps - if any change, it triggers the same upgrade logic as the base image.
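
For illustration, a config.yml along these lines might look like the following; the file name and contents are hypothetical, not part of any spec:

```yaml
# Hypothetical ConfigMap: by default each key would be "mounted" as a
# file under /etc, e.g. /etc/chrony.conf here.
apiVersion: v1
kind: ConfigMap
metadata:
  name: examplecorp-base-config
data:
  chrony.conf: |
    server time.examplecorp.com iburst
    makestep 1.0 3
```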

We also fetch and handle secret objects in the same way. It'd be cool though to support something like handling encrypted secrets (and configmaps) which need to be decrypted via a key (which could be in a TPM or so).

We also need to think carefully about file permissions; mode 0644 for all configmap files and 0600 for secrets may make sense. In addition we could support special annotations to override these.

(This should also work to be invoked immediately after bootc install to have it ready on the first boot, i.e. we also have a --root argument or so)

cgwalters commented 1 year ago

To flesh this out a bit more, bootc upgrade would check for updates to these things too.

Also, another variant of this is that we should support fetching all this stuff from the cloud user data too.

cgwalters commented 1 year ago

So this design would work in an obvious way outside of Kubernetes - we're using a standard API type, but we're not depending on an API server or kubelet, etc.

But, in a Kubernetes context, it would make total sense to fetch this data from the API server instead of just running http. (And potentially use proper watches to react to events instead of polling). I'm thinking we probably want to make the mechanism to fetch this data pluggable; ultimately it might be that there's /var/lib/bootc/config.d and a privileged container image could e.g. use kubelet's credentials to fetch from the API server instead, and just hand things down to bootc.

This intersects with https://github.com/openshift/machine-config-operator/issues/3327 too.

cgwalters commented 1 year ago


A hugely interesting topic here of course is whether we should have a declarative way to set this stuff up. Today for example we have an imperative interface in bootc switch to specify the root image.

But...following the kubelet "static pod" model, and building on the pod analogy above, perhaps we have /etc/bootc/host.yaml which is a subset of the pod spec, and allows declaratively writing the attached configmaps.
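
A hypothetical /etc/bootc/host.yaml in that spirit (a sketch only; the apiVersion and field names are invented here, not a committed spec):

```yaml
# Hypothetical declarative host spec, loosely modeled on a pod spec.
# "image" covers what bootc switch does imperatively today;
# "configMaps" declares the attached configuration sources.
apiVersion: bootc.example/v1alpha1
kind: Host
spec:
  image: quay.io/examplecorp/baseos:latest
  configMaps:
    - source: https://examplecorp.com/config.yml
    - source: registry:quay.io/examplecorp/config-server-base:latest
```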

(This topic also relates to https://github.com/coreos/rpm-ostree/issues/2326 )

cgwalters commented 1 year ago

It may also be interesting here to try to support something like an annotation which says the configmap changes should be live-applied by default - in the general case we'd probably need some sort of "post-apply hook" to ask affected services to reload. Or, such services could use inotify of course (at the risk of seeing partial states).

cgwalters commented 1 year ago

To implement this I'm thinking:

Now...what many people will want is a read-only /etc; if we have that, it becomes logically easier to have each configmap be a mounted tmpfs or whatever, with which we can more easily implement individual live-apply semantics.

cgwalters commented 1 year ago

Trying to play around with this, the ergonomics are definitely somewhat annoying because configmap keys can't contain /. If one wanted to e.g. drop a CA certificate into /etc/pki/ca-trust/source/anchors and also a local registry mirror config into /etc/containers/registries.conf.d, it'd require separate configmaps with separate mountpoints. Maybe we just accept that for now.

Edit: Or, we could accept a List object of configmaps.
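
Sketched as a List, the CA-certificate and registry-mirror cases from above might look like this; the mountpoint annotation name is invented for illustration, since no key naming is settled:

```yaml
apiVersion: v1
kind: List
items:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ca-trust
      annotations:
        # Hypothetical annotation naming the target directory
        bootc.example/mountDir: /etc/pki/ca-trust/source/anchors
    data:
      examplecorp-ca.crt: |
        -----BEGIN CERTIFICATE-----
        ...placeholder...
        -----END CERTIFICATE-----
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: registry-mirror
      annotations:
        bootc.example/mountDir: /etc/containers/registries.conf.d
    data:
      examplecorp-mirror.conf: |
        [[registry]]
        location = "quay.io/examplecorp"
        [[registry.mirror]]
        location = "mirror.examplecorp.com/examplecorp"
```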

cgwalters commented 10 months ago

Also, we should add support for kernel arguments via ConfigMap too...it'd be a really good match.
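
For example (the reserved key name here is purely hypothetical), kernel arguments could ride along as:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: examplecorp-kargs
data:
  # Hypothetical convention: a well-known key carrying kernel arguments
  # instead of a file mounted under /etc
  kargs: "console=ttyS0,115200n8 mitigations=auto"
```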

cgwalters commented 8 months ago

Dug into this a bit more, and actually we have all the bits we need for client-side support for configmaps-in-registry today, which is awesome.

The main gap is documenting support for uploading them (xref https://github.com/containers/buildah/issues/5091 )

fabiendupont commented 8 months ago

@cgwalters, apart from being closer to existing Kubernetes objects, what do ConfigMap/Secret bring compared to systemd-confext?

Systemd expects to find the config extensions in a specific folder, e.g. /run/confexts/. It is possible to ship config OCI images in the OS image and mount them under /run/confexts/ with ComposeFS via systemd.mount units before the systemd-confext.service unit.

cgwalters commented 8 months ago

This is a complex topic. First, I think it makes sense to enable confext use on top of bootc.

However, the other case here is that I think many will want support for "lifecycle binding" configs to the OS, i.e. bootc update intends to update both as a transactional unit (and integrated with rollbacks by default); confext today is independent of OS updates by default (which is good; we also want something like that).

And yes, there is the Kubernetes orientation versus the DDI stuff.

cgwalters commented 8 months ago

Tangential to this I am increasingly thinking we may want first-class support for pushing something like a "bootc bundle" as an OCI artifact to a registry. This bundle would reference a combination of a base image, bound infra/app images as well as attached configmaps.

I think we'd probably have bootc pack build --base quay.io/examplecorp/baseos:latest --config quay.io/examplecorp/someconfig:latest --config quay.io/examplecorp/otherconfig:latest --to quay.io/examplecorp/baseos-bundle:latest (here we'd traverse into the container filesystem to find any embedded app image references).

When passed non-digested pull specs, we'd resolve them to their digested spec at that time by default.

bootc install would learn how to unpack this bundle into its component parts. bootc upgrade would default to updating everything, but part of the point is we do want to support independent updates of configs too.

jmarrero commented 8 months ago

Would this look something like this for the user?

FROM scratch
COPY krb5.conf /etc/krb5.conf

and they pass --config quay.io/examplecorp/otherconfig:latest, and we just scrape whatever they put there. I imagine scraping from a full filesystem with a bunch of default config files could be painful.

cgwalters commented 8 months ago

In this proposal we'd use OCI artifacts to host the configmaps directly instead of (ab)using container images. See https://github.com/containers/buildah/issues/5091

(OCI artifacts are already used today for e.g. WASM and things like that) - basically config data is architecture-independent (like WASM) and so involving container images would just be weird in that sense.

Also more logically we should try to avoid having executable code in there at all.

cgwalters commented 8 months ago

Random thought, we could support something like /usr/lib/boot/translators.d/vnd.coreos.ignition+json which would be a drop-in binary, and when we find a config from a registry in a mime type that matches one of those translator binaries, we'd pipe it through and get a configmap out. The user story here would then support re-using Ignition configs more directly without translating them.

We could extend this even to supporting parts of Kickstart that way, which would be pretty neat.

cgwalters commented 8 months ago

@jmarrero the FROM scratch is more of a "fallback" path if we don't have OCI artifacts support. One big reason for this is that configuration by nature is almost always architecture independent, and having to do multi-arch builds for it would be annoying.

But, OTOH the FROM scratch approach does natively work as part of a multi-stage build to scrape in content.

So perhaps we should support both.

jmarrero commented 8 months ago

Interesting, I did not think about the multi-stage build affecting the FROM scratch for configuration... I wonder if we could ignore the architecture and use virtualization to grab anything that is text... but that would probably be a huge pain. I agree it would be nice to be able to use multi-stage builds when someone wants to plug the config directly into the build.

cdoern commented 8 months ago

@cgwalters trying to wrap my head around this from a potential OCP use-case perspective. Would this enable a workflow in which users can (not saying we necessarily should) modify on-disk state by simply creating one of these configmaps in the MCO namespace or something? bootc, executed by the MCO, can then grab and live apply these configs to an OS container? This could enable a lot of user behavior that the MCO doesn't necessarily allow at the moment but could really change, in a good way, the user experience of OpenShift.

Currently we almost forbid users from debugging into a node and modifying OS files directly, and creating a new MachineConfig for this process isn't painful but isn't necessarily easy or intuitive. So if a user could just create a k8s ConfigMap that the MCO handles and regularly watches for changes... I feel like this could be cool?

I am writing up a doc about what I have been thinking from the ocp side of things, sorry if this is incorrect in any way!

cgwalters commented 8 months ago

The way I'm thinking about this from a Kubernetes/OpenShift perspective is that in most cases, we would want to operate in terms of pools of nodes (as we do today) - and we'd fetch images and configuration that apply to all nodes.

However, there is still the case of per-machine state - think static IP addresses. Today, that ends up as "unmanaged state" that may be injected as kernel arguments or changes to the "pointer ignition config" (xref https://github.com/openshift/machine-config-operator/issues/1720 )

With bootc understanding configmaps, we could have a world in which these configmaps could be loaded into a registry (just pull secret required, perhaps encrypted) and at machine provisioning time (whether that's via bare metal PXE/Anaconda or in the cloud via instance metadata today) we bootc install --attach-configmap quay.io/examplecorp/host01-state:latest - the static IP addressing (or hostname, etc.) metadata would be fetched and applied before the boot into the target image.

A big change here though is that this state is now much more "visible" on the system instead of just being random files without a "day 2" management story.
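
A per-machine configmap for that static-IP scenario might look like this (a sketch; the NetworkManager keyfile placement and the exact mounting mechanism are illustrative assumptions):

```yaml
# Hypothetical content of quay.io/examplecorp/host01-state:latest,
# attached via bootc install --attach-configmap
apiVersion: v1
kind: ConfigMap
metadata:
  name: host01-state
data:
  # Intended to land as a NetworkManager keyfile, e.g. under
  # /etc/NetworkManager/system-connections
  ens1.nmconnection: |
    [connection]
    id=ens1
    type=ethernet
    interface-name=ens1

    [ipv4]
    method=manual
    addresses=192.0.2.10/24
    gateway=192.0.2.1
```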

From an OCP perspective what I think would happen here is that the MCO could do one of two things:

I can see cases for both.

cgwalters commented 8 months ago

bootc, executed by the MCO, can then grab and live apply these configs to an OS container?

Sorry I didn't really answer your original question - but yes exactly this for the default case indeed. Imagine that instead of having a single giant "rendered machineconfig" we actually attach configs to the node and pass them to bootc individually to track/diff and possibly live apply.

cgwalters commented 8 months ago

The other thing that seems obvious to me to support is per-node state, i.e. associating configmaps with a specific node. This couldn't be used for pre-kubelet state (static IPs, hostname), but I can imagine it being useful for things like dynamic tuning.

That said though, there's definitely a coordination issue for any config changes that imply disruption (hence require drains etc.); so ultimately there'd need to be something MCO-like that is managing rollout I would say.

fabiendupont commented 8 months ago

For clarity, when you say a ConfigMap, what would it look like? Currently, it's a single file or list of files with their content base64-encoded. So, I see it as just a way to store the data, but it could just as well be the files in an OCI image that is mounted as an overlay for /. The result would be the same. And outside of a Kubernetes API, the notion of ConfigMap doesn't exist, so I don't see how it applies practically to a standalone machine running Fedora.

I like the idea of the bootc pack command to assemble multiple layers into a final image. But it sounds almost like a Buildah feature, possibly https://github.com/containers/buildah/issues/5091.

cgwalters commented 8 months ago

So, I see it as just a way to store the data, but it could as well be the files in an OCI image that is mounted as an overlay for /.

Yes, though a key difference here is: if you're shipping binaries, you need to care about multi-arch in the general case. For uploading generic configuration, you don't. See https://github.com/containers/bootc/issues/22#issuecomment-1781724776

And outside of a Kubernetes API, the notion of ConfigMap doesn't exist, so I don't see how it applies practically to a standalone machine running Fedora.

https://docs.podman.io/en/v4.2/markdown/podman-play-kube.1.html exists and is usable without an API server. See also https://www.redhat.com/en/blog/running-containers-cars which argues for reusing container-ecosystem-adjacent tooling without an api server.

cgwalters commented 8 months ago

https://docs.podman.io/en/v4.2/markdown/podman-play-kube.1.html exists and is usable without an API server.

Also these things are connected, because it'd make sense to support attaching bootc configmaps which contain podman-play-kube definitions!
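
Concretely, an attached configmap could carry a pod definition for podman to run (a sketch; the file name and the assumption that podman kube play or a quadlet .kube unit would consume it are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: edge-workload
data:
  # A pod spec that podman kube play (or a quadlet .kube unit) could consume
  app-pod.yaml: |
    apiVersion: v1
    kind: Pod
    metadata:
      name: app
    spec:
      containers:
        - name: app
          image: quay.io/examplecorp/app:latest
```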

jlebon commented 8 months ago

@jmarrero the FROM scratch is more of a "fallback" path if we don't have OCI artifacts support. One big reason for this is that configuration by nature is almost always architecture independent, and having to do multi-arch builds for it would be annoying.

But, OTOH the FROM scratch approach does natively work as part of a multi-stage build to scrape in content.

So perhaps we should support both.

Yeah, I definitely see the argument for trying to keep this clean and using OCI artifacts, which seem like a better fit. That said, UX-wise having configs just be a regular container image too is very tempting. Building them is trivial in any container image build infra and it's more intuitive to write since the files live as files as they would on the target system rather than compiled into one big YAML file. So :+1: to supporting both.

Re. multi-arch, since bootc knows that it's just config files (given that you passed it to e.g. bootc config add), it doesn't have to care about trying to match arches. So maybe it can do something like "if it's manifest-listed, then try to match arch, otherwise just take the default".

cgwalters commented 8 months ago

and it's more intuitive to write since the files live as files as they would on the target system rather than compiled into one big YAML file.

Yes though we could easily add bootc config build|push

alexlarsson commented 8 months ago

https://docs.podman.io/en/v4.2/markdown/podman-play-kube.1.html exists and is usable without an API server.

Just for the record, quadlet supports .kube files for running play-kube things easily too.

fabiendupont commented 8 months ago

And outside of a Kubernetes API, the notion of ConfigMap doesn't exist, so I don't see how it applies practically to a standalone machine running Fedora.

https://docs.podman.io/en/v4.2/markdown/podman-play-kube.1.html exists and is usable without an API server. See also https://www.redhat.com/en/blog/running-containers-cars which argues for reusing container-ecosystem-adjacent tooling without an api server.

Do you plan to support other APIs? I mean, Podman kube play supports Pod, Deployment, PersistentVolumeClaim, ConfigMap, Secret and DaemonSet. Which ones would you add to bootc?

Also, the Kubernetes docs say that ConfigMaps are limited to 1 MiB:

A ConfigMap is not designed to hold large chunks of data. The data stored in a ConfigMap cannot exceed 1 MiB. If you need to store settings that are larger than this limit, you may want to consider mounting a volume or use a separate database or file service.

cgwalters commented 7 months ago

Podman kube play supports Pod, Deployment, PersistentVolumeClaim, ConfigMap, Secret and DaemonSet. Which ones would you add to bootc?

I think bootc's role is just scoped to host management, so Pod/Deployment/DaemonSet are clearly out of scope, right? Those things should be handled by podman. Whether we do mapping around PersistentVolume for the host seems more open, but my instincts say that's just confusing and we should point administrators to systemd .mount units. ConfigMap and Secret to me are the most obvious things to support.

cheesesashimi commented 6 months ago

Files have a lot of metadata such as file mode, paths, owners, etc. Rather than encode all of this into an arbitrary JSON blob and stuff it inside a ConfigMap, why not use the metadata part of the ConfigMap to store these values as annotations?

Here's a rough idea of some annotations that could be used for this purpose:

| Annotation Name | Description |
| --- | --- |
| `bootc.coreos.io/baseDir` | The base directory where each file in the ConfigMap will be created, e.g. `/etc`. This is required because the ConfigMap spec does not allow slashes in the name portion of the files it creates. |
| `bootc.coreos.io/baseDirMode` | The octal file mode (0755, 0644, etc.) that the deepest directory in the path should have. Given `/etc/a/deeply/nested/base`, this would affect the mode of the `./base` portion of the path. |
| `bootc.coreos.io/fileMode` | The octal file mode that each file in the ConfigMap should be created with (e.g. 0755, 0644). Additional mode bits such as sticky and executable should be supported. If multiple files within the same directory need different modes, more than one ConfigMap will be needed. |
| `bootc.coreos.io/userOwner` | The username or UID of the file's owner. |
| `bootc.coreos.io/groupOwner` | The group name or GID of the file's owner. |

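
Putting those annotations together in one example (the annotation names above are a proposal, not an implemented API):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sshd-config
  annotations:
    bootc.coreos.io/baseDir: /etc/ssh/sshd_config.d
    bootc.coreos.io/fileMode: "0600"
    bootc.coreos.io/userOwner: root
    bootc.coreos.io/groupOwner: root
data:
  40-examplecorp.conf: |
    PasswordAuthentication no
```
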
cgwalters commented 6 months ago

(From an organizational perspective note bootc is part of the containers/ GH org, not coreos/)

bootc.coreos.io/baseDir

Right, this one already exists in the WIP code...I think maybe what we should do is make a shared hackmd doc and refine it into a spec, then update the first comment here or so?

As far as the other metadata...maybe. So far, there hasn't been a use case for it in Kubernetes, right? I think specifically having executables in configmaps is probably something we should think of as an anti-pattern - we have container images for that. This reduces the need significantly for fileMode. The use cases for user/group owners also seem like they need very specific motivation.

arewm commented 3 weeks ago

To flesh this out a bit more, bootc upgrade would check for updates to these things too.

Would it make sense to enable only the configmap to be updated without updating the bootc image as well, or would these actions/lifecycles be bound together somehow?

Would you be able to use an image manifest to store the configmaps, where the config descriptor could be used to describe how all of the layers should be applied to the filesystem?

These could then be customized per architecture and distributed as an image index, so that a configuration file could be applied to bootc installations regardless of the architecture.