flatcar / Flatcar

Flatcar project repository for issue tracking, project documentation, etc.
https://www.flatcar.org/
Apache License 2.0
Flatcar Podman extension #112

Open dadux opened 4 years ago

dadux commented 4 years ago

Build a systemd-sysext image as part of the image build and publish it as a release artifact and as a signed update payload. /etc/subuid and /etc/subgid should be provided by the Flatcar base image.

For usage after boot, add a oneshot service (with RemainAfterExit=yes set, and a drop-in for multi-user.target that Upholds= this helper service) which prepares the system. The helper should copy default content to /etc/containers/policy.json if it doesn't exist, maybe issue udevadm control --reload-rules and udevadm trigger, and run (another) systemctl daemon-reload to trigger the quadlet systemd generator. It should then reevaluate the common targets to start the enabled quadlet (podman generator) units: systemctl restart --no-block sockets.target timers.target multi-user.target. Once we set up the extensions from the initrd, we can say that only initrd activation is supported and then skip the udevadm and daemon-reload workarounds.
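A minimal sketch of how the helper unit and the multi-user.target drop-in could be staged inside a sysext tree. The unit name, the helper path, and the drop-in file name are hypothetical and chosen only for illustration; the script writes into a staging directory rather than the live system.

```shell
#!/bin/sh
# Sketch: stage a oneshot helper unit plus a multi-user.target drop-in.
# All names below are assumptions, not taken from Flatcar.
set -eu

SYSEXT_ROOT="${SYSEXT_ROOT:-$(mktemp -d)}"      # staging dir, not the live system
UNIT_DIR="$SYSEXT_ROOT/usr/lib/systemd/system"
HELPER_UNIT="podman-sysext-setup.service"       # hypothetical unit name
HELPER_BIN="/usr/lib/podman-sysext/setup.sh"    # hypothetical helper path

mkdir -p "$UNIT_DIR/multi-user.target.d"

# Oneshot unit that stays "active" after the helper has exited.
cat > "$UNIT_DIR/$HELPER_UNIT" <<EOF
[Unit]
Description=Prepare the system for the Podman sysext

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=$HELPER_BIN
EOF

# Drop-in so that multi-user.target keeps the helper unit started.
cat > "$UNIT_DIR/multi-user.target.d/10-podman-sysext.conf" <<EOF
[Unit]
Upholds=$HELPER_UNIT
EOF

echo "staged units under $SYSEXT_ROOT"
```

Using a drop-in with Upholds= means the helper gets started whenever multi-user.target is active, which is what lets a later sysext refresh pick it up without a reboot.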

Old content:

libpod (podman)

Libpod provides a library for applications looking to use the Container Pod concept, popularized by Kubernetes. Libpod also contains the Pod Manager tool (Podman). Podman manages pods, containers, container images, and container volumes.

Impact of adding this package to the Flatcar OS image

The package will increase the OS image by: ?

The package will potentially increase Flatcar’s attack surface:

Benefits of adding this package to the Flatcar OS image

A CRI-compatible container runtime makes a lot of sense when you're running something to orchestrate your containers, namely Kubernetes.

Having a tool to run containers outside a k8s environment would help adoption of Flatcar for all other use cases. podman integrates easily with systemd units, and also enables users to run rootless containers, increasing security.

ahrkrak commented 4 years ago

Presumably "It should decrease attack surface by removing the container runtime's daemon" is only true if docker is removed at the same time? I suspect that would be controversial with a lot of users, which means we'd probably have to enable both.

dadux commented 4 years ago

You're right, that statement wasn't clear. The intent was never to remove docker or any container runtime. What I meant was: if you're running containers with podman, you don't have to run a daemon as root. Users can disable the daemon.

4ad commented 2 years ago

Any progress on this?

jepio commented 2 years ago

I recently prototyped getting podman working as a sysext, in this repo https://github.com/jepio/flatcar-podman-overlay/releases/tag/v1.0. The podman.raw file can be dropped on a flatcar 3200.0.0 system at /etc/extensions/podman.raw and after rebooting (or refreshing sysexts) one can use podman.
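For illustration, the "drop the file and refresh" step could be wrapped in a small helper like the one below. The function name and its optional target-directory argument are made up for this sketch; systemd-sysext refresh and systemd-sysext list are the existing commands for re-merging and inspecting extensions.

```shell
#!/bin/sh
set -eu

# install_sysext IMAGE [EXT_DIR]: copy a sysext image into the extensions
# directory (hypothetical helper; EXT_DIR defaults to /etc/extensions).
install_sysext() {
    image="$1"
    ext_dir="${2:-/etc/extensions}"
    mkdir -p "$ext_dir"
    cp "$image" "$ext_dir/$(basename "$image")"
}

# On a real host (as root), one would follow up with:
#   install_sysext ./podman.raw
#   systemd-sysext refresh   # merge the new image without rebooting
#   systemd-sysext list      # check that "podman" shows up
```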

pfremm commented 2 years ago

@jepio I was trying your repo for podman on 3227.2.2 and it seems to work when refreshing context, but then if I reboot, systemd seems to hang during the boot process. Update: if I mask ensure-sysext.service, reboots are fine and the systemd extension works.

goochjj commented 2 years ago

@pfremm I don't have a problem with ensure-sysext - it's not causing lockups on reboot, using the version that I've modified in my PR to exclude the /opt/ folder.

Can you provide further diagnostics? I realize not being able to run that service at boot makes that difficult. If you unmask after boot and run the service, does it fail?

I've also found that all-service OnFailure= statements can cause that unit to fail at boot. See https://github.com/flatcar-linux/Flatcar/issues/710.

In my case I could flip back to USR-B to turn off the ensure-sysext unit... so I could hit up journalctl to get diagnostics - maybe if you stage your USR-A and USR-B so one of them predates sysext (i.e. before 3185.1.0) you'd be able to get more diagnostics.

goochjj commented 2 years ago

@pfremm Try my release here https://github.com/goochjj/flatcar-podman-overlay/releases/tag/v1.0.1

goochjj commented 2 years ago

@jepio Take a look at https://github.com/jepio/flatcar-podman-overlay/pull/1 https://github.com/goochjj/flatcar-podman-overlay/releases/tag/v1.0.1

jepio commented 2 years ago

> @jepio I was trying your repo for podman on 3227.2.2 and it seems to work when refreshing context, but then if I reboot, systemd seems to hang during the boot process. Update: if I mask ensure-sysext.service, reboots are fine and the systemd extension works.

I don't know why this would cause the boot process to hang. What's the last boot message? ensure-sysext.service is what we use to try to get systemd files from the sysext to apply to the currently running boot (it doesn't work for everything). It's kind of a hack, but the alternative requires more work (activating sysexts from the initramfs).

goochjj commented 2 years ago

(response copied from merged PR @jepio )

In my repo, I included a torcx package with the podman sysext.

The problem I ran into with just the sysext was that docker still tries to call docker. (/usr/bin/docker -> /run/torcx/bin/docker). If you don't use the torcx, you could use docker too - and have a dual setup.

In my case, I want docker to call podman, so my options as I see it are:

  1. Expand the sysext to also replace the /usr/bin/docker file

The problem with this method is other things, like runc, containerd, containerd-shim, etc are symlinked to /usr/bin/docker, so replacing it with something other than a torcx redirector would cause other issues. (And I didn't want to replace ALL the binaries, that seemed dumb)

  2. Roll my own replacement docker-torcx package, that only provides a docker binary (which calls podman)... So /usr/bin/docker finds bin/docker in the torcx package which calls podman. The others, dockerd, runc, etc just say "this torcx package doesn't contain a runc binary" or whatever, which is also good.

I've since expanded the torcx package (which should, perhaps, be a separate repo entirely... it is in my CI) to borrow the /lib/systemd/network/ files that make veth interfaces and br- (and now cni-) interfaces not be managed by networkd...

It just seemed like the most elegant way to solve it. As long as the torcx shims are there in /usr/bin, solving this with torcx makes sense. If docker moves from torcx to a sysext in the future, (in a way that I can remove), then I'd solve that a different way. (put a docker wrapper in the podman sysext, or a podman-only.raw) But I'm not sure putting /usr/lib/systemd/network/ files in a sysext is the right way at this point, it might be with ensure-sysext. /shrug/

TL;DR - you only need the torcx if you want to remove docker from your system, and you want docker to call podman
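As a rough sketch of the "docker binary which calls podman" idea: the wrapper itself can be a two-line shell script. The staging directory variable is hypothetical, and this is not the actual content of the torcx package linked in this thread.

```shell
#!/bin/sh
set -eu

# Hypothetical staging directory for the wrapper; a real package would place
# it wherever the torcx (or sysext) layout expects bin/docker to live.
WRAPPER_DIR="${WRAPPER_DIR:-$(mktemp -d)}"

cat > "$WRAPPER_DIR/docker" <<'EOF'
#!/bin/sh
# Forward every docker invocation to podman, keeping the arguments intact.
exec podman "$@"
EOF
chmod +x "$WRAPPER_DIR/docker"

echo "wrote $WRAPPER_DIR/docker"
```

Using exec keeps the process tree flat, so systemd units that call docker end up supervising podman directly.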

Other question: should the podman sysext be defined with a VERSION_ID instead of SYSEXT_LEVEL=1.0? We are linking binaries against /usr... My anecdotal evidence says that binaries usually work between flatcar versions, with very few exceptions... And as far as I know the sysext systemupdate helpers/infrastructure don't exist yet.

pothos commented 2 years ago

Thanks for documenting the status quo here; we hoped to be further on the roadmap to phase out torcx but it's taking some more time.

> Other question: should the podman sysext be defined with a VERSION_ID instead of SYSEXT_LEVEL=1.0? We are linking binaries against /usr... My anecdotal evidence says that binaries usually work between flatcar versions, with very few exceptions... And as far as I know the sysext systemupdate helpers/infrastructure don't exist yet.

The best would be to get static linking working and use SYSEXT_LEVEL until we have a path forward with Podman officially in Flatcar. Currently it looks like we could eventually include it as an (optional?) sysext in the image so that it gets auto-updated together with Flatcar. Another option would be to have an official Flatcar extension that gets auto-updated through update-engine, but that's even further down the road. Actually I'm very open to including it unconditionally, because I see some issues with having it as a sysext, and podman would also help to fulfill some internal dependencies that Flatcar components currently have on Docker, but I can't speak for the team in this matter.
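To make the two options concrete: systemd-sysext reads the extension's metadata from usr/lib/extension-release.d/extension-release.NAME inside the image and compares it with the host's os-release (ID, then SYSEXT_LEVEL if set, otherwise VERSION_ID). The sketch below writes that file for either pinning style; the function name and the staging directory are assumptions for illustration.

```shell
#!/bin/sh
set -eu

# write_extension_release ROOT NAME KEY VALUE
# KEY is either SYSEXT_LEVEL (loose pinning) or VERSION_ID (exact OS version).
write_extension_release() {
    root="$1"; name="$2"; key="$3"; value="$4"
    case "$key" in
        SYSEXT_LEVEL|VERSION_ID) ;;
        *) echo "unsupported key: $key" >&2; return 1 ;;
    esac
    dir="$root/usr/lib/extension-release.d"
    mkdir -p "$dir"
    printf 'ID=flatcar\n%s=%s\n' "$key" "$value" > "$dir/extension-release.$name"
}

STAGING="${STAGING:-$(mktemp -d)}"   # hypothetical staging tree for the image
write_extension_release "$STAGING" podman SYSEXT_LEVEL 1.0
cat "$STAGING/usr/lib/extension-release.d/extension-release.podman"
```

With VERSION_ID the image would need to be rebuilt for every Flatcar release, which is why SYSEXT_LEVEL is the more practical choice as long as the binaries stay compatible.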

goochjj commented 2 years ago

I am moving forward with using Podman 4.2 (which isn't officially "stable" in Gentoo just yet, PR forthcoming) as my baseline going forward and will report back and/or fix any problems as I run into them - however, I'm not going full bore on "rootless", just "a container engine that works better with systemd and cgroupsv2 and doesn't require ugly hacks like systemd-docker".

The squashfs podman extension clocks in at 77MB (which is actually smaller than the included docker torcx file, which is 80MB), so including it in the base image is a great addition IMHO.

I don't know how to do golang static binaries off the top of my head, but flatcar glibc versions have been fairly stable w.r.t. compiled binaries... And golang binaries have been fairly universally compatible in my experience. But compiling it as part of the builds would be pretty cool.

Right now I'm dumping versioned .raw files out of my CI to an internal website, with a signature, that I have a flatcar helper download and dump into /var/lib/extensions/ and that's working for me at the moment - but I'd love having something I don't have to maintain out-of-band.
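A sketch of what such a download helper might look like. The signature scheme used above isn't described, so this example substitutes a plain SHA-512 checksum comparison; the function names and the example URL are hypothetical.

```shell
#!/bin/sh
set -eu

# verify_sha512 FILE EXPECTED_HEX: succeed only if FILE hashes to EXPECTED_HEX.
verify_sha512() {
    actual=$(sha512sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# fetch_extension URL EXPECTED_HEX DEST: download to a temp file, verify it,
# then move it into place so a bad download never lands in the extensions dir.
fetch_extension() {
    url="$1"; expected="$2"; dest="$3"
    tmp=$(mktemp)
    curl -fsSL -o "$tmp" "$url"
    if verify_sha512 "$tmp" "$expected"; then
        mkdir -p "$(dirname "$dest")"
        mv "$tmp" "$dest"
        chmod 0644 "$dest"
    else
        rm -f "$tmp"
        echo "checksum mismatch for $url" >&2
        return 1
    fi
}

# Example (hypothetical URL, run as root):
#   fetch_extension https://example.invalid/podman-4.2.raw "$EXPECTED_SHA512" \
#       /var/lib/extensions/podman.raw
```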

Even having it included in the build in a different folder (i.e. /usr/share/flatcar/extensions/) that users can symlink into /var/lib/extensions would be great. But if it becomes a "first class citizen" in flatcar, all thumbs up here from me.

goochjj commented 2 years ago

Podman v4.2 https://github.com/goochjj/flatcar-podman-overlay/releases/tag/v4.2 https://github.com/jepio/flatcar-podman-overlay/pull/2

Dockerless podman torcx https://github.com/goochjj/flatcar-podman-docker-torcx/releases/tag/v1.0

goochjj commented 2 years ago

Further FYI - the binaries above (built against 3277.1.2) work with current alpha 3346.0.0 without recompilation.

pothos commented 1 year ago

Thanks for the explorations! Because @tormath1 wants to look into using the quadlet podman systemd generator, I had a quick look at the sysext image and found that it misses some setup actions. I suggested adding a oneshot service (RemainAfterExit=yes, with a drop-in for multi-user.target that Upholds= this helper service). The helper should copy default content to /etc/containers/policy.json if it doesn't exist, same for /etc/subuid and /etc/subgid, then issue udevadm control --reload-rules, udevadm trigger, and systemctl daemon-reload to trigger the quadlet systemd generator, and then reevaluate the common targets to start the enabled quadlet units: systemctl restart --no-block sockets.target timers.target multi-user.target.
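A sketch of what the helper script itself could do, following the steps listed above. The defaults directory and the --apply switch are assumptions made for this example; the udevadm and systemctl invocations are the ones named in the comment.

```shell
#!/bin/sh
set -eu

# Copy a default file into place only if the target does not exist yet.
install_default() {
    src="$1"; dst="$2"
    if [ -e "$dst" ]; then
        return 0
    fi
    mkdir -p "$(dirname "$dst")"
    cp "$src" "$dst"
}

# Re-run udev rules and the systemd generators, then re-evaluate the common
# targets so that newly generated quadlet units get started.
reload_and_restart() {
    udevadm control --reload-rules
    udevadm trigger
    systemctl daemon-reload
    systemctl restart --no-block sockets.target timers.target multi-user.target
}

DEFAULTS_DIR="${DEFAULTS_DIR:-/usr/share/podman-sysext/defaults}"  # hypothetical
ETC_DIR="${ETC_DIR:-/etc}"

# Only touch the system when explicitly asked to (hypothetical switch).
if [ "${1:-}" = "--apply" ]; then
    install_default "$DEFAULTS_DIR/policy.json" "$ETC_DIR/containers/policy.json"
    reload_and_restart
fi
```

Keeping the copy step idempotent matters here because the helper runs on every boot and must not overwrite a policy.json that the user has customized.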

tormath1 commented 1 year ago

I gave it a try with the following Butane config, using quadlet to generate a systemd unit from a *.kube configuration file:

variant: flatcar
version: 1.0.0
storage:
  files:
    - path: /etc/containers/policy.json
      contents:
        source: https://raw.githubusercontent.com/containers/podman/main/test/policy.json
    - path: /etc/containers/systemd/nginx.kube
      contents:
        inline: |
          [Unit]
          Description=A simple Nginx pod
          Before=local-fs.target

          [Kube]
          Yaml=/etc/kubernetes/nginx.yaml

          [Install]
          # Start by default on boot
          WantedBy=multi-user.target default.target
    - path: /etc/kubernetes/nginx.yaml
      contents:
        inline: |
          apiVersion: v1
          kind: Pod
          metadata:
            name: nginx
          spec:
            containers:
            - name: nginx
              image: docker.io/nginx:latest
              ports:
              - containerPort: 80
    - path: /etc/extensions/podman.raw
      contents:
        source: https://github.com/tormath1/flatcar-podman-overlay/releases/download/v4.4.1/podman.raw
        verification:
          hash: sha512-34cdb308417603fef15a18c8c69ec9eb6d14ef9b4c7b5d8b9a252755167bc612c6145b83a00cb25beb83e472e505bb942f058a4e11bbb95711580b9617d30e7a

And that worked out of the box (just needed to fix that issue: https://github.com/gentoo/gentoo/pull/30264):

core@localhost ~ $ systemd-sysext list
NAME   TYPE PATH                       TIME
podman raw  /etc/extensions/podman.raw Mon 2023-03-20 14:54:51 UTC
core@localhost ~ $ curl --head localhost
HTTP/1.1 200 OK
Server: nginx/1.23.3
Date: Mon, 20 Mar 2023 15:01:42 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 13 Dec 2022 15:53:53 GMT
Connection: keep-alive
ETag: "6398a011-267"
Accept-Ranges: bytes
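To inspect the generated unit directly rather than going through curl, it helps to know which service name Quadlet derives from the file name. The helper below is a hypothetical convenience that encodes the naming rules as I understand them from the podman-systemd.unit documentation (.container and .kube map to NAME.service, .volume and .network get a suffix).

```shell
#!/bin/sh
set -eu

# quadlet_unit_name FILE: print the service name Quadlet derives from FILE.
quadlet_unit_name() {
    base=$(basename "$1")
    case "$base" in
        *.container|*.kube) echo "${base%.*}.service" ;;
        *.volume)           echo "${base%.*}-volume.service" ;;
        *.network)          echo "${base%.*}-network.service" ;;
        *) echo "not a supported Quadlet file: $base" >&2; return 1 ;;
    esac
}

quadlet_unit_name /etc/containers/systemd/nginx.kube   # prints nginx.service
```

On the host from the config above, systemctl status "$(quadlet_unit_name /etc/containers/systemd/nginx.kube)" would then show the state of the generated unit.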

Here's the release: https://github.com/tormath1/flatcar-podman-overlay/releases/tag/v4.4.1 (based on previous work from @goochjj and @jepio)

Note: it's not perfect: policy.json can be provided by the systemd-sysext image (it's already available from the ebuild actually) and /etc/{subuid,subgid} should be provided. I was just curious to see how far we could go with the current setup.
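Until those files are provided, a stopgap could be to add the missing entries by hand. The sketch below appends a USER:START:COUNT line only when the user has no entry yet; the function name and the range 100000:65536 for the core user are assumed example values, not something Flatcar prescribes.

```shell
#!/bin/sh
set -eu

# ensure_subid FILE USER START COUNT: append "USER:START:COUNT" unless FILE
# already has an entry for USER.
ensure_subid() {
    file="$1"; user="$2"; start="$3"; count="$4"
    touch "$file"
    if ! grep -q "^${user}:" "$file"; then
        printf '%s:%s:%s\n' "$user" "$start" "$count" >> "$file"
    fi
}

# Example for a real host (run as root); the range is an assumed value:
#   ensure_subid /etc/subuid core 100000 65536
#   ensure_subid /etc/subgid core 100000 65536
```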

kastl-ars commented 6 months ago

What is the current status of this issue?

I see that @tormath1 has a rather up-to-date sysext with podman 4.5. Are people just using that or will there be an "official" sysext?

tormath1 commented 6 months ago

@kastl-ars It's planned to have an official image (similar to the ZFS sysext or the incoming Incus) but to my knowledge no one has started to work on this yet. If you want to contribute and bring the first official Podman images, feel free to jump in! Here are some examples: https://github.com/flatcar/scripts/pull/1742 and https://github.com/flatcar/scripts/pull/1655.

pothos commented 6 months ago

PR https://github.com/flatcar/scripts/pull/1964 was merged, but we don't have docs or kola tests yet; see https://github.com/flatcar/scripts/pull/1964#pullrequestreview-2037448109 for links to the locations.