virtual-kubelet / systemk

Systemk is a systemd backend for the virtual-kubelet. Instead of starting containers, you start systemd units.
Apache License 2.0

enhancement: systemd-nspawn to launch real Image= container #75

Open rektide opened 3 years ago

rektide commented 3 years ago

using systemd-nspawn would let us run real containers, under systemd. from the man page's description:

systemd-nspawn may be used to run a command or OS in a light-weight namespace container. In many ways it is similar to chroot(1), but more powerful since it fully virtualizes the file system hierarchy, as well as the process tree, the various IPC subsystems and the host and domain name.

it would be neat to add a mode to systemk that uses systemd-nspawn to run full containers, so that Image= could refer to real images. systemd-nspawn supports a variety of network modes, such as macvlan and ipvlan. there is a machinectl shell command which would enable runInContainer, a systemk feature request tracked in #37. more niche, but there are also things like the ability to clone a btrfs subvolume & launch that.
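a rough sketch of how runInContainer could map onto machinectl shell. the function name and the default user are hypothetical, not existing systemk code, and this only builds the argument list rather than running anything. it assumes the program path has to be absolute, and that the container is running systemd, which as far as i understand machinectl shell relies on.

```python
def run_in_container_cmd(machine, argv, user="root"):
    """Build a `machinectl shell` invocation for running argv in a machine.

    Hypothetical helper: `machine` is the registered container name,
    `argv` is the command plus its arguments.
    """
    # Assumption: machinectl shell wants an absolute path to the program.
    if not argv or not argv[0].startswith("/"):
        raise ValueError("expected a command with an absolute program path")
    # Synopsis from machinectl(1): shell [[NAME@]NAME [PATH [ARGUMENTS...]]]
    return ["machinectl", "shell", f"{user}@{machine}", *argv]
```

the resulting list could be handed to something like subprocess.run, with the pod's container name standing in for the machine name.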

running a systemd-nspawn container is a two-step process:

  1. create a filesystem in /var/lib/machines/ with the expanded image. systemd's machinectl tool includes import-tar and import-fs helpers which could help load images. to fetch an image, we could use docker save, docker export, or something like what nspawn-oci does to load the image (using skopeo to fetch it & oci-image-tools to expand it).
  2. run the container. this can be done ephemerally, or we can create a unit:
     a. use systemd-nspawn to run the container as a one-off, or
     b. create a /etc/systemd/nspawn/my-service.nspawn unit file to configure the container, then run machinectl start to start it in a managed way.
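the steps above could be sketched roughly like this. all the function names are hypothetical (this is not existing systemk code), and the functions only build command lines and file contents, they don't execute anything. it assumes the tarball is a flat root filesystem (what docker export produces) rather than a layered image, and that systemd-nspawn@.service honours the Boot= and Parameters= settings from the .nspawn file.

```python
from pathlib import PurePosixPath

MACHINES_DIR = PurePosixPath("/var/lib/machines")
NSPAWN_DIR = PurePosixPath("/etc/systemd/nspawn")


def import_tar_cmd(tarball, machine):
    """Step 1: unpack a root-filesystem tarball as machine <machine>."""
    return ["machinectl", "import-tar", tarball, machine]


def run_once_cmd(machine, argv):
    """Step 2a: run argv once inside the imported tree."""
    return [
        "systemd-nspawn",
        f"--machine={machine}",
        f"--directory={MACHINES_DIR / machine}",
        "--",  # end of nspawn options, the rest is the container command
        *argv,
    ]


def nspawn_file_path(machine):
    """Step 2b: where the per-machine settings file would live."""
    return NSPAWN_DIR / f"{machine}.nspawn"


def render_nspawn_file(argv, macvlan=None):
    """Step 2b: settings file contents that run argv instead of booting an init."""
    # Simplification: no quoting support, so reject arguments with whitespace.
    if any(ch.isspace() for arg in argv for ch in arg):
        raise ValueError("arguments containing whitespace are not handled here")
    lines = ["[Exec]", "Boot=no", "Parameters=" + " ".join(argv)]
    if macvlan:
        # Optional: attach a macvlan interface on top of the given host interface.
        lines += ["", "[Network]", f"MACVLAN={macvlan}"]
    return "\n".join(lines) + "\n"


def start_cmd(machine):
    """Step 2b: start the machine in a managed way."""
    return ["machinectl", "start", machine]
```

for the managed path, the idea would be to write render_nspawn_file(...) to nspawn_file_path(machine) and then run start_cmd(machine); the one-off path only needs import_tar_cmd followed by run_once_cmd.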

note that two years ago systemd seemed interested in becoming a runc-compatible runtime. if that happens, my understanding is we could just run containerd directly against it, which might be a better idea than adapting systemk.

background: current systemk architecture

some notes i took, investigating how systemk works now

kubernetes pods are backed by systemd .service units created by unit manager.

these systemd services use the local system to run. Image= refers to debian packages that are run on the system.

pires commented 3 years ago

TL;DR is that's interesting but then why not just run a kubelet + CRI-compatible container runtime, eg containerd?

Philosophical question aside, I do think the feature requested above is pretty doable. However, it would raise a problem: systemk would provide a different UX for containerized vs non-containerized workloads, eg RunInContainer, which is not possible (yet?) for the latter.

miekg commented 3 years ago

Can you still run root-less in the above scenario?