Closed by worldofgeese 1 year ago
[...] but I'd like for kind to create a cluster without checking for Delegate=yes, which is a systemd-only step.
What actually happens here is that we check which container features podman/docker report as supported, and infer that delegation must not be set if they're missing.
The code is here: https://github.com/kubernetes-sigs/kind/blob/ac28d7fb19b4f353369d889b3900a7a9dd46f4c1/pkg/cluster/internal/create/create.go#L252-L254
While the error message is systemd-oriented because that's what is well supported in the ecosystem, the check isn't actually systemd-aware and the problem isn't systemd-specific either; only the recommended fix is.
I do not recommend running containers on other init systems: the init system and container runtime should ideally cooperate to manage cgroups. Guix Shepherd is realistically not tested or integrated with by any of the ecosystem container tools.
You might be able to resolve this issue, but there are probably more. xref #3277 (openrc instead)
@BenTheElder
Interesting! And thank you for the detailed response. Guix developers have done a lot of work to get Podman running (and rootlessly). I'll start a conversation with them and see what plumbing needs to be done and if it's something I'm capable of implementing.
For what it's worth, we do have cgroupsv2 working and, as far as I can tell, I am limiting all required values in /etc/cgconfig.conf:
group kind {
    perm {
        admin {
            uid = worldofgeese;
        }
        task {
            uid = worldofgeese;
        }
    }
    cpuset {
        cpuset.mems="0";
        cpuset.cpus="0-5";
    }
    memory {
        memory.limit_in_bytes = 5000000000;
    }
    cpu {
        cpu.shares = 1024;
    }
    pids {
        pids.max = 1000;
    }
}
cgroup.controllers shows I should have access to these:
cpuset cpu io memory hugetlb pids misc
Result of podman info:
As-is I don't really have the bandwidth to debug podman on systemd hosts, but the code is here:
The problem is that your podman info is not reporting the availability of these controllers; in fact, it's reporting none:
cgroupControllers: []
I don't expect anyone to respond: I just wanted to wrap up my investigation in anticipation of eventually bringing this to Guix System's mailing list.
I was able to force detection of these cgroup features by running
echo "+cpu +cpuset +memory +pids" >> /sys/fs/cgroup/cgroup.subtree_control
at which point podman info reports
cgroupControllers:
- cpuset
- cpu
- memory
- pids
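A minimal sketch of that before/after check, assuming the YAML layout of the podman info output quoted in this thread. The helpers cgroup_controllers and missing_controllers are hypothetical names for illustration; neither is part of podman or kind, and the list of four controllers is simply the one enabled with the echo command.

```shell
# Hypothetical helpers for illustration; not part of podman or kind.

# Print the entries listed under "cgroupControllers:" in `podman info`
# YAML read from stdin. Prints nothing for "cgroupControllers: []".
cgroup_controllers() {
  awk '
    /^[ \t]*cgroupControllers:/ { in_list = 1; next }
    in_list && /^[ \t]*-[ \t]/ {
      sub(/^[ \t]*-[ \t]*/, "")
      print
      next
    }
    in_list { exit }
  '
}

# Print each required controller ($2, $3, ...) that is missing from the
# whitespace-separated list of available controllers ($1).
# The body runs in a subshell so its variables stay local.
missing_controllers() (
  available=" $(printf '%s' "$1" | tr '\n' ' ') "
  shift
  for c in "$@"; do
    case "$available" in
      *" $c "*) ;;
      *) printf '%s\n' "$c" ;;
    esac
  done
)

# Usage (requires podman on the host):
#   missing_controllers "$(podman info | cgroup_controllers)" cpu cpuset memory pids
```

With the empty list reported earlier, this would print all four controller names; after writing to cgroup.subtree_control it should print nothing.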
Creating a cluster still eluded me, as the error below shows.
KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster --retain
using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.27.3) 🖼
✗ Preparing nodes 📦
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"
In plain English, the error states that neither Reached target .*Multi-User System.* (indicating a successful cluster creation) nor detected cgroup v1 could be found in the control plane's logs. Here's the function in kind's code that returns the error message: https://github.com/kubernetes-sigs/kind/blob/ac28d7fb19b4f353369d889b3900a7a9dd46f4c1/pkg/cluster/internal/providers/common/cgroups.go#L44.
The function that calls it waits 30 seconds for either message to appear.
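The same wait can be approximated in shell for debugging. This is only a sketch of the idea, not kind's Go implementation: log_ready and wait_for_log are hypothetical names, the pattern is copied from the error message, and polling once per second is an assumption.

```shell
# Pattern copied verbatim from the kind error message.
pattern='Reached target .*Multi-User System.*|detected cgroup v1'

# Succeed if any line on stdin matches the pattern.
log_ready() {
  grep -Eq "$pattern"
}

# Poll the container's logs about once per second for up to 30 seconds.
# The polling interval is an assumption of this sketch.
# The body runs in a subshell so its variables stay local.
wait_for_log() (
  container="$1"
  i=0
  while [ "$i" -lt 30 ]; do
    if podman logs "$container" 2>&1 | log_ready; then
      exit 0
    fi
    i=$((i + 1))
    sleep 1
  done
  exit 1
)

# Usage (requires podman and a retained node container):
#   wait_for_log kind-control-plane || echo "no matching log line"
```

Against a log that only contains "detected cgroup v2" and then exits, log_ready fails, which matches the error kind reported.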
Logs from the control plane below:
→ podman logs kind-control-plane
INFO: running in a user namespace (experimental)
INFO: ensuring we can execute mount/umount even with userns-remap
INFO: remounting /sys read-only
mount: /sys: permission denied.
INFO: UserNS: ignoring mount fail
INFO: making mounts shared
INFO: detected cgroup v2
INFO: clearing and regenerating /etc/machine-id
Initializing machine ID from random generator.
INFO: faking /sys/class/dmi/id/product_name to be "kind"
INFO: faking /sys/class/dmi/id/product_uuid to be random
INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well
INFO: setting iptables to detected mode: legacy
INFO: detected IPv4 address: 10.89.0.5
INFO: detected IPv6 address: fc00:f853:ccd:e793::5
INFO: starting init
Failed to look up module alias 'autofs4': Function not implemented
systemd 247.3-7+deb11u2 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified)
Detected virtualization podman.
Detected architecture x86-64.
Welcome to Debian GNU/Linux 11 (bullseye)!
Set hostname to <kind-control-plane>.
Failed to create /init.scope control group: Permission denied
Failed to allocate manager object: Permission denied
[!!!!!!] Failed to allocate manager object.
Exiting PID 1...
My conjecture is that kind is just too tightly wound up in systemd, making it difficult to work around these issues. Thanks again, Ben, for hopping in to discuss, even with limited time. I appreciate you!
I was finally able to get kind working by taking the following steps:
echo "+cpu +cpuset +memory +pids" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
g=users && sudo chgrp -R ${g} /sys/fs/cgroup/
u=$USER && sudo chown -R ${u}: /sys/fs/cgroup
system.scm:
;; Rootless Podman requires the next 5 services
;; we're using the iptables service purely to make its resources available to minikube and kind
(service iptables-service-type
         (iptables-configuration
          (ipv4-rules (plain-file "iptables.rules" "*filter
:INPUT ACCEPT
:FORWARD ACCEPT
:OUTPUT ACCEPT
COMMIT
"))
          (ipv6-rules (plain-file "ip6tables.rules" "*filter
:INPUT ACCEPT
:FORWARD ACCEPT
:OUTPUT ACCEPT
COMMIT
"))))
(simple-service 'etc-subuid etc-service-type
                (list `("subuid" ,(plain-file "subuid" (string-append "root:0:65536\n" username ":100000:65536\n")))))
(simple-service 'etc-subgid etc-service-type
                (list `("subgid" ,(plain-file "subgid" (string-append "root:0:65536\n" username ":100000:65536\n")))))
(service pam-limits-service-type
         (list
          (pam-limits-entry "*" 'both 'nofile 100000)))
(simple-service 'etc-container-policy etc-service-type
                (list `("containers/policy.json" ,(plain-file "policy.json" "{\"default\": [{\"type\": \"insecureAcceptAnything\"}]}"))))
%my-services
sudo guix system reconfigure
then restarting my system
KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster --retain
I have created a package definition for running kind on Guix System, which does not use systemd. I have all the other requirements running but I'd like for kind to create a cluster without checking for Delegate=yes, which is a systemd-only step. Unfortunately, I'm unable to bypass: