Open rgilton opened 2 years ago
I also ran into this issue and was able to make some progress.
It looks like on Fedora 36 a non-root user does not have cpuset delegation by default:
$ cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers
cpu io memory pids
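For reference: Enabling CPU, CPUSET, and I/O delegation (https://rootlesscontaine.rs/getting-started/common/cgroup2/#enabling-cpu-cpuset-and-io-delegation)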
Once I enabled cpuset delegation (as outlined in the reference above), success!
$ cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers
cpuset cpu io memory pids
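The delegation step referred to above can be sketched as follows. The drop-in file name and the Delegate= line match the delegate.conf quoted later in this thread; the function name write_delegate_dropin and the directory parameter are hypothetical, added only so the snippet can be tried against a scratch directory first. Writing under /etc needs root, and a systemctl daemon-reload plus a fresh login session is probably needed before the new controllers show up.

```shell
#!/bin/sh
# Write a systemd drop-in that delegates the listed controllers to user sessions.
# $1: drop-in directory (hypothetical parameter, so the function can be
#     exercised on a scratch directory before touching /etc).
write_delegate_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nDelegate=cpu cpuset io memory pids\n' \
        > "$dropin_dir/delegate.conf"
}

# Acts only when a directory is given on the command line, for example as root:
#   sh delegate.sh /etc/systemd/system/user@.service.d && systemctl daemon-reload
if [ "$#" -ge 1 ]; then
    write_delegate_dropin "$1"
fi
```

Afterwards, the cat command shown above should list cpuset among the controllers.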
$ k3d cluster create
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-k3s-default'
INFO[0000] Created image volume k3d-k3s-default-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-k3s-default-tools'
INFO[0001] Creating node 'k3d-k3s-default-server-0'
INFO[0001] Creating LoadBalancer 'k3d-k3s-default-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] HostIP: using network gateway 10.89.0.1 address
INFO[0001] Starting cluster 'k3s-default'
INFO[0001] Starting servers...
INFO[0001] Starting Node 'k3d-k3s-default-server-0'
INFO[0005] All agents already running.
INFO[0005] Starting helpers...
INFO[0005] Starting Node 'k3d-k3s-default-serverlb'
INFO[0012] Injecting records for hostAliases (incl. host.k3d.internal) and for 2 network members into CoreDNS configmap...
INFO[0014] Cluster 'k3s-default' created successfully!
INFO[0014] You can now use it like this:
kubectl cluster-info
Although the initial cluster creation succeeded, I noticed that k3d-k3s-default-server-0 was having trouble staying up. The log gives some hints about what the kubelet is unhappy about:
E0615 19:04:02.504643 2 container_manager_linux.go:457] "Updating kernel flag failed (Hint: enable KubeletInUserNamespace feature flag to ignore the error)" err="open /proc/sys/kernel/panic: permission denied" flag="kernel/panic"
E0615 19:04:02.504726 2 container_manager_linux.go:457] "Updating kernel flag failed (Hint: enable KubeletInUserNamespace feature flag to ignore the error)" err="open /proc/sys/kernel/panic_on_oops: permission denied" flag="kernel/panic_on_oops"
E0615 19:04:02.504878 2 container_manager_linux.go:457] "Updating kernel flag failed (Hint: enable KubeletInUserNamespace feature flag to ignore the error)" err="open /proc/sys/vm/overcommit_memory: permission denied" flag="vm/overcommit_memory"
E0615 19:04:02.504972 2 kubelet.go:1431] "Failed to start ContainerManager" err="[open /proc/sys/kernel/panic: permission denied, open /proc/sys/kernel/panic_on_oops: permission denied, open /proc/sys/vm/overcommit_memory: permission denied]"
So without giving it too much thought I recreated the cluster like so:
$ k3d cluster create --k3s-arg '--kubelet-arg=feature-gates=KubeletInUserNamespace=true@server:*'
Seems OK but haven't dug much deeper to verify:
$ kubectl get nodes -o wide
NAME                       STATUS   ROLES                  AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE   KERNEL-VERSION            CONTAINER-RUNTIME
k3d-k3s-default-server-0   Ready    control-plane,master   55s   v1.23.6+k3s1   10.89.0.2     <none>        K3s dev    5.17.13-300.fc36.x86_64   containerd://1.5.11-k3s2
$ kubectl get pods -A
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   local-path-provisioner-6c79684f77-wv88d   1/1     Running     0          2m24s
kube-system   coredns-d76bd69b-rbsw2                    1/1     Running     0          2m24s
kube-system   helm-install-traefik-crd-c64r4            0/1     Completed   0          2m24s
kube-system   metrics-server-7cd5fcb6b7-qfclf           1/1     Running     0          2m24s
kube-system   helm-install-traefik-w74nc                0/1     Completed   2          2m24s
kube-system   svclb-traefik-bgcz8                       2/2     Running     0          104s
kube-system   traefik-df4ff85d6-xf2nm                   1/1     Running     0          104s
I hope this helps!
Cheers,
@radikaled Thanks a lot!
I've been facing the same issue on Oracle Linux 8 with cgroups v2 and rootless podman. The following command helped:
k3d cluster create --k3s-arg '--kubelet-arg=feature-gates=KubeletInUserNamespace=true@server:*'
It would be nice if k3d managed this automatically, or at least printed a hint when a rootless environment is detected.
How could a rootless environment be detected?
Well, assuming a rootless environment is defined as one running under a non-root user at the host level, with the container's root user mapped to that (non-root) host user, the check should be that the user ID of the current process (at the host level, i.e. k3d itself) is not 0.
This is a (modified) example of how I detect root mode in my utility (C++):
#include <unistd.h>
// ...
const auto uid = getuid();  // real user ID of the calling process
if (uid != 0) {
    // rootless mode
} else {
    // rootful mode
}
Running k3d inside a container would be another exercise; I don't know if that is even a supported feature.
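The same UID check can be combined with the workaround flag from earlier in the thread to produce the kind of hint suggested above. This is only a sketch: rootless_hint is a hypothetical helper, not part of k3d. A non-zero UID only shows that the calling process is unprivileged. It does not prove that the container runtime is rootless, since a user in the docker group can still talk to a rootful daemon.

```shell
#!/bin/sh
# Print the workaround flag from this thread when the given UID is not root.
# $1: numeric user ID, e.g. the output of `id -u`.
rootless_hint() {
    if [ "$1" -ne 0 ]; then
        echo "rootless: try --k3s-arg '--kubelet-arg=feature-gates=KubeletInUserNamespace=true@server:*'"
    else
        echo "rootful: no extra kubelet feature gate suggested"
    fi
}

# Report for the user running this script.
rootless_hint "$(id -u)"
```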
Detecting CGroup V1 vs. CGroup V2 is trickier. I have two Oracle Linux 8 systems here. Oracle Linux 8 can run in either mode, but CGroup V1 is the default. The easiest way to check which version the system is currently running is to look at the type of the filesystem mounted at /sys/fs/cgroup:
CGroup V1:
[opc@ipa ~]$ stat -fc %T /sys/fs/cgroup/
tmpfs
CGroup V2:
[opc@sws ~]$ stat -fc %T /sys/fs/cgroup/
cgroup2fs
If the result of the stat command is cgroup2fs, then the system is running in CGroup V2 mode. Otherwise it is CGroup V1.
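The stat check can be wrapped in a small helper. In this sketch cgroup_mode_from_fstype is a hypothetical name, and mapping anything other than cgroup2fs or tmpfs to "unknown" is an assumption that is stricter than the "otherwise V1" rule above. It relies on the GNU stat syntax used in the commands above.

```shell
#!/bin/sh
# Map the filesystem type reported for /sys/fs/cgroup to a cgroup mode.
# $1: filesystem type name as printed by `stat -fc %T`.
cgroup_mode_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;   # assumption: do not guess for other types
    esac
}

# An empty type (stat failed or path missing) falls through to "unknown".
fstype=$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null || echo "")
echo "cgroup mode: $(cgroup_mode_from_fstype "$fstype")"
```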
P.S.: Please excuse me if I've missed some crucial points here. I'm really new to this kind of stuff.
I had the same issue on Debian 11 today on an Alibaba Cloud instance.
I added the following to the GRUB_CMDLINE_LINUX variable in /etc/default/grub:
cgroup_memory=1 cgroup_enable=memory
and rebooted the instance from the console. Now the error is gone and the systemd service starts correctly.
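After the reboot, it may be worth confirming that the running kernel actually received both parameters. The sketch below reads /proc/cmdline; kernel_arg_status is a hypothetical helper name. On Debian, edits to /etc/default/grub normally take effect only after update-grub has regenerated the GRUB configuration. The comment above does not mention that step, so it is an assumption about what was done.

```shell
#!/bin/sh
# Report whether a kernel command line contains an exact argument.
# $1: command line string (e.g. the content of /proc/cmdline)
# $2: argument to look for, matched as a whole space-separated word
kernel_arg_status() {
    case " $1 " in
        *" $2 "*) echo "present" ;;
        *)        echo "missing" ;;
    esac
}

# Check the two parameters from the comment above on the running kernel.
if [ -r /proc/cmdline ]; then
    cmdline=$(cat /proc/cmdline)
    for arg in cgroup_memory=1 cgroup_enable=memory; do
        echo "$arg: $(kernel_arg_status "$cmdline" "$arg")"
    done
fi
```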
"on Fedora 36 a non-root user does not have the cpuset delegation by default"
Same on Debian bookworm/sid, but following https://rootlesscontaine.rs/getting-started/common/cgroup2/#enabling-cpu-cpuset-and-io-delegation indeed fixed it for me too.
[root@localhost ~]# cat /etc/systemd/system/user@.service.d/delegate.conf
[Service]
Delegate=cpu cpuset io memory pids
[admin@localhost ~]$ cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers
cpuset io memory pids
[admin@localhost ~]$ stat -fc %T /sys/fs/cgroup/
cgroup2fs
[root@localhost ~]# docker logs -f k3d-k3s-default-server-0
......
time="2024-03-08T06:42:47.390547887Z" level=fatal msg="failed to find cpu cgroup (v2)"
help pls!
If the OS is Red Hat-like and you have the same problem, you can visit the links below:
https://access.redhat.com/solutions/6582021
https://access.redhat.com/solutions/737243
https://support.hpe.com/hpesc/public/docDisplay?docId=sf000082729en_us&docLocale=en_US&page=index.html
Solution: disable rtkit-daemon.
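Note that the cgroup.controllers output in the help request above lists cpuset io memory pids but not cpu, even though delegate.conf asks for it, which matches the "failed to find cpu cgroup (v2)" error. A quick way to spot such a gap is to compare the delegated list against the wanted one. This is a sketch; missing_controllers is a hypothetical helper and the wanted list is copied from the Delegate= line in this thread.

```shell
#!/bin/sh
# Print the controllers from the wanted list that are absent from a
# space-separated delegated list (the content of cgroup.controllers).
# $1: delegated controllers, $2: wanted controllers
missing_controllers() {
    delegated=$1
    wanted=$2
    missing=""
    for c in $wanted; do
        case " $delegated " in
            *" $c "*) ;;                                 # delegated, nothing to do
            *) missing="${missing:+$missing }$c" ;;      # whole-word miss
        esac
    done
    printf '%s\n' "$missing"
}

# Path taken from the commands in this thread; skipped if it does not exist.
f="/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers"
if [ -r "$f" ]; then
    echo "missing controllers: $(missing_controllers "$(cat "$f")" "cpu cpuset io memory pids")"
fi
```

Matching on whole words matters here, because cpu is a prefix of cpuset.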
What did you do
I followed the instructions on using rootless podman from the k3d documentation.
k3d registry create --default-network podman hive-registry
k3d cluster create --registry-use hive-registry hive
What did you expect to happen
The cluster to start.
Screenshots or terminal output
Spying on the logs from one of the 'server' containers, the last few lines are:
This machine is using cgroups v2 as far as I can see (it is the Fedora 36 default):
Which OS & Architecture
All in a fresh Fedora 36 VM.
Which version of k3d
Which version of docker
Using podman here: