k3d-io / k3d

Little helper to run CNCF's k3s in Docker
https://k3d.io/
MIT License

[BUG] rootless docker -> k3d blocks forever (k3s boot loops) #585

Open shoffmeister opened 3 years ago

shoffmeister commented 3 years ago

What did you do

Baseline:

k3d:

Problem: Command hangs after having emitted

INFO[0000] Prep: Network                                
INFO[0000] Created network 'k3d-mycluster' (4f944e1b21bff3718107f3843216e9a69288b3579dce77377732a1417e82370f) 
INFO[0000] Created volume 'k3d-mycluster-images'        
INFO[0001] Creating node 'k3d-mycluster-server-0'       
INFO[0001] Creating LoadBalancer 'k3d-mycluster-serverlb' 
INFO[0001] Starting cluster 'mycluster'                 
INFO[0001] Starting servers...                          
INFO[0001] Starting Node 'k3d-mycluster-server-0'   

After considerable time, it starts spewing

WARN[0204] Node 'k3d-mycluster-server-0' is restarting for more than a minute now. Possibly it will recover soon (e.g. when it's waiting to join). Consider using a creation timeout to avoid waiting forever in a Restart Loop. 

which is somewhat understandable given that docker logs k3d-mycluster-server-0 is unhappy with

I0501 17:24:44.193897       7 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I0501 17:24:44.193931       7 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
time="2021-05-01T17:24:44.209114066Z" level=info msg="Running kube-scheduler --address=127.0.0.1 --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --leader-elect=false --port=10251 --profiling=false --secure-port=0"
time="2021-05-01T17:24:44.209273499Z" level=info msg="Waiting for API server to become available"
time="2021-05-01T17:24:44.209489318Z" level=info msg="Running kube-controller-manager --address=127.0.0.1 --allocate-node-cidrs=true --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16 --cluster-signing-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --configure-cloud-routes=false --controllers=*,-service,-route,-cloud-node-lifecycle --kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --leader-elect=false --port=10252 --profiling=false --root-ca-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --secure-port=0 --service-account-private-key-file=/var/lib/rancher/k3s/server/tls/service.key --use-service-account-credentials=true"
time="2021-05-01T17:24:44.211128448Z" level=info msg="Node token is available at /var/lib/rancher/k3s/server/token"
time="2021-05-01T17:24:44.211182001Z" level=info msg="To join node to cluster: k3s agent -s https://172.22.0.2:6443 -t ${NODE_TOKEN}"
time="2021-05-01T17:24:44.214298925Z" level=info msg="Wrote kubeconfig /output/kubeconfig.yaml"
time="2021-05-01T17:24:44.215290745Z" level=info msg="Run: k3s kubectl"
time="2021-05-01T17:24:44.215494947Z" level=fatal msg="failed to find cpu cgroup (v2)"
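
The fatal line above may indicate that the `cpu` controller is not delegated to the rootless user: on a systemd host with cgroup v2, a non-root user gets only `memory` and `pids` delegated by default. A minimal sketch for checking this on the host follows. The helper name `has_cpu_controller` is hypothetical, and the default path is the one used in the rootlesscontaine.rs cgroup v2 guide, so it assumes a systemd user session.

```shell
# Hypothetical helper: returns 0 if the given cgroup.controllers file lists
# the "cpu" controller, 1 if it does not, 2 if the file cannot be read.
# The default path assumes a systemd user session on a cgroup v2 host.
has_cpu_controller() {
  file="${1:-/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers}"
  [ -r "$file" ] || return 2
  # cgroup.controllers is one space-separated line, e.g. "cpu memory pids".
  # Compare whole words so that "cpuset" does not count as "cpu".
  for c in $(cat "$file"); do
    [ "$c" = "cpu" ] && return 0
  done
  return 1
}
```

For example, `has_cpu_controller && echo "cpu delegated" || echo "cpu not delegated"` reports the state for the current user.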

Note: I have not tried running k3s without the k3d wrapper (yet) - i.e. neither under root nor rootless.
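
The warning quoted earlier suggests a creation timeout so that a restart loop does not block forever. A minimal sketch of that, assuming the `--timeout` flag of `k3d cluster create`, is shown here. The wrapper name `create_cluster_with_timeout` and the `K3D_BIN` override are hypothetical and exist only so the command line can be inspected without running k3d.

```shell
# Hypothetical wrapper: create a cluster, but give up and roll back if it is
# not up within the given duration. K3D_BIN is an illustrative override
# (set it to "echo" for a dry run); it is not a k3d setting.
create_cluster_with_timeout() {
  name="$1"
  timeout="${2:-120s}"
  "${K3D_BIN:-k3d}" cluster create "$name" --timeout "$timeout"
}
```

With this, `create_cluster_with_timeout mycluster 90s` would fail after 90 seconds instead of hanging. It does not fix the underlying boot loop.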

shoffmeister commented 3 years ago

From https://github.com/k3s-io/k3s/issues?q=is%3Aissue+is%3Aopen++rootless I cannot tell whether this is a k3s challenge or whether k3d driving k3s needs to be adapted?

iwilltry42 commented 3 years ago

Hi @shoffmeister , thanks for opening this issue! Interesting things you're doing here :wink: So there are several points to note here:

shoffmeister commented 3 years ago

I am rather innocently naïve (AKA ruthless) when it comes to doing interesting things 😛 It's software after all, and it's running inside a VM, to top that off even more ;)

Many thanks for the input! I will revisit this issue here once the stars have aligned on the next versions of k3s, k3d.

I have taken good note of passing an explicit --rootless to k3s.

shoffmeister commented 2 years ago

https://rancher.com/docs/k3s/latest/en/advanced/#running-k3s-with-rootless-mode-experimental now documents steps for running k3s rootless (possibly as the result of https://github.com/k3s-io/k3s/pull/4086)

Alas, I am unable to translate the stern note

Don’t try to run k3s server --rootless on a terminal, as it doesn’t enable cgroup v2 delegation. If you really need to try it on a terminal, prepend systemd-run --user -p Delegate=yes --tty to create a systemd scope.

i.e., systemd-run --user -p Delegate=yes --tty k3s server --rootless

into something that would fit into the execution environment constructed by k3d (there is no systemd inside docker)

So, in trying to make progress on this issue here, I wonder whether it is possible at all to run k3s --rootless "inside" k3d on a rootless docker?

FWIW, I have yet to look into running k3s rootless proper.

SanjayVas commented 1 year ago

  • you have to tell k3s (inside k3d) to run rootless: `--k3s-server-arg "--rootless" --k3s-agent-arg "--rootless"`

I don't see --k3s-server-arg and --k3s-agent-arg options for k3d cluster create. Is running in rootless Docker now supported some other way? Given that there are instructions for rootless Podman, I assumed rootless Docker would work similarly.
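
On the flag names only: k3d v5 replaced `--k3s-server-arg` and `--k3s-agent-arg` with a single `--k3s-arg` that takes a node filter after `@`. The sketch below shows how the quoted suggestion might be spelled in that syntax. It says nothing about whether rootless Docker then works. The wrapper name `create_rootless_cluster` and the `K3D_BIN` override are hypothetical.

```shell
# Hypothetical wrapper: pass --rootless to k3s on all server and agent nodes
# using the k3d v5 "--k3s-arg ARG@NODEFILTER" syntax.
# K3D_BIN is an illustrative override (set it to "echo" for a dry run).
create_rootless_cluster() {
  name="$1"
  "${K3D_BIN:-k3d}" cluster create "$name" \
    --k3s-arg "--rootless@server:*" \
    --k3s-arg "--rootless@agent:*"
}
```

Setting `K3D_BIN=echo` before calling `create_rootless_cluster mycluster` prints the arguments instead of creating anything.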

irizzant commented 1 year ago

I'm having problems with this too.

After enabling cpu / cpuset delegation (https://rootlesscontaine.rs/getting-started/common/cgroup2/#enabling-cpu-cpuset-and-io-delegation) I launched the cluster creation with: k3d cluster create --k3s-arg "--rootless@server:0"

I got the following message in the log:

time="2023-03-21T08:43:13Z" level=fatal msg="expected sysctl value \"net.ipv4.ip_forward\" to be \"1\", got \"0\"; try adding \"net.ipv4.ip_forward=1\" to /etc/sysctl.conf and running sudo sysctl --system"
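
A first step in narrowing this down could be to check what the host itself reports for `net.ipv4.ip_forward`. The sketch below reads the value from procfs. The helper name `ip_forward_enabled` is hypothetical, and the thread does not establish which network namespace k3s inspects under rootless Docker, so a host value of 1 does not rule out the error.

```shell
# Hypothetical helper: succeeds only if the given procfs file contains "1".
# The default is the host-side location of net.ipv4.ip_forward.
ip_forward_enabled() {
  file="${1:-/proc/sys/net/ipv4/ip_forward}"
  [ "$(cat "$file" 2>/dev/null)" = "1" ]
}
```

If `ip_forward_enabled` fails on the host, the log message itself names the remedy: add `net.ipv4.ip_forward=1` to /etc/sysctl.conf and run `sudo sysctl --system`.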