k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

k3s fails to start after upgrading to Fedora Server 33 #2453

Closed pschmitt closed 3 years ago

pschmitt commented 4 years ago

Environmental Info:

K3s Version: 6fa97306

k3s version v1.18.10+k3s1 (6fa97306)

Node(s) CPU architecture, OS, and Version:

Linux lrz 5.8.16-300.fc33.x86_64 #1 SMP Mon Oct 19 13:18:33 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

3-node cluster (Fedora Server 32, AMD64) with an external etcd datastore. Longhorn is used for persistent storage.

$ kubectl get nodes -o wide
NAME   STATUS     ROLES    AGE   VERSION         INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                     KERNEL-VERSION           CONTAINER-RUNTIME
fnuc   Ready      master   51d   v1.18.10+k3s1   10.7.0.20     <none>        Fedora 32 (Thirty Two)       5.8.15-201.fc32.x86_64   containerd://1.3.3-k3s2
lrz    NotReady   master   51d   v1.18.10+k3s1   10.7.0.21     <none>        Fedora 33 (Server Edition)   5.8.16-300.fc33.x86_64   containerd://1.3.3-k3s2
ntn    Ready      master   37d   v1.18.10+k3s1   10.7.0.22     <none>        Fedora 32 (Server Edition)   5.8.15-201.fc32.x86_64   containerd://1.3.3-k3s2

Describe the bug:

After upgrading to Fedora 33, the k3s service no longer starts. At first I thought this was due to Fedora's switch to systemd-resolved, but judging by the k3s docs on resolv-conf, that seems to be handled fine. Nevertheless, I tried adding my nodes to /etc/hosts and pointing k3s at a static resolv.conf file (via --resolv-conf). Same result. According to the journalctl output (see below), this is a different issue anyway.
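
For anyone checking the same suspicion: a minimal sketch, in Python, of how one might inspect a resolv.conf file for loopback nameservers. With systemd-resolved, /etc/resolv.conf typically points at the stub listener 127.0.0.53, which is generally not reachable from inside pod network namespaces. The function names and the sample text are illustrative assumptions, not part of k3s.

```python
import ipaddress


def parse_nameservers(resolv_conf_text):
    """Return the addresses from `nameserver` lines in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # resolv.conf comments start with '#' or ';'
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def loopback_nameservers(servers):
    """Return the subset of addresses that are loopback (e.g. 127.0.0.53)."""
    result = []
    for server in servers:
        # Drop an IPv6 zone suffix such as "%eth0" before parsing.
        address = server.split("%", 1)[0]
        try:
            if ipaddress.ip_address(address).is_loopback:
                result.append(server)
        except ValueError:
            # Not a valid IP address; ignore it in this sketch.
            continue
    return result


# Hypothetical sample resembling the stub file written by systemd-resolved.
SAMPLE = """\
# stub resolver
nameserver 127.0.0.53
options edns0 trust-ad
search lan
"""
```

Calling `loopback_nameservers(parse_nameservers(open("/etc/resolv-k3s.conf").read()))` on the node would list any loopback entries in the static file; an empty list suggests the file itself is not the problem.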

Steps To Reproduce:

/etc/systemd/system/k3s.service
```
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=notify
EnvironmentFile=/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    server \
        '--no-deploy' \
        'traefik' \
        --resolv-conf "/etc/resolv-k3s.conf"
```
/etc/systemd/system/k3s.service.env
```
K3S_DATASTORE_ENDPOINT=http://fnuc.lan:2379,http://lrz.lan:2379,http://ntn.lan:2379
```
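
Since the datastore endpoints are given as hostnames, one quick check is whether each of them resolves and accepts TCP connections from the affected node. A minimal sketch in Python, assuming the comma-separated URL format shown above; the helper names are hypothetical, and falling back to port 2379 (the usual etcd client port) when none is given is an assumption of this sketch.

```python
import socket
from urllib.parse import urlsplit


def parse_endpoints(value):
    """Split a comma-separated endpoint list into (host, port) pairs."""
    pairs = []
    for raw in value.split(","):
        parts = urlsplit(raw.strip())
        if not parts.hostname:
            continue  # skip empty or malformed entries
        # Assumption: default to 2379 if the URL carries no port.
        pairs.append((parts.hostname, parts.port or 2379))
    return pairs


def check_endpoint(host, port, timeout=3.0):
    """Return a short status string for DNS resolution and TCP reachability."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"DNS lookup failed: {exc}"
    address = infos[0][4][0]
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return f"ok ({address})"
    except OSError as exc:
        return f"resolved to {address}, but connect failed: {exc}"


def report(value):
    """Print one status line per endpoint in the given list."""
    for host, port in parse_endpoints(value):
        print(f"{host}:{port} -> {check_endpoint(host, port)}")
```

Running `report("http://fnuc.lan:2379,http://lrz.lan:2379,http://ntn.lan:2379")` on the node prints one line per endpoint. This only tests name resolution and the TCP handshake, not etcd health itself.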

Expected behavior:

The k3s node should come up again after the OS upgrade.

Actual behavior:

The k3s service never starts successfully and the node stays NotReady.

Additional context / logs:

# journalctl -xlf -u k3s
```
time="2020-10-29T09:44:30.703168172+01:00" level=info msg="Starting k3s v1.18.10+k3s1 (6fa97306)"
time="2020-10-29T09:44:30.703395028+01:00" level=info msg="Cluster bootstrap already complete"
time="2020-10-29T09:44:30.715491294+01:00" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=unknown --authorization-mode=Node,RBAC --basic-auth-file=/var/lib/rancher/k3s/server/cred/passwd --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --enable-admission-plugins=NodeRestriction --etcd-servers=http://fnuc.lan:2379,http://lrz.lan:2379,http://ntn.lan:2379 --insecure-port=0 --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=k3s --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range=10.43.0.0/16 --storage-backend=etcd3 --tls-cert-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt --tls-private-key-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key"
Flag --basic-auth-file has been deprecated, Basic authentication mode is deprecated and will be removed in a future release. It is not recommended for production environments.
I1029 09:44:30.716545 102504 server.go:645] external host was not specified, using 10.7.0.21
I1029 09:44:30.716748 102504 server.go:162] Version: v1.18.10+k3s1
I1029 09:44:30.720071 102504 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1029 09:44:30.720092 102504 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
I1029 09:44:30.720809 102504 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1029 09:44:30.720821 102504 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
panic: context deadline exceeded

goroutine 140 [running]:
github.com/rancher/k3s/vendor/k8s.io/apiextensions-apiserver/pkg/registry/customresourcedefinition.NewREST(0xc000427d50, 0x484a420, 0xc0008f6000, 0xc0008f6228)
	/go/src/github.com/rancher/k3s/vendor/k8s.io/apiextensions-apiserver/pkg/registry/customresourcedefinition/etcd.go:56 +0x3e7
github.com/rancher/k3s/vendor/k8s.io/apiextensions-apiserver/pkg/apiserver.completedConfig.New(0xc0012ac6c0, 0xc0013aeac8, 0x4945b60, 0x6fb0890, 0x10, 0x0, 0x0)
	/go/src/github.com/rancher/k3s/vendor/k8s.io/apiextensions-apiserver/pkg/apiserver/apiserver.go:145 +0x14ef
github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app.createAPIExtensionsServer(0xc0013aeac0, 0x4945b60, 0x6fb0890, 0x1, 0x4849f20, 0xc000dfb420)
	/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app/apiextensions.go:102 +0x59
github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app.CreateServerChain(0xc000371600, 0xc000c82ea0, 0x3f33bb4, 0xc, 0xc0012e1ca8)
	/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:200 +0x34d
github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app.Run(0xc000371600, 0xc000c82ea0, 0x0, 0x0)
	/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:164 +0x101
github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app.NewAPIServerCommand.func1(0xc0003e8f00, 0xc00000c960, 0x0, 0x1e, 0x0, 0x0)
	/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:124 +0x109
github.com/rancher/k3s/vendor/github.com/spf13/cobra.(*Command).execute(0xc0003e8f00, 0xc000022a00, 0x1e, 0x20, 0xc0003e8f00, 0xc000022a00)
	/go/src/github.com/rancher/k3s/vendor/github.com/spf13/cobra/command.go:826 +0x460
github.com/rancher/k3s/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc0003e8f00, 0x4, 0x3f647e8, 0x19)
	/go/src/github.com/rancher/k3s/vendor/github.com/spf13/cobra/command.go:914 +0x2fb
github.com/rancher/k3s/vendor/github.com/spf13/cobra.(*Command).Execute(...)
	/go/src/github.com/rancher/k3s/vendor/github.com/spf13/cobra/command.go:864
github.com/rancher/k3s/pkg/daemons/control.apiServer.func1(0xc000022a00, 0x1e, 0x20, 0xc0003e8f00)
	/go/src/github.com/rancher/k3s/pkg/daemons/control/server.go:220 +0xbb
created by github.com/rancher/k3s/pkg/daemons/control.apiServer
	/go/src/github.com/rancher/k3s/pkg/daemons/control/server.go:218 +0xf54
k3s.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
k3s.service: Failed with result 'exit-code'.
k3s.service: Unit process 4497 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 5105 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 5181 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 5326 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 5337 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 6670 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 6708 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 6724 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 6741 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 6813 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 7062 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 7186 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 7244 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 7317 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 7389 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 7485 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 7544 (containerd-shim) remains running after unit stopped.
k3s.service: Unit process 4550 (pause) remains running after unit stopped.
k3s.service: Unit process 5203 (pause) remains running after unit stopped.
k3s.service: Unit process 5281 (pause) remains running after unit stopped.
k3s.service: Unit process 5387 (pause) remains running after unit stopped.
k3s.service: Unit process 5394 (pause) remains running after unit stopped.
k3s.service: Unit process 5445 (entry) remains running after unit stopped.
k3s.service: Unit process 5494 (entry) remains running after unit stopped.
k3s.service: Unit process 6052 (speaker) remains running after unit stopped.
k3s.service: Unit process 6785 (pause) remains running after unit stopped.
k3s.service: Unit process 6803 (pause) remains running after unit stopped.
k3s.service: Unit process 6858 (pause) remains running after unit stopped.
k3s.service: Unit process 6867 (pause) remains running after unit stopped.
k3s.service: Unit process 6884 (pause) remains running after unit stopped.
k3s.service: Unit process 6992 (kured) remains running after unit stopped.
k3s.service: Unit process 7073 (csi-node-driver) remains running after unit stopped.
k3s.service: Unit process 7075 (k8s-device-plug) remains running after unit stopped.
k3s.service: Unit process 7107 (k8s-node-labell) remains running after unit stopped.
k3s.service: Unit process 7170 (pause) remains running after unit stopped.
k3s.service: Unit process 7257 (pause) remains running after unit stopped.
k3s.service: Unit process 7299 (pause) remains running after unit stopped.
k3s.service: Unit process 7351 (pause) remains running after unit stopped.
k3s.service: Unit process 7461 (pause) remains running after unit stopped.
k3s.service: Unit process 7518 (entry) remains running after unit stopped.
k3s.service: Unit process 7616 (pause) remains running after unit stopped.
k3s.service: Unit process 7699 (entry) remains running after unit stopped.
k3s.service: Unit process 7719 (pause) remains running after unit stopped.
k3s.service: Unit process 7731 (entry) remains running after unit stopped.
k3s.service: Unit process 7732 (entry) remains running after unit stopped.
k3s.service: Unit process 7749 (entry) remains running after unit stopped.
k3s.service: Unit process 7761 (bash) remains running after unit stopped.
k3s.service: Unit process 7933 (sleep) remains running after unit stopped.
k3s.service: Unit process 7939 (entry) remains running after unit stopped.
k3s.service: Unit process 7960 (entry) remains running after unit stopped.
k3s.service: Unit process 8007 (entry) remains running after unit stopped.
k3s.service: Unit process 8018 (entry) remains running after unit stopped.
k3s.service: Unit process 8086 (entry) remains running after unit stopped.
k3s.service: Unit process 8103 (entry) remains running after unit stopped.
k3s.service: Unit process 8188 (entry) remains running after unit stopped.
k3s.service: Unit process 8226 (entry) remains running after unit stopped.
k3s.service: Unit process 8296 (entry) remains running after unit stopped.
k3s.service: Unit process 8339 (entry) remains running after unit stopped.
k3s.service: Unit process 14546 (longhorn-manage) remains running after unit stopped.
Failed to start Lightweight Kubernetes.
k3s.service: Scheduled restart job, restart counter is at 13.
Stopped Lightweight Kubernetes.
k3s.service: Found left-over process 4497 (containerd-shim) in control group while starting unit. Ignoring.
This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 5105 (containerd-shim) in control group while starting unit. Ignoring.
This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 5181 (containerd-shim) in control group while starting unit. Ignoring.
This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5326 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5337 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6670 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6708 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6724 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6741 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6813 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7062 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7186 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. 
k3s.service: Found left-over process 7244 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7317 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7389 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7485 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7544 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 4550 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5203 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5281 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5387 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5394 (pause) in control group while starting unit. Ignoring. 
This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5445 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5494 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6052 (speaker) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6785 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6803 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6858 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6867 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6884 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6992 (kured) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. 
k3s.service: Found left-over process 7073 (csi-node-driver) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7075 (k8s-device-plug) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7107 (k8s-node-labell) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7170 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7257 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7299 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7351 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7461 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7518 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7616 (pause) in control group while starting unit. Ignoring. 
This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7699 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7719 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7731 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7732 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7749 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7761 (bash) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7933 (sleep) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7939 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7960 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8007 (entry) in control group while starting unit. Ignoring. 
This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8018 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8086 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8103 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8188 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8226 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8296 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 8339 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 14546 (longhorn-manage) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. Starting Lightweight Kubernetes... k3s.service: Found left-over process 4497 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. 
k3s.service: Found left-over process 5105 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5181 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5326 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5337 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6670 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6708 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6724 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6741 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 6813 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7062 (containerd-shim) in control group while starting unit. Ignoring. 
This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7186 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7244 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7317 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7389 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7485 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 7544 (containerd-shim) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 4550 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5203 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. k3s.service: Found left-over process 5281 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies. 
k3s.service: Found left-over process 5387 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 5394 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 5445 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 5494 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 6052 (speaker) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 6785 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 6803 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 6858 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 6867 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 6884 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 6992 (kured) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7073 (csi-node-driver) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7075 (k8s-device-plug) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7107 (k8s-node-labell) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7170 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7257 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7299 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7351 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7461 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7518 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7616 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7699 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7719 (pause) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7731 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7732 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7749 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7761 (bash) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7933 (sleep) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7939 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 7960 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8007 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8018 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8086 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8103 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8188 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8226 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8296 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 8339 (entry) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
k3s.service: Found left-over process 14546 (longhorn-manage) in control group while starting unit. Ignoring. This usually indicates unclean termination of a previous run, or service implementation deficiencies.
time="2020-10-29T09:44:55.963494997+01:00" level=info msg="Starting k3s v1.18.10+k3s1 (6fa97306)"
time="2020-10-29T09:44:55.963677740+01:00" level=info msg="Cluster bootstrap already complete"
time="2020-10-29T09:44:55.975459225+01:00" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=unknown --authorization-mode=Node,RBAC --basic-auth-file=/var/lib/rancher/k3s/server/cred/passwd --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --enable-admission-plugins=NodeRestriction --etcd-servers=http://fnuc.lan:2379,http://lrz.lan:2379,http://ntn.lan:2379 --insecure-port=0 --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=k3s --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range=10.43.0.0/16 --storage-backend=etcd3 --tls-cert-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt --tls-private-key-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key"
Flag --basic-auth-file has been deprecated, Basic authentication mode is deprecated and will be removed in a future release. It is not recommended for production environments.
I1029 09:44:55.976341 102603 server.go:645] external host was not specified, using 10.7.0.21
I1029 09:44:55.976519 102603 server.go:162] Version: v1.18.10+k3s1
I1029 09:44:55.979616 102603 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1029 09:44:55.979633 102603 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
I1029 09:44:55.980227 102603 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1029 09:44:55.980235 102603 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
```
pschmitt commented 4 years ago

Since "it's always DNS"™ I dug around a little more. It turns out that disabling the stub-resolver (and rebooting!) fixes the issue. It's a bit sad having to deactivate one of the main new features of F33 but it works.

sudo ln -sfv /run/systemd/resolve/resolv.conf /etc/resolv.conf

Ref: https://fedoraproject.org/wiki/Changes/systemd-resolved#Opting_out_of_.2Fetc.2Fresolv.conf_that_points_to_the_localhost_stub_resolver

Edit: I updated the 2 remaining nodes as well and I confirm: you need to disable the stub resolver and reboot for k3s to work on F33.
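
For anyone who wants to check whether a node is still on the stub resolver before rebooting, here is a minimal sketch (not part of the original report). It only looks for a `nameserver 127.0.0.53` line, which is the address the systemd-resolved stub listens on. The function name `uses_stub_resolver` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given resolv.conf (default: /etc/resolv.conf)
# lists the systemd-resolved stub listener 127.0.0.53 as a nameserver.
uses_stub_resolver() {
    file="${1:-/etc/resolv.conf}"
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.0\.0\.53([[:space:]]|$)' "$file" 2>/dev/null
}

if uses_stub_resolver; then
    echo "resolv.conf points at the systemd-resolved stub resolver"
else
    echo "resolv.conf does not list the stub resolver"
fi
```

If the first message is printed, the `ln -sfv` command above (followed by a reboot) is the workaround that worked on my nodes.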

brandond commented 4 years ago

You can always create an alternative resolv.conf and point k3s at it with the --resolv-conf flag. This would allow you to keep the default systemd-managed file in place.
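
A rough sketch of what that could look like, assuming you know the address of an upstream resolver. The helper name `write_k3s_resolv_conf` and the address `10.7.0.1` are placeholders for illustration only; `/etc/resolv-k3s.conf` is the path already used in the unit file above.

```shell
#!/bin/sh
# Hypothetical helper: write a standalone resolv.conf containing one
# "nameserver" line per address given after the target path.
write_k3s_resolv_conf() {
    target="$1"
    shift
    : > "$target"
    for ns in "$@"; do
        printf 'nameserver %s\n' "$ns" >> "$target"
    done
}

# Demonstration against a temporary file; 10.7.0.1 is a placeholder address.
demo=$(mktemp)
write_k3s_resolv_conf "$demo" 10.7.0.1
cat "$demo"
rm -f "$demo"

# On a real node this would be something like (as root):
#   write_k3s_resolv_conf /etc/resolv-k3s.conf <your-upstream-dns>
#   k3s server --resolv-conf /etc/resolv-k3s.conf
```

The point is just that the file handed to `--resolv-conf` should list real upstream servers rather than the 127.0.0.53 stub address.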

pschmitt commented 4 years ago

> You can always create an alternative resolv.conf and point k3s at it with the --resolv-conf flag. This would allow you to keep the default systemd-managed file in place.

That is precisely what I tried (note the `--resolv-conf "/etc/resolv-k3s.conf"` argument in the unit file above). It did not work.

brandond commented 4 years ago

You're saying that k3s itself continuously crashes with panic: context deadline exceeded if you leave the default resolv.conf in place?

pschmitt commented 4 years ago

> You're saying that k3s itself continuously crashes with panic: context deadline exceeded if you leave the default resolv.conf in place?

Yes.

jbertozzi commented 3 years ago

I also have an issue with k3s on Fedora 33 (fresh install).

I just ran the quickstart install script:

curl -sfL https://get.k3s.io | sh -
$ uname -a
Linux xps13 5.10.11-200.fc33.x86_64 #1 SMP Wed Jan 27 20:21:22 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/fedora-release 
Fedora release 33 (Thirty Three)

I noticed the following error before the stack trace:

/usr/local/bin/k3s server --debug
[...]
Feb 10 17:00:00 xps13 k3s[24353]: F0210 17:00:00.600788   24353 server.go:181] cannot set feature gate SupportPodPidsLimit to false, feature is locked to true
[...]

After some googling, I disabled cgroupv2 and I was able to start k3s.

I thought cgroupv2 was supported as per: https://github.com/k3s-io/k3s/pull/2584

Is there some config to enable it?
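
In case it helps others confirm which cgroup mode their host is in, here is a small sketch. It relies on the filesystem type that GNU `stat` reports for `/sys/fs/cgroup`: `cgroup2fs` on a unified (v2) hierarchy, typically `tmpfs` on a v1 or hybrid layout. The function name `cgroup_mode` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print "v2", "v1", or "unknown" based on the filesystem
# type mounted at the given path (default: /sys/fs/cgroup).
# Assumes GNU coreutils stat, where -f -c %T prints the filesystem type name.
cgroup_mode() {
    fstype=$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null)
    case "$fstype" in
        cgroup2fs) echo v2 ;;
        tmpfs)     echo v1 ;;
        *)         echo unknown ;;
    esac
}

echo "cgroup mode: $(cgroup_mode)"
```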

brandond commented 3 years ago

@jbertozzi it was not properly fixed until #2844, which isn't in a released version yet. Until then, you can install a master build from CI.

dlouzan commented 3 years ago

Per the released versions, I guess this can be closed?