k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

Cannot start K3s with docker container runtime on centos7.8 #2548

Closed · kingsd041 closed this issue 3 years ago

kingsd041 commented 3 years ago

Environmental Info: K3s Version: v1.19.3+k3s3 (0e4fbfef)

Node(s) CPU architecture, OS, and Version: amd64, CentOS 7.8 (using the same steps, it works fine on CentOS 7.6)

Linux iZ6weix7w7e0sy67ak2vt0Z 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: 1 master

Describe the bug:

Cannot start K3s with the docker container runtime on CentOS 7.8

Steps To Reproduce:

Expected behavior:

Expect k3s to start successfully

Actual behavior:

k3s failed to start

[root@iZ6weix7w7e0sy67ak2vt0Z ~]# kubectl get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?

Additional context / logs:

Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z systemd: k3s.service holdoff time over, scheduling restart.
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:04.950638268+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:04.951202912+08:00" level=info msg="Cluster bootstrap already complete"
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:04.962970507+08:00" level=info msg="Configuring sqlite3 database connection pooling: maxIdleConns=2, maxOpenConns=0, connMaxLifetime=0s"
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:04.963001178+08:00" level=info msg="Configuring database table schema and indexes, this may take a moment..."
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:04.963089752+08:00" level=info msg="Database tables and indexes are up to date"
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:04.963862502+08:00" level=info msg="Kine listening on unix://kine.sock"
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:04.963954470+08:00" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=unknown --authorization-mode=Node,RBAC --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --enable-admission-plugins=NodeRestriction --etcd-servers=unix://kine.sock --insecure-port=0 --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --profiling=false --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=k3s --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range=10.43.0.0/16 --storage-backend=etcd3 --tls-cert-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt --tls-private-key-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key"
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:04.964869    5531 server.go:652] external host was not specified, using 172.16.0.85
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:04.965089    5531 server.go:177] Version: v1.19.3+k3s3
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:04.972739    5531 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:04.972757    5531 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:04.975600    5531 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:04.975614    5531 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
Nov 18 20:31:04 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:04.992685    5531 master.go:271] Using reconciler: lease
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.298234    5531 genericapiserver.go:412] Skipping API batch/v2alpha1 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.311365    5531 genericapiserver.go:412] Skipping API discovery.k8s.io/v1alpha1 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.325280    5531 genericapiserver.go:412] Skipping API node.k8s.io/v1alpha1 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.341164    5531 genericapiserver.go:412] Skipping API rbac.authorization.k8s.io/v1alpha1 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.344341    5531 genericapiserver.go:412] Skipping API scheduling.k8s.io/v1alpha1 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.358712    5531 genericapiserver.go:412] Skipping API storage.k8s.io/v1alpha1 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.399779    5531 genericapiserver.go:412] Skipping API apps/v1beta2 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.399798    5531 genericapiserver.go:412] Skipping API apps/v1beta1 because it has no resources.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.411350    5531 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.411368    5531 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.419323096+08:00" level=info msg="Running kube-scheduler --address=127.0.0.1 --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --leader-elect=false --port=10251 --profiling=false --secure-port=0"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.419627    5531 registry.go:173] Registering SelectorSpread plugin
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.419641    5531 registry.go:173] Registering SelectorSpread plugin
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.419963515+08:00" level=info msg="Running kube-controller-manager --address=127.0.0.1 --allocate-node-cidrs=true --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16 --cluster-signing-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --leader-elect=false --port=10252 --profiling=false --root-ca-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --secure-port=0 --service-account-private-key-file=/var/lib/rancher/k3s/server/tls/service.key --use-service-account-credentials=true"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.421563967+08:00" level=info msg="Node token is available at /var/lib/rancher/k3s/server/token"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.421590659+08:00" level=info msg="To join node to cluster: k3s agent -s https://172.16.0.85:6443 -t ${NODE_TOKEN}"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.422972956+08:00" level=info msg="Wrote kubeconfig /etc/rancher/k3s/k3s.yaml"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.422997161+08:00" level=info msg="Run: k3s kubectl"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.423075240+08:00" level=info msg="Module overlay was already loaded"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.423089943+08:00" level=info msg="Module nf_conntrack was already loaded"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.423103849+08:00" level=info msg="Module br_netfilter was already loaded"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.423116579+08:00" level=info msg="Module iptable_nat was already loaded"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.423504108+08:00" level=info msg="Waiting for API server to become available"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.450236582+08:00" level=info msg="Cluster-Http-Server 2020/11/18 20:31:05 http: TLS handshake error from 127.0.0.1:52502: remote error: tls: bad certificate"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.453852175+08:00" level=info msg="Cluster-Http-Server 2020/11/18 20:31:05 http: TLS handshake error from 127.0.0.1:52512: remote error: tls: bad certificate"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.460236221+08:00" level=info msg="certificate CN=iz6weix7w7e0sy67ak2vt0z signed by CN=k3s-server-ca@1605702556: notBefore=2020-11-18 12:29:16 +0000 UTC notAfter=2021-11-18 12:31:05 +0000 UTC"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.462631766+08:00" level=info msg="certificate CN=system:node:iz6weix7w7e0sy67ak2vt0z,O=system:nodes signed by CN=k3s-client-ca@1605702556: notBefore=2020-11-18 12:29:16 +0000 UTC notAfter=2021-11-18 12:31:05 +0000 UTC"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.470206249+08:00" level=info msg="Connecting to proxy" url="wss://127.0.0.1:6443/v1-k3s/connect"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.471715591+08:00" level=info msg="Handling backend connection request [iz6weix7w7e0sy67ak2vt0z]"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.472080499+08:00" level=warning msg="Disabling CPU quotas due to missing cpu.cfs_period_us"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.472135278+08:00" level=info msg="Running kubelet --address=0.0.0.0 --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=cgroupfs --client-ca-file=/var/lib/rancher/k3s/agent/client-ca.crt --cloud-provider=external --cluster-dns=10.43.0.10 --cluster-domain=cluster.local --cni-bin-dir=/var/lib/rancher/k3s/data/223e6420f8db0d8828a8f5ed3c44489bb8eb47aa71485404f8af8c462a29bea3/bin --cni-conf-dir=/var/lib/rancher/k3s/agent/etc/cni/net.d --cpu-cfs-quota=false --eviction-hard=imagefs.available<5%,nodefs.available<5% --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10% --fail-swap-on=false --healthz-bind-address=127.0.0.1 --hostname-override=iz6weix7w7e0sy67ak2vt0z --kubeconfig=/var/lib/rancher/k3s/agent/kubelet.kubeconfig --kubelet-cgroups=/systemd/system.slice --network-plugin=cni --node-labels= --pod-infra-container-image=docker.io/rancher/pause:3.1 --pod-manifest-path=/var/lib/rancher/k3s/agent/pod-manifests --read-only-port=0 --resolv-conf=/etc/resolv.conf --runtime-cgroups=/systemd/system.slice --tls-cert-file=/var/lib/rancher/k3s/agent/serving-kubelet.crt --tls-private-key-file=/var/lib/rancher/k3s/agent/serving-kubelet.key"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.473352064+08:00" level=info msg="Running kube-proxy --cluster-cidr=10.42.0.0/16 --healthz-bind-address=127.0.0.1 --hostname-override=iz6weix7w7e0sy67ak2vt0z --kubeconfig=/var/lib/rancher/k3s/agent/kubeproxy.kubeconfig --proxy-mode=iptables"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: Flag --cloud-provider has been deprecated, will be removed in 1.23, in favor of removing cloud provider code from Kubelet.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.482389    5531 server.go:407] Version: v1.19.3+k3s3
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.485429    5531 server.go:226] WARNING: all flags other than --config, --write-config-to, and --cleanup are deprecated. Please begin using a config file ASAP.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.488553    5531 dynamic_cafile_content.go:167] Starting client-ca-bundle::/var/lib/rancher/k3s/agent/client-ca.crt
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.495183391+08:00" level=info msg="Waiting for node iz6weix7w7e0sy67ak2vt0z: nodes \"iz6weix7w7e0sy67ak2vt0z\" not found"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: E1118 20:31:05.495275    5531 node.go:125] Failed to retrieve node info: nodes "iz6weix7w7e0sy67ak2vt0z" is forbidden: User "system:kube-proxy" cannot get resource "nodes" in API group "" at the cluster scope
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.585590    5531 nvidia.go:61] NVIDIA GPU metrics will not be available: no NVIDIA devices found
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.622626    5531 server.go:640] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.622876    5531 container_manager_linux.go:289] container manager verified user specified cgroup-root exists: []
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.622888    5531 container_manager_linux.go:294] Creating Container Manager object based on Node Config: {RuntimeCgroupsName:/systemd/system.slice SystemCgroupsName: KubeletCgroupsName:/systemd/system.slice ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:false CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none Rootless:false}
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.622973    5531 topology_manager.go:126] [topologymanager] Creating topology manager with none policy
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.622980    5531 container_manager_linux.go:324] [topologymanager] Initializing Topology Manager with none policy
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.622984    5531 container_manager_linux.go:329] Creating device plugin manager: true
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.623053    5531 client.go:77] Connecting to docker on unix:///var/run/docker.sock
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.623063    5531 client.go:94] Start docker client with request timeout=2m0s
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.634650    5531 docker_service.go:565] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.634672    5531 docker_service.go:241] Hairpin mode set to "hairpin-veth"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.658371    5531 docker_service.go:256] Docker cri networking managed by cni
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.669527    5531 docker_service.go:261] Docker Info: &{ID:O34P:FQNT:EZBC:VKNS:GUIJ:U4J6:HHMB:NAI3:4AC5:AVJ4:GDLM:F3DI Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:0 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:22 OomKillDisable:true NGoroutines:35 SystemTime:2020-11-18T20:31:05.659464651+08:00 LoggingDriver:json-file CgroupDriver:cgroupfs NEventsListener:0 KernelVersion:3.10.0-1127.19.1.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc009b2b960 NCPU:2 MemTotal:1818824704 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:iZ6weix7w7e0sy67ak2vt0Z Labels:[] ExperimentalBuild:false ServerVersion:19.03.13 ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil>} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:8fba4e9a7d01810a393d5d25a3621dc101981175 Expected:8fba4e9a7d01810a393d5d25a3621dc101981175} RuncCommit:{ID:dc9208a3303feef5b3839f4323d9beb36df0a9dd Expected:dc9208a3303feef5b3839f4323d9beb36df0a9dd} InitCommit:{ID:fec3683 Expected:fec3683} SecurityOptions:[name=seccomp,profile=default]}
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.669606    5531 docker_service.go:274] Setting cgroupDriver to cgroupfs
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.685004    5531 kubelet.go:261] Adding pod path: /var/lib/rancher/k3s/agent/pod-manifests
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.685046    5531 kubelet.go:273] Watching apiserver
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.721585    5531 kuberuntime_manager.go:214] Container runtime docker initialized, version: 19.03.13, apiVersion: 1.40.0
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.721868    5531 server.go:1148] Started kubelet
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: E1118 20:31:05.724479    5531 kubelet.go:1218] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.724981    5531 fs_resource_analyzer.go:64] Starting FS ResourceAnalyzer
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.728914    5531 server.go:152] Starting to listen on 0.0.0.0:10250
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.729462    5531 server.go:424] Adding debug handlers to kubelet server.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.731997    5531 volume_manager.go:265] Starting Kubelet Volume Manager
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.734936    5531 desired_state_of_world_populator.go:139] Desired state populator starts to run
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.767085    5531 status_manager.go:158] Starting to sync pod status with apiserver
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.767109    5531 kubelet.go:1741] Starting kubelet main sync loop.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: E1118 20:31:05.767138    5531 kubelet.go:1765] skipping pod synchronization - [container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: E1118 20:31:05.789444    5531 controller.go:228] failed to get node "iz6weix7w7e0sy67ak2vt0z" when trying to set owner ref to the node lease: nodes "iz6weix7w7e0sy67ak2vt0z" not found
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: E1118 20:31:05.868422    5531 kubelet.go:1765] skipping pod synchronization - container runtime status check may not have completed yet
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: E1118 20:31:05.896006    5531 kubelet.go:2183] node "iz6weix7w7e0sy67ak2vt0z" not found
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.960211    5531 kubelet_node_status.go:70] Attempting to register node iz6weix7w7e0sy67ak2vt0z
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.972573    5531 cpu_manager.go:184] [cpumanager] starting with none policy
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.972585    5531 cpu_manager.go:185] [cpumanager] reconciling every 10s
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.972609    5531 state_mem.go:36] [cpumanager] initializing new in-memory state store
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.972753    5531 state_mem.go:88] [cpumanager] updated default cpuset: ""
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.972761    5531 state_mem.go:96] [cpumanager] updated cpuset assignments: "map[]"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.972775    5531 policy_none.go:43] [cpumanager] none policy: Start
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: W1118 20:31:05.973305    5531 manager.go:596] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.973470    5531 plugin_manager.go:114] Starting Kubelet Plugin Manager
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: E1118 20:31:05.974601    5531 eviction_manager.go:260] eviction manager: failed to get summary stats: failed to get node info: node "iz6weix7w7e0sy67ak2vt0z" not found
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.988515514+08:00" level=info msg="Proxy done" err="context canceled" url="wss://127.0.0.1:6443/v1-k3s/connect"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.988746208+08:00" level=info msg="Kube API server is now running"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.988764034+08:00" level=info msg="k3s is up and running"
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: Flag --address has been deprecated, see --bind-address instead.
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.991655    5531 controllermanager.go:175] Version: v1.19.3+k3s3
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: I1118 20:31:05.991921    5531 deprecated_insecure_serving.go:53] Serving insecurely on 127.0.0.1:10252
Nov 18 20:31:05 iZ6weix7w7e0sy67ak2vt0Z k3s: time="2020-11-18T20:31:05.992604473+08:00" level=fatal msg="server stopped: http: Server closed"
Nov 18 20:31:06 iZ6weix7w7e0sy67ak2vt0Z systemd: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 18 20:31:06 iZ6weix7w7e0sy67ak2vt0Z systemd: Unit k3s.service entered failed state.
Nov 18 20:31:06 iZ6weix7w7e0sy67ak2vt0Z systemd: k3s.service failed.
janeczku commented 3 years ago

Reproduced on 7.8 using the following steps:

1. curl https://releases.rancher.com/install-docker/19.03.sh | sh
2. systemctl enable docker (and disable firewalld if that's enabled)
3. sudo setenforce 0
4. sudo curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_EXEC="server --disable traefik --docker" sh
5. Add "Wants=docker.service", "After=docker.service" to /etc/systemd/system/k3s.service
6. sudo systemctl daemon-reload
7. sudo systemctl start k3s
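
For reference, step 5 above can also be expressed as a systemd drop-in instead of editing the generated unit directly (a sketch; the drop-in path is just a conventional choice, and editing /etc/systemd/system/k3s.service in place works as well):

# /etc/systemd/system/k3s.service.d/10-docker.conf
[Unit]
Wants=docker.service
After=docker.service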

Environment Info:

[root@test-centos-k3s-docker ~]# rpm -q centos-release
centos-release-7-8.2003.0.el7.centos.x86_64
[root@test-centos-k3s-docker ~]# uname -a
Linux test-centos-k3s-docker 3.10.0-1127.el7.x86_64

Log:

Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: I1118 19:27:14.806232    4588 plugin_manager.go:114] Starting Kubelet Plugin Manager
Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: E1118 19:27:14.808811    4588 eviction_manager.go:260] eviction manager: failed to get summary stats: failed to get node info: node "test-centos-k3s-docker" not found
Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: time="2020-11-18T19:27:14.831356438Z" level=info msg="Proxy done" err="context canceled" url="wss://127.0.0.1:6443/v1-k3s/connect"
Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: time="2020-11-18T19:27:14.832251208Z" level=info msg="Kube API server is now running"
Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: time="2020-11-18T19:27:14.832299870Z" level=info msg="k3s is up and running"
Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: Flag --address has been deprecated, see --bind-address instead.
Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: time="2020-11-18T19:27:14.837703225Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Nov 18 19:27:14 test-centos-k3s-docker k3s[4588]: time="2020-11-18T19:27:14.838636951Z" level=fatal msg="server stopped: http: Server closed"
Nov 18 19:27:14 test-centos-k3s-docker systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 18 19:27:14 test-centos-k3s-docker systemd[1]: Failed to start Lightweight Kubernetes.
Nov 18 19:27:14 test-centos-k3s-docker systemd[1]: Unit k3s.service entered failed state.
Nov 18 19:27:14 test-centos-k3s-docker systemd[1]: k3s.service failed.
Nov 18 19:27:16 test-centos-k3s-docker systemd[1]: Stopped Lightweight Kubernetes.

Full log:

k3s-crash.log.gz

brandond commented 3 years ago

Changing the systemd unit from type=notify to type=exec allows it to start properly. I think there may be some issue with the notification socket. The server also runs an agent by default, and they both attempt to send a notification:

https://github.com/rancher/k3s/blob/master/pkg/agent/run.go#L168 https://github.com/rancher/k3s/blob/master/pkg/cli/server/server.go#L251
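
A minimal sketch of what those two call sites boil down to (assuming the coreos go-systemd daemon package that k3s wraps; the component names are illustrative only):

package main

import (
	"log"

	"github.com/coreos/go-systemd/daemon"
)

// notifyReady sends READY=1 over the $NOTIFY_SOCKET that systemd passes to
// Type=notify services. Both the server and the embedded agent code paths
// end up doing the equivalent of this.
func notifyReady(component string) {
	sent, err := daemon.SdNotify(false, daemon.SdNotifyReady)
	if err != nil {
		log.Printf("%s: sd_notify failed: %v", component, err)
	} else if !sent {
		log.Printf("%s: not running under systemd, notification skipped", component)
	} else {
		log.Printf("%s: READY=1 sent", component)
	}
}

func main() {
	notifyReady("server")
	notifyReady("agent") // the second READY=1 is redundant from systemd's point of view
}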

If I start k3s with --disable-agent, it starts successfully.

I suspect there is some sort of timing issue, where k3s starts up faster when using docker (since it doesn't have to start containerd) and the duplicate notifications are confusing systemd. This is a hunch; I would have to test a build with the agent SdNotify call disabled in order to confirm this.

mkoperator commented 3 years ago

Hitting the issue with --docker, CentOS 7.9, k3s 1.19.3-k3s3. Switching to type=exec seems to have mitigated it. (I had an issue earlier due to an error between keyboard and chair.)
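
For anyone who wants to try the same mitigation, the change is roughly this (a sketch; edit the installed unit or use a drop-in, then reload):

# in /etc/systemd/system/k3s.service, [Service] section:
# Type=notify
Type=exec

sudo systemctl daemon-reload
sudo systemctl restart k3s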

mkoperator commented 3 years ago

The issue does not appear to affect the 1.18 releases; I tested both 1.18.9+k3s1 and 1.18.12+k3s1. It does affect both 1.19.4-rc1+k3s1 and 1.19.3-k3s3.

brandond commented 3 years ago

I am honestly not sure why server needs to notify separately - it feels like it should only need to do that if we're running with --disable-agent since in that case the agent would not notify.

janeczku commented 3 years ago

The notify race seems to be a red herring: SdNotify is only called once during a failed run:

[root@test-centos-k3s-docker ~]# sudo sysdig -s812 "(fd.type in (unix) and proc.name in (k3s-server))"|grep "notify" -A1 -B1
264402 15:45:30.014031936 1 k3s-server (18483) > connect fd=7(<u>)
264403 15:45:30.014553118 1 k3s-server (18483) < connect res=0 tuple=ffff8df799600440->ffff8df797ddd100 /run/systemd/notify
264410 15:45:30.014576714 1 k3s-server (18483) > write fd=7(<u>ffff8df799600440->ffff8df797ddd100 /run/systemd/notify) size=8
264411 15:45:30.014583567 1 k3s-server (18483) < write res=8 data=READY=1.
264414 15:45:30.014588274 1 k3s-server (18483) > close fd=7(<u>ffff8df799600440->ffff8df797ddd100 /run/systemd/notify)
264415 15:45:30.014588997 1 k3s-server (18483) < close res=0
(SERVICE EXITS)
erikwilson commented 3 years ago

I wonder if this is a side-effect of using TimeoutStartSec=0 in the systemd unit file.

brandond commented 3 years ago

Something is cancelling the tunnel server context. I was assuming that systemd was sending a signal to the k3s process in response to the duplicate READY messages, causing it to start shutting down. Can you check for that?

Oats87 commented 3 years ago

This appears to be a systemd issue. On a system running systemd-219-42.el7_4.4, I can reliably get k3s v1.19.3+k3s3 to start with docker. As soon as I updated to systemd-219-78.el7_9.2, k3s was no longer able to start.

Oats87 commented 3 years ago

I walked through all of the various systemd packages available for CentOS 7, and it started failing once I hit systemd-219-73.el7.1.x86_64, which is the first systemd-219-73 package available.

janeczku commented 3 years ago

There is nothing wrong with k3s until systemd decides to kill its main process immediately after determining that the service's "cgroup is empty". Setting KillMode=none in the unit file still results in the service being marked as failed, but the k3s process continues to run fine.

(attachment: systemd-k3s-debug log)
Oats87 commented 3 years ago

@janeczku Yeah, as far as I can tell systemd >= 219-73 is killing the process for some reason. Still investigating this. I also tested by changing the KillSignal to SIGABRT, which promptly made k3s produce a stack trace.
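
For reference, the debugging overrides being discussed look roughly like this (a sketch; these are troubleshooting aids, not a fix, and the drop-in path is hypothetical):

# /etc/systemd/system/k3s.service.d/20-debug.conf
[Service]
# keep systemd from signalling the k3s process when it thinks the unit has failed
KillMode=none
# or, to get a Go stack trace out of k3s when systemd does signal it:
# KillSignal=SIGABRT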

Oats87 commented 3 years ago

I'm not so sure that the cgroup empty notification message is meaningful -- on 1.18.12+k3s2 (which works) it still shows up:

Nov 20 17:42:02 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:02.938030   20388 plugin_manager.go:114] Starting Kubelet Plugin Manager
Nov 20 17:42:02 ip-172-31-34-64.us-west-2.compute.internal systemd[1]: Got cgroup empty notification for: /system.slice/k3s.service
Nov 20 17:42:02 ip-172-31-34-64.us-west-2.compute.internal systemd[1]: k3s.service: cgroup is empty
Nov 20 17:42:02 ip-172-31-34-64.us-west-2.compute.internal systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/agent interface=org.freedesktop.systemd1.Agent member=Released cookie=299579 reply_cookie=0 error=n/a
Nov 20 17:42:02 ip-172-31-34-64.us-west-2.compute.internal systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/k3s_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=299580 reply_cookie=0 error=n/a
Nov 20 17:42:02 ip-172-31-34-64.us-west-2.compute.internal systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/k3s_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=299581 reply_cookie=0 error=n/a
Nov 20 17:42:02 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:02.975511   20388 node_ipam_controller.go:94] Sending events to api server.
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.423677   20388 reconciler.go:157] Reconciler: start to sync state
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.478050   20388 node.go:136] Successfully retrieved node IP: 172.31.34.64
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.478083   20388 server_others.go:187] Using iptables Proxier.
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.478381   20388 server.go:583] Version: v1.18.12+k3s1
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.478712   20388 conntrack.go:103] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.478735   20388 conntrack.go:52] Setting nf_conntrack_max to 131072
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.478778   20388 conntrack.go:103] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.478804   20388 conntrack.go:103] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.479051   20388 config.go:315] Starting service config controller
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.479059   20388 shared_informer.go:223] Waiting for caches to sync for service config
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.479611   20388 config.go:133] Starting endpoints config controller
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.479624   20388 shared_informer.go:223] Waiting for caches to sync for endpoints config
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.579288   20388 shared_informer.go:230] Caches are synced for service config
Nov 20 17:42:03 ip-172-31-34-64.us-west-2.compute.internal k3s[20388]: I1120 17:42:03.579826   20388 shared_informer.go:230] Caches are synced for endpoints config
erikwilson commented 3 years ago

Any luck with TimeoutStartSec @Oats87 ?

Oats87 commented 3 years ago

@erikwilson unfortunately not. I tried a bunch of different things, like infinity (which doesn't work on RHEL/CentOS 7 systemd) and stupidly high values.

Oats87 commented 3 years ago

Changing KillMode to none in the systemd unit file shows that systemd thinks the process has failed for some reason, even though the process is actually not dead.

Nov 20 11:53:27 ck-centos7-2 k3s[10599]: I1120 11:53:27.402436   10599 policy_none.go:43] [cpumanager] none policy: Start
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: W1120 11:53:27.405480   10599 manager.go:596] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: I1120 11:53:27.406903   10599 plugin_manager.go:114] Starting Kubelet Plugin Manager
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: E1120 11:53:27.407562   10599 eviction_manager.go:260] eviction manager: failed to get summary stats: failed to get node info: node "ck-centos7-2" not found
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Got cgroup empty notification for: /system.slice/k3s.service
Nov 20 11:53:27 ck-centos7-2 systemd[1]: k3s.service: cgroup is empty
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Unwatching 10599.
Nov 20 11:53:27 ck-centos7-2 systemd[1]: k3s.service changed start -> failed
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Job k3s.service/start finished, result=failed
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Failed to start Lightweight Kubernetes.
-- Subject: Unit k3s.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit k3s.service has failed.
--
-- The result is failed.
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=18 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=10070 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Unit k3s.service entered failed state.
Nov 20 11:53:27 ck-centos7-2 systemd[1]: k3s.service failed.
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/agent interface=org.freedesktop.systemd1.Agent member=Released cookie=10071 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/k3s_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=19 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/k3s_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=20 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/k3s_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=10072 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/k3s_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=10073 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Got unexpected auxiliary data with level=1 and type=2
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Got unexpected auxiliary data with level=1 and type=2
Nov 20 11:53:27 ck-centos7-2 polkitd[795]: Unregistered Authentication Agent for unix-process:10588:254223 (system bus name :1.43, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/k3s_2eservice interface=org.freedesktop.DBus.Properties member=Get cookie=4 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: SELinux access check scon=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcon=unconfined_u:object_r:systemd_unit_file_t:s0 tclass=service perm=status path=/etc/systemd/system/k3s.service cmdline=systemctl start k3s: 0
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Sent message type=method_return sender=n/a destination=n/a object=n/a interface=n/a member=n/a cookie=21 reply_cookie=4 error=n/a
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Got disconnect on private connection.
Nov 20 11:53:27 ck-centos7-2 systemd[1]: Got message type=signal sender=org.freedesktop.DBus destination=n/a object=/org/freedesktop/DBus interface=org.freedesktop.DBus member=NameOwnerChanged cookie=119 reply_cookie=0 error=n/a
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: E1120 11:53:27.461544   10599 kubelet.go:2183] node "ck-centos7-2" not found
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: E1120 11:53:27.562659   10599 kubelet.go:2183] node "ck-centos7-2" not found
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: E1120 11:53:27.627788   10599 node.go:125] Failed to retrieve node info: nodes "ck-centos7-2" is forbidden: User "system:kube-proxy" cannot get resource "nodes" in API group "" at the cluster scope
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: E1120 11:53:27.662826   10599 kubelet.go:2183] node "ck-centos7-2" not found
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: E1120 11:53:27.763034   10599 kubelet.go:2183] node "ck-centos7-2" not found
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: I1120 11:53:27.813637   10599 reconciler.go:157] Reconciler: start to sync state
Nov 20 11:53:27 ck-centos7-2 k3s[10599]: E1120 11:53:27.863496   10599 kubelet.go:2183] node "ck-centos7-2" not found
Oats87 commented 3 years ago

OK. As far as I can deduce, systemd appears to have an issue where it is not respecting TimeoutStartSec values for Type=notify services. I made a one-line change on a custom build of K3s:

[root@cannon01 k3s]# git diff pkg/cli/server/server.go
diff --git a/pkg/cli/server/server.go b/pkg/cli/server/server.go
index 2d80ca222f..e7aea5d9a7 100644
--- a/pkg/cli/server/server.go
+++ b/pkg/cli/server/server.go
@@ -39,6 +39,7 @@ func Run(app *cli.Context) error {
 }

 func run(app *cli.Context, cfg *cmds.Server) error {
+       systemd.SdNotify(true, "READY=1\n")
        var (
                err error
        )
[root@cannon01 k3s]#

and now K3s starts reliably on multiple CentOS 7.8/7.9 test systems with stock settings. I can only assume that the reason we are hitting this bug now is changes in K8s 1.19 affecting startup time.

brandond commented 3 years ago

It appears that this is related to a commit backported by RHEL: https://github.com/systemd-rhel/rhel-7/commit/273d69011bf2a8abfcef71e33e6b6ae3323dfc34

This change appears to make it so that if any child processes exit (generating SIGCHLD) before we send our startup notification to systemd, systemd will mark the unit as failed and send termination signals to all the remaining child processes. There is also something else going on relating to the service_notify_cgroup_empty_event that's triggered by the sigchld handler, but I haven't bothered to track that down.

I think the --docker flag comes into play because we start forking things sooner when we don't have to wait for containerd to start up. Since we do a fair bit of forking out to subshells during various parts of our startup, we will either need to move the startup notification earlier in startup (which reduces its usefulness) or just switch to type=exec.

Oats87 commented 3 years ago

Summary of findings:

We believe that this change https://github.com/systemd-rhel/rhel-7/commit/273d69011bf2a8abfcef71e33e6b6ae3323dfc34#diff-88a01402a6feb6c8a13f32a718325c7fdbdd4ca47de52b147ff75eaace8e1903R2568 is causing us problems, due to the way the k3s process moves itself around under the various cgroups.

When we initially start, the k3s process lives under /system.slice/k3s.service. We eventually end up living under /systemd/system.slice (i.e. we move). In the case of --docker, this leaves the k3s.service cgroup empty, hence triggering the "cgroup is empty" messages. When we start our own containerd, the various containers continue to live under the k3s.service cgroup, so systemd sees that it is not empty and doesn't nix the main process. Note that we only move when we start the agent -- this is why --disable-agent allows a successful startup.

This only seems to matter when we are calling SdNotify: if we call SdNotify early enough that we haven't yet moved, systemd is happy and transitions us to the running state. If we are too late and have already moved, systemd says the cgroup is empty and kills our process.
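
A quick way to observe the move described above on an affected host (a sketch; the process name and paths assume a default k3s server install):

# immediately after starting the service, and again a few seconds later:
cat /proc/$(pgrep -o -x k3s-server)/cgroup
# what systemd still sees inside the unit's own cgroup:
systemd-cgls /system.slice/k3s.service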

https://github.com/rancher/k3s/blob/master/pkg/daemons/agent/agent.go#L197 is of note

Instead of changing the way we are operating under cgroups (i.e. changing existing behavior), I will open a PR to change Type=notify to Type=exec, and we can revisit using systemd notifications later on.

brandond commented 3 years ago

To be clear - the cgroup move has something to do with the kubelet, which is why running with --disable-agent also works - it prevents the server from starting the kubelet and, in turn, from vacating the cgroup. We only ever used type=notify for server, probably because we ran into similar problems on agents.

Possibly related: https://github.com/rancher/k3s/issues/2502

Oats87 commented 3 years ago

More context: systemd for EL7 appears to be missing the commit/PR that came out of https://github.com/systemd-rhel/rhel-8/commit/3c751b1bfaf734db09256a5631f1f9ce75cf0d35 (which makes sense, because it was introduced after systemd 233).

EL8 is running systemd 239 which includes this: https://github.com/systemd-rhel/rhel-8/commit/3c751b1bfaf734db09256a5631f1f9ce75cf0d35

Oats87 commented 3 years ago

Related: https://github.com/systemd/systemd/issues/6045

Oats87 commented 3 years ago

Filed an upstream issue with Red Hat on this here: https://bugzilla.redhat.com/show_bug.cgi?id=1900877

janeczku commented 3 years ago

I believe that https://github.com/systemd-rhel/rhel-8/commit/3c751b1bfaf734db09256a5631f1f9ce75cf0d35 just coincidentally mitigated an otherwise unsupported behavior of k3s with regards to how it manipulates the cgroup hierarchy for a service managed by systemd.

k3s basically escapes the systemd-managed cgroup by moving the control process from /system.slice/k3s.service to /systemd/system.slice/k3s.service. It's not too surprising that this would confuse systemd and cause weird behaviour such as nixing the control PID or not terminating all processes on shutdown.

This seems to be in conflict with upstream requirements:

If [services] create and manipulate cgroups outside of units that have Delegate=yes set, they violate the access contract for control groups.

As @Oats87 noted above, probably the only reason this didn't affect k3s installations using containerd is that, when running with containerd, the unit's cgroup is not completely empty, since only the k3s process is moved.

The fact that we need to set KillMode=process in the k3s.service configuration also seems to be a workaround for the same problem: the default kill mode (control-group) doesn't work because we are removing the main process from the service's control group.

I wonder why we are setting up the kubelet to move our PID to a different cgroup in the first place (by passing the parameter kubelet-cgroups=/systemd/system.slice). I have not seen this done in any other distro; it seems wrong and causes inconsistency between the cgroup hierarchy of k3s running with and without the agent.

Related issues:

https://github.com/rancher/k3s/issues/2587 https://github.com/rancher/k3s/issues/2583 https://github.com/rancher/k3s/issues/2502

Oats87 commented 3 years ago

Thanks @janeczku. Looking through the prior changes related to this, I don't think we should be setting kubelet-cgroups and runtime-cgroups the way we are.

https://github.com/rancher/k3s/pull/133/commits/602f0d70b446abe3161cbe603dbda0089a8f877e

Setting --kubelet-arg runtime-cgroups=/system.slice/docker.service --kubelet-arg kubelet-cgroups=/system.slice/k3s.service accordingly allows K3s to start without problems on CentOS 7.8 and above (a config-file equivalent is sketched after the workaround steps below).

Workaround:

1. curl https://releases.rancher.com/install-docker/19.03.sh | sh
2. systemctl enable docker (and disable firewalld if that's enabled)
3. sudo setenforce 0
4. sudo curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_EXEC="server --disable traefik --docker --kubelet-arg runtime-cgroups=/system.slice/docker.service --kubelet-arg kubelet-cgroups=/system.slice/k3s.service" sh
5. Add "Wants=docker.service", "After=docker.service" to /etc/systemd/system/k3s.service
6. sudo systemctl daemon-reload
7. sudo systemctl start k3s
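
If the k3s version in use supports the configuration file, the same flags can be kept out of INSTALL_K3S_EXEC (a sketch, assuming /etc/rancher/k3s/config.yaml is honored by the installed release):

# /etc/rancher/k3s/config.yaml
docker: true
disable:
  - traefik
kubelet-arg:
  - "runtime-cgroups=/system.slice/docker.service"
  - "kubelet-cgroups=/system.slice/k3s.service"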
brandond commented 3 years ago

I agree that it seems weird. It might be worth asking @ibuildthecloud what he was doing there and if it's still needed.

Oats87 commented 3 years ago

Update:

We are configuring the kubelet with --cgroup-driver=cgroupfs. Our logic checks whether we're operating under the systemd hierarchy and then sets the corresponding --kubelet-cgroups and --runtime-cgroups flags for the kubelet, on the assumption that we should prepend /systemd to the front of the full path to the cgroup. The issue is that when we prepend /systemd, we end up with a redundant path, i.e. /sys/fs/cgroup/systemd/systemd.

On an Ubuntu 18.04 system, K3s ends up living under

ubuntu@ip-172-31-32-112:/sys/fs/cgroup/systemd/systemd/system.slice$ cat cgroup.procs
1265
ubuntu@ip-172-31-32-112:/sys/fs/cgroup/systemd/systemd/system.slice$ ps 1265
  PID TTY      STAT   TIME COMMAND
 1265 ?        Ssl    0:11 /usr/local/bin/k3s server
ubuntu@ip-172-31-32-112:/sys/fs/cgroup/systemd/systemd/system.slice$

On an EL7 system, K3s ends up in a similar situation, living under the redundant systemd path, but systemd does not "merge" the duplicate systemd into the correct hierarchy the way the systemd on Ubuntu does.

[root@ck-centos7-0 system.slice]# pwd
/sys/fs/cgroup/systemd/systemd/system.slice
[root@ck-centos7-0 system.slice]# cat cgroup.procs
15776
[root@ck-centos7-0 system.slice]# ps 15776
  PID TTY      STAT   TIME COMMAND
15776 ?        Ssl    0:24 /usr/local/bin/k3s server
[root@ck-centos7-0 system.slice]#

Note that I ran both of these tests with a stock K3s configuration which is why you don't see the docker process.

I'm looking into how/why systemd on Ubuntu treats the redundant systemd path differently.

Oats87 commented 3 years ago

After talking to Darren, it turns out that the reason we are trying to prepend /systemd (and even set --kubelet-cgroups and --runtime-cgroups in the first place) is the case where we run in docker, i.e. with k3d. A docker container by default lives under the /docker cgroup path, and from within the container that /docker is not visible to the kubelet, so it constantly tries to check /docker/<cgroup>, which doesn't exist.
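
The situation described above can be seen from inside an ordinary container (a sketch; exact output depends on the docker and cgroup configuration of the host):

docker run --rm alpine cat /proc/self/cgroup
# on a cgroup v1 host this typically prints entries like
#   1:name=systemd:/docker/<container-id>
# i.e. a path that exists on the host but is not what the container sees mounted
# under its own /sys/fs/cgroup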

ShylajaDevadiga commented 3 years ago

Validated on k3s v1.19.5-rc1+k3s1. Backporting to earlier k3s versions will be tracked separately in issue #2687.
Stopping k3s/docker and bringing them back up works as expected.

[centos@ip-172-31-25-35 ~]$ sudo curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_VERSION=v1.19.5-rc1+k3s1 INSTALL_K3S_EXEC="server --disable traefik --docker" sh

[centos@ip-172-31-25-35 ~]$ kubectl get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-172-31-25-35.us-east-2.compute.internal   Ready    master   18s   v1.19.5-rc1+k3s1

[centos@ip-172-31-25-35 ~]$ kubectl get pods -A
NAMESPACE     NAME                                     READY   STATUS    RESTARTS   AGE
kube-system   local-path-provisioner-7ff9579c6-7mq7p   1/1     Running   0          5s
kube-system   metrics-server-7b4f8b595-wxjkq           1/1     Running   0          5s
kube-system   coredns-88dbd9b97-g5v9k                  1/1     Running   0          5s

[centos@ip-172-31-25-35 ~]$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
60b2db61466f        9dd718864ce6        "/metrics-server"        2 minutes ago       Up 2 minutes                            k8s_metrics-server_metrics-server-7b4f8b595-wxjkq_kube-system_1de460de-f6c9-451e-9c97-e054f9cc3248_0
971eb3b5f3ae        e422121c9c5f        "local-path-provisio…"   2 minutes ago       Up 2 minutes                            k8s_local-path-provisioner_local-path-provisioner-7ff9579c6-7mq7p_kube-system_81f54d91-8f93-47bc-8bcf-e3cf7a150a30_0
22bd39a166ec        0a6cfbf7b0b6        "/coredns -conf /etc…"   2 minutes ago       Up 2 minutes                            k8s_coredns_coredns-88dbd9b97-g5v9k_kube-system_99848c93-906c-4444-ab76-0d09813b4f51_0
9a7a7fb1b5fe        rancher/pause:3.1   "/pause"                 2 minutes ago       Up 2 minutes                            k8s_POD_local-path-provisioner-7ff9579c6-7mq7p_kube-system_81f54d91-8f93-47bc-8bcf-e3cf7a150a30_1
847e17224abc        rancher/pause:3.1   "/pause"                 2 minutes ago       Up 2 minutes                            k8s_POD_metrics-server-7b4f8b595-wxjkq_kube-system_1de460de-f6c9-451e-9c97-e054f9cc3248_1
cfde06f1b53d        rancher/pause:3.1   "/pause"                 2 minutes ago       Up 2 minutes                            k8s_POD_coredns-88dbd9b97-g5v9k_kube-system_99848c93-906c-4444-ab76-0d09813b4f51_1

─system.slice
  ├─k3s.service
  │ └─3497 /usr/local/bin/k3s server                           
  ├─docker.service
  │ └─3349 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

[centos@ip-172-31-25-35 ~]$ cat /etc/redhat-release 
CentOS Linux release 7.8.2003 (Core)
discdisk commented 2 years ago

This problem still seems to exist when I use the newest install script downloaded from https://get.k3s.io/

brandond commented 2 years ago

@discdisk can you open a new issue and provide the information requested by the template?