Closed zhiwu88 closed 1 year ago
sealos version
sealos version
0> sealos version {"gitVersion":"4.1.3","gitCommit":"b2ba9705","buildDate":"2022-09-06T06:04:14Z","goVersion":"go1.19","compiler":"gc","platform":"linux/amd64"}
https://github.com/labring/sealos/releases/tag/v4.1.4-rc3 用这个版本试试,如果还有问题 reopen issue 哈
非常感谢,不过问题还没有解决,我这次把所有日志贴一下。系统版本是 CentOS7 (3.10.0-514.el7.x86_64),不知道跟内核是否有关系?
由于是您close的问题,我没有权限打开。
> sealos run labring/kubernetes:v1.25.0 labring/helm:v3.8.2 labring/calico:v3.24.1 --single
2022-12-29T16:54:12 info Start to create a new cluster: master [10.22.4.139], worker [], registry 10.22.4.139
2022-12-29T16:54:12 info Executing pipeline Check in CreateProcessor.
2022-12-29T16:54:12 info checker:hostname [10.22.4.139:22]
2022-12-29T16:54:12 info checker:timeSync [10.22.4.139:22]
2022-12-29T16:54:12 info Executing pipeline PreProcess in CreateProcessor.
2022-12-29T16:54:12 info Executing pipeline RunConfig in CreateProcessor.
2022-12-29T16:54:12 info Executing pipeline MountRootfs in CreateProcessor.
2022-12-29T16:54:13 info Executing pipeline MirrorRegistry in CreateProcessor.
2022-12-29T16:54:14 info Executing pipeline Bootstrap in CreateProcessor
which: no docker in (/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
WARN [2022-12-29 16:54:16] >> Replace disable_apparmor = false to disable_apparmor = true
INFO [2022-12-29 16:54:16] >> check root,port,cri success
Created symlink from /etc/systemd/system/multi-user.target.wants/containerd.service to /etc/systemd/system/containerd.service.
INFO [2022-12-29 16:54:18] >> Health check containerd!
INFO [2022-12-29 16:54:18] >> containerd is running
INFO [2022-12-29 16:54:18] >> init containerd success
Created symlink from /etc/systemd/system/multi-user.target.wants/image-cri-shim.service to /etc/systemd/system/image-cri-shim.service.
INFO [2022-12-29 16:54:18] >> Health check image-cri-shim!
INFO [2022-12-29 16:54:18] >> image-cri-shim is running
INFO [2022-12-29 16:54:18] >> init shim success
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
* Applying /usr/lib/sysctl.d/00-system.conf ...
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
* Applying /usr/lib/sysctl.d/50-default.conf ...
kernel.sysrq = 16
kernel.core_uses_pid = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
* Applying /etc/sysctl.d/90-omnibus-gitlab-kernel.sem.conf ...
kernel.sem = 250 32000 32 262
* Applying /etc/sysctl.d/90-omnibus-gitlab-kernel.shmall.conf ...
kernel.shmall = 4194304
* Applying /etc/sysctl.d/90-omnibus-gitlab-kernel.shmmax.conf ...
kernel.shmmax = 17179869184
* Applying /etc/sysctl.d/90-omnibus-gitlab-net.core.somaxconn.conf ...
net.core.somaxconn = 1024
* Applying /etc/sysctl.d/99-sysctl.conf ...
kernel.panic = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.ip_no_pmtu_disc = 1
net.ipv4.tcp_tw_reuse = 1
vm.swappiness = 1
sysctl: setting key "net.core.somaxconn": Invalid argument
net.core.somaxconn = 655350
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_synack_retries = 3
net.ipv4.ip_local_port_range = 10240 65000
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_syncookies = 1
* Applying /etc/sysctl.d/k8s.conf ...
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.conf.all.rp_filter = 0
* Applying /etc/sysctl.conf ...
kernel.panic = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.ip_no_pmtu_disc = 1
net.ipv4.tcp_tw_reuse = 1
vm.swappiness = 1
sysctl: setting key "net.core.somaxconn": Invalid argument
net.core.somaxconn = 655350
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_synack_retries = 3
net.ipv4.ip_local_port_range = 10240 65000
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 1
INFO [2022-12-29 16:54:19] >> init kube success
INFO [2022-12-29 16:54:19] >> init rootfs success
Created symlink from /etc/systemd/system/multi-user.target.wants/registry.service to /etc/systemd/system/registry.service.
INFO [2022-12-29 16:54:19] >> Health check registry!
INFO [2022-12-29 16:54:19] >> registry is running
INFO [2022-12-29 16:54:19] >> init registry success
2022-12-29T16:54:19 info Executing pipeline Init in CreateProcessor.
2022-12-29T16:54:19 info start to copy kubeadm config to master0
2022-12-29T16:54:19 info start to generate cert and kubeConfig...
2022-12-29T16:54:19 info start to generator cert and copy to masters...
2022-12-29T16:54:20 info apiserver altNames : {map[apiserver.cluster.local:apiserver.cluster.local kubernetes:kubernetes kubernetes.default:kubernetes.default kubernetes.default.svc:kubernetes.default.svc kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local localhost:localhost n011-up.cms.mabc.xxg.abcnode.com:n011-up.cms.mabc.xxg.abcnode.com] map[10.103.97.2:10.103.97.2 10.22.4.139:10.22.4.139 10.96.0.1:10.96.0.1 127.0.0.1:127.0.0.1]}
2022-12-29T16:54:20 info Etcd altnames : {map[localhost:localhost n011-up.cms.mabc.xxg.abcnode.com:n011-up.cms.mabc.xxg.abcnode.com] map[10.22.4.139:10.22.4.139 127.0.0.1:127.0.0.1 ::1:::1]}, commonName : n011-up.cms.mabc.xxg.abcnode.com
2022-12-29T16:54:23 info start to copy etc pki files to masters
2022-12-29T16:54:23 info start to create kubeconfig...
2022-12-29T16:54:24 info start to copy kubeconfig files to masters
2022-12-29T16:54:24 info start to copy static files to masters
2022-12-29T16:54:24 info start to init master0...
2022-12-29T16:54:24 info registry auth in node 10.22.4.139:22
2022-12-29T16:54:24 info domain sealos.hub:10.22.4.139 append success
2022-12-29T16:54:24 info domain apiserver.cluster.local:10.22.4.139 append success
W1229 16:54:24.712927 32206 initconfiguration.go:119] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/run/containerd/containerd.sock". Please update your configuration!
[init] Using Kubernetes version: v1.25.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
W1229 16:54:56.405669 32206 kubeconfig.go:249] a kubeconfig file "/etc/kubernetes/controller-manager.conf" exists already but has an unexpected API Server URL: expected: https://10.22.4.139:6443, got: https://apiserver.cluster.local:6443
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
W1229 16:54:56.604789 32206 kubeconfig.go:249] a kubeconfig file "/etc/kubernetes/scheduler.conf" exists already but has an unexpected API Server URL: expected: https://10.22.4.139:6443, got: https://apiserver.cluster.local:6443
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
2022-12-29T16:56:51 error Applied to cluster error: failed to init init master0 failed, error: exit status 1. Please clean and reinstall
2022-12-29T16:56:51 info
crictl ps -a 有数据么?如果没有,看一下 journalctl -xeu kubelet,看一下 kubelet 的日志。
运行 crictl ps 结果为空。
0> crictl ps -a
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
使用 journalctl -xeu kubelet 看 kubelet 日志内容如下,失败的原因是不是因为 apiserver.cluster.local:6443 没有成功启动?
Jan 02 20:58:24 systemd[1]: Starting kubelet: The Kubernetes Node Agent...
-- Subject: Unit kubelet.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has begun starting up.
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /usr/lib/sysctl.d/00-system.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.bridge.bridge-nf-call-ip6tables = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.bridge.bridge-nf-call-iptables = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.bridge.bridge-nf-call-arptables = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /usr/lib/sysctl.d/50-default.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: kernel.sysrq = 16
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: kernel.core_uses_pid = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.rp_filter = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.rp_filter = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.accept_source_route = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.accept_source_route = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.promote_secondaries = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.promote_secondaries = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: fs.protected_hardlinks = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: fs.protected_symlinks = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /etc/sysctl.d/90-omnibus-gitlab-kernel.sem.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: kernel.sem = 250 32000 32 262
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /etc/sysctl.d/90-omnibus-gitlab-kernel.shmall.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: kernel.shmall = 4194304
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /etc/sysctl.d/90-omnibus-gitlab-kernel.shmmax.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: kernel.shmmax = 17179869184
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /etc/sysctl.d/90-omnibus-gitlab-net.core.somaxconn.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.core.somaxconn = 1024
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /etc/sysctl.d/99-sysctl.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: kernel.panic = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.accept_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.accept_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.send_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.send_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.arp_ignore = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.arp_announce = 2
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.arp_ignore = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.arp_announce = 2
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.ip_no_pmtu_disc = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_tw_reuse = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: vm.swappiness = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: sysctl: setting key "net.core.somaxconn": Invalid argument
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.core.somaxconn = 655350
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_tw_recycle = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_keepalive_time = 1800
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_max_syn_backlog = 8192
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_synack_retries = 3
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.ip_local_port_range = 10240 65000
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_timestamps = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_syncookies = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /etc/sysctl.d/k8s.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.bridge.bridge-nf-call-ip6tables = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.bridge.bridge-nf-call-iptables = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.rp_filter = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: * Applying /etc/sysctl.conf ...
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: kernel.panic = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.accept_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.accept_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.send_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.send_redirects = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.arp_ignore = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.default.arp_announce = 2
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.arp_ignore = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.conf.all.arp_announce = 2
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.ip_no_pmtu_disc = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_tw_reuse = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: vm.swappiness = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: sysctl: setting key "net.core.somaxconn": Invalid argument
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.core.somaxconn = 655350
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_tw_recycle = 1
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_keepalive_time = 1800
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_max_syn_backlog = 8192
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_synack_retries = 3
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.ip_local_port_range = 10240 65000
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_timestamps = 0
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.tcp_syncookies = 1
Jan 02 20:58:24 systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished starting up.
--
-- The start-up result is done.
Jan 02 20:58:24 kubelet-pre-start.sh[36625]: net.ipv4.ip_forward = 1
Jan 02 20:58:24 kubelet[36648]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Jan 02 20:58:24 kubelet[36648]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Jan 02 20:58:24 kubelet[36648]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Jan 02 20:58:24 kubelet[36648]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Jan 02 20:58:24 kubelet[36648]: Flag --runtime-request-timeout has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.612152 36648 server.go:200] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
Jan 02 20:58:24 kubelet[36648]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Jan 02 20:58:24 kubelet[36648]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Jan 02 20:58:24 kubelet[36648]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Jan 02 20:58:24 kubelet[36648]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Jan 02 20:58:24 kubelet[36648]: Flag --runtime-request-timeout has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.616565 36648 server.go:413] "Kubelet version" kubeletVersion="v1.25.0"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.616593 36648 server.go:415] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.616985 36648 server.go:825] "Client rotation is on, will bootstrap in background"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.619350 36648 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.620515 36648 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.635307 36648 server.go:660] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.635832 36648 container_manager_linux.go:262] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.635993 36648 container_manager_linux.go:267] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.636037 36648 topology_manager.go:134] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.636064 36648 container_manager_linux.go:302] "Creating device plugin manager" devicePluginEnabled=true
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.636130 36648 state_mem.go:36] "Initialized new in-memory state store"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.639537 36648 kubelet.go:381] "Attempting to sync node with API server"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.639560 36648 kubelet.go:270] "Adding static pod path" path="/etc/kubernetes/manifests"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.639596 36648 kubelet.go:281] "Adding apiserver pod source"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.639616 36648 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.640116 36648 kuberuntime_manager.go:240] "Container runtime initialized" containerRuntime="containerd" version="v1.6.14" apiVersion="v1"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.640734 36648 server.go:1175] "Started kubelet"
Jan 02 20:58:24 kubelet[36648]: W0102 20:58:24.640830 36648 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.Node: Get "https://apiserver.cluster.local:6443/api/v1/nodes?fieldSelector=metadata.name%3D&limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.640922 36648 server.go:155] "Starting to listen" address="0.0.0.0" port=10250
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.640946 36648 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://apiserver.cluster.local:6443/api/v1/nodes?fieldSelector=metadata.name%3D&limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: W0102 20:58:24.640929 36648 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.Service: Get "https://apiserver.cluster.local:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.641017 36648 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://apiserver.cluster.local:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.641328 36648 event.go:276] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:".17367f52aa95c3ef", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"", UID:"", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"Starting", Message:"Starting kubelet.", Source:v1.EventSource{Component:"kubelet", Host:""}, FirstTimestamp:time.Date(2023, time.January, 2, 20, 58, 24, 640705519, time.Local), LastTimestamp:time.Date(2023, time.January, 2, 20, 58, 24, 640705519, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Post "https://apiserver.cluster.local:6443/api/v1/namespaces/default/events": dial tcp 10.22.4.139:6443: connect: connection refused'(may retry after sleeping)
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.642940 36648 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.643018 36648 volume_manager.go:293] "Starting Kubelet Volume Manager"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.643117 36648 desired_state_of_world_populator.go:149] "Desired state populator starts to run"
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.645175 36648 controller.go:144] failed to ensure lease exists, will retry in 200ms, error: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/?timeout=10s": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.647636 36648 cri_stats_provider.go:452] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.647678 36648 kubelet.go:1317] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
Jan 02 20:58:24 kubelet[36648]: W0102 20:58:24.647701 36648 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.CSIDriver: Get "https://apiserver.cluster.local:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.647768 36648 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: Get "https://apiserver.cluster.local:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.647738 36648 kubelet.go:2373] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.648025 36648 server.go:438] "Adding debug handlers to kubelet server"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.664309 36648 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv4
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.673838 36648 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv6
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.673870 36648 status_manager.go:161] "Starting to sync pod status with apiserver"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.673894 36648 kubelet.go:2010] "Starting kubelet main sync loop"
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.673944 36648 kubelet.go:2034] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
Jan 02 20:58:24 kubelet[36648]: W0102 20:58:24.674495 36648 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.RuntimeClass: Get "https://apiserver.cluster.local:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.674996 36648 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://apiserver.cluster.local:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 10.22.4.139:6443: connect: connection refused
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.701472 36648 cpu_manager.go:213] "Starting CPU manager" policy="none"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.701501 36648 cpu_manager.go:214] "Reconciling" reconcilePeriod="10s"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.701520 36648 state_mem.go:36] "Initialized new in-memory state store"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.701740 36648 state_mem.go:88] "Updated default CPUSet" cpuSet=""
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.701758 36648 state_mem.go:96] "Updated CPUSet assignments" assignments=map[]
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.701766 36648 policy_none.go:49] "None policy: Start"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.702251 36648 memory_manager.go:168] "Starting memorymanager" policy="None"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.702273 36648 state_mem.go:35] "Initializing new in-memory state store"
Jan 02 20:58:24 kubelet[36648]: I0102 20:58:24.702475 36648 state_mem.go:75] "Updated machine memory state"
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.705535 36648 node_container_manager_linux.go:61] "Failed to create cgroup" err="Cannot set property TasksAccounting, or unknown property." cgroupName=[kubepods]
Jan 02 20:58:24 kubelet[36648]: E0102 20:58:24.705554 36648 kubelet.go:1397] "Failed to start ContainerManager" err="Cannot set property TasksAccounting, or unknown property."
Jan 02 20:58:24 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Jan 02 20:58:24 systemd[1]: Unit kubelet.service entered failed state.
Jan 02 20:58:24 systemd[1]: kubelet.service failed.
:61] "Failed to create cgroup" err="Cannot set property TasksAccounting, or unknown property." cgroupName=[kubepods]
好像你的cgroup有点问题
非常感谢! 升级systemd后解决问题。
升级后的systemd版本:
Updated:
systemd.x86_64 0:219-78.el7_9.5
0> crictl ps
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
0d006c2e9c7ad 792ec15461e78 About a minute ago Running calico-apiserver 0 81c746dbc1081 calico-apiserver-79b5664994-lm2w2
3ff4799a7be44 792ec15461e78 About a minute ago Running calico-apiserver 0 a747c17205578 calico-apiserver-79b5664994-h2j45
0856e39b62965 417ab3368bad1 About a minute ago Running csi-node-driver-registrar 0 91931d8918baa csi-node-driver-vc7v7
beb05bf481203 6a8c8f9f60dc6 About a minute ago Running calico-csi 0 91931d8918baa csi-node-driver-vc7v7
43b09ddbd09d3 f9c3c1813269c About a minute ago Running calico-kube-controllers 0 15f696828c92e calico-kube-controllers-85666c5b94-jf4v4
e5a53053b3117 5185b96f0becf About a minute ago Running coredns 1 e3e2992bd2118 coredns-565d847f94-rt7w8
beb6647328948 5185b96f0becf About a minute ago Running coredns 0 eb781025a3ece coredns-565d847f94-dd8kt
00904f6c5a9e3 75392e3500e36 About a minute ago Running calico-node 0 5eacfa67cbdef calico-node-bcszw
08ae4b6ec485b 068eca72ba120 2 minutes ago Running calico-typha 0 5ed513787c1bc calico-typha-69cbbb8d8c-g4n22
3d9083f4661ef 52468087127eb 2 minutes ago Running tigera-operator 0 be518fa0e7f9e tigera-operator-6675dc47f4-9qdqk
7ed72bc449f9f 58a9a0c6d96f2 2 minutes ago Running kube-proxy 0 13c944236401a kube-proxy-vz64x
090ccfb85ac9f a8a176a5d5d69 2 minutes ago Running etcd 1 d2052c8d9d908 etcd-n011-up.cms.mabc.xxg.abcnode.com
98dc638d71829 1a54c86c03a67 2 minutes ago Running kube-controller-manager 1 88b8f29b7a7a1 kube-controller-manager-n011-up.cms.mabc.xxg.abcnode.com
2cf219bb739f9 4d2edfd10d3e3 2 minutes ago Running kube-apiserver 1 afe4f81fcb103 kube-apiserver-n011-up.cms.mabc.xxg.abcnode.com
9d847c0d0f15a bef2cf3115095 2 minutes ago Running kube-scheduler 1 1fe655e352f92 kube-scheduler-n011-up.cms.mabc.xxg.abcnode.com
0> kubectl get node
NAME STATUS ROLES AGE VERSION
n011-up.cms.mabc.xxg.abcnode.com Ready control-plane 2m37s v1.25.0
Detailed description of the problem.
0> sealos run labring/kubernetes:v1.25.0 labring/helm:v3.8.2 labring/calico:v3.24.1 --single 【省略】 [kubeconfig] Using kubeconfig folder "/etc/kubernetes" [kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf" [kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf" W1228 17:46:54.407380 18738 kubeconfig.go:249] a kubeconfig file "/etc/kubernetes/controller-manager.conf" exists already but has an unexpected API Server URL: expected: https://10.22.4.139:6443, got: https://apiserver.cluster.local:6443 [kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf" W1228 17:46:54.810546 18738 kubeconfig.go:249] a kubeconfig file "/etc/kubernetes/scheduler.conf" exists already but has an unexpected API Server URL: expected: https://10.22.4.139:6443, got: https://apiserver.cluster.local:6443 [kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf" [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [kubelet-start] Starting the kubelet [control-plane] Using manifest folder "/etc/kubernetes/manifests" [control-plane] Creating static Pod manifest for "kube-apiserver" [control-plane] Creating static Pod manifest for "kube-controller-manager" [control-plane] Creating static Pod manifest for "kube-scheduler" [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests" [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s [kubelet-check] Initial timeout of 40s passed. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused. 
[kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred: timed out waiting for the condition
This error is likely caused by:
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
Additionally, a control plane component may have crashed or exited when started by the container runtime. To troubleshoot, list all containers using your preferred container runtimes CLI. Here is one example how you may list all running Kubernetes containers by using crictl:
监听端口查询没有 6443 端口 0> netstat -lnpt Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:53 0.0.0.0: LISTEN 31548/dnsmasq tcp 0 0 0.0.0.0:22 0.0.0.0: LISTEN 1452/sshd tcp 0 0 127.0.0.1:19234 0.0.0.0: LISTEN 18299/containerd tcp6 0 0 :::53 ::: LISTEN 31548/dnsmasq tcp6 0 0 :::22 ::: LISTEN 1452/sshd tcp6 0 0 :::5000 ::: LISTEN 18654/registry tcp6 0 0 :::5001 :::* LISTEN 18654/registry
Any reference materials you have seen.
No response