sealerio / sealer

Build, Share and Run Both Your Kubernetes Cluster and Distributed Applications (Project under CNCF)
http://sealer.cool
Apache License 2.0

Failed to deploy a single node cluster on ali-cloud ECS, "Cannot connect to the Docker daemon" #1703

Open cFireworks opened 2 years ago

cFireworks commented 2 years ago

I ran into the same problem. On an Alibaba Cloud ECS server running Ubuntu 18.04, I want to install a single-node Kubernetes cluster on the server itself, so I have tried both the server's public IP and the local IP 127.0.0.1 as the master IP.

OS: Ubuntu 18.04
Linux kernel: 4.15.0-192-generic
sealer version: {"gitVersion":"v0.8.6","gitCommit":"884513e","buildDate":"2022-07-12 02:58:54","goVersion":"go1.16.15","compiler":"gc","platform":"linux/amd64"}

After running sealer run -m 127.0.0.1 -p xxx, the following output appears:


++ dirname ./init-registry.sh
+ cd .
+ REGISTRY_PORT=5000
+ VOLUME=/var/lib/sealer/data/my-cluster/rootfs/registry
+ REGISTRY_DOMAIN=sea.hub
+ container=sealer-registry
+++ pwd
++ dirname /var/lib/sealer/data/my-cluster/rootfs/scripts
+ rootfs=/var/lib/sealer/data/my-cluster/rootfs
+ config=/var/lib/sealer/data/my-cluster/rootfs/etc/registry_config.yml
+ htpasswd=/var/lib/sealer/data/my-cluster/rootfs/etc/registry_htpasswd
+ certs_dir=/var/lib/sealer/data/my-cluster/rootfs/certs
+ image_dir=/var/lib/sealer/data/my-cluster/rootfs/images
+ mkdir -p /var/lib/sealer/data/my-cluster/rootfs/registry
+ load_images
+ for image in "$image_dir"/*
+ '[' -f /var/lib/sealer/data/my-cluster/rootfs/images/registry.tar ']'
+ docker load -q -i /var/lib/sealer/data/my-cluster/rootfs/images/registry.tar
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
...
2022-09-13 16:04:20 [ERROR] [root.go:70] sealer-v0.8.6: failed to init master0: failed to execute command(cd /var/lib/sealer/data/my-cluster/rootfs/scripts && ./init-registry.sh 5000 /var/lib/sealer/data/my-cluster/rootfs/registry sea.hub) on host(127.0.0.1): error(Process exited with status 1)

When installing with the server's public IP as the master IP (after uninstalling the Docker I had installed myself), the following output appears:

++ dirname ./init-registry.sh
+ cd .
+ REGISTRY_PORT=5000
+ VOLUME=/var/lib/sealer/data/my-cluster/rootfs/registry
+ REGISTRY_DOMAIN=sea.hub
+ container=sealer-registry
+++ pwd
++ dirname /var/lib/sealer/data/my-cluster/rootfs/scripts
+ rootfs=/var/lib/sealer/data/my-cluster/rootfs
+ config=/var/lib/sealer/data/my-cluster/rootfs/etc/registry_config.yml
+ htpasswd=/var/lib/sealer/data/my-cluster/rootfs/etc/registry_htpasswd
+ certs_dir=/var/lib/sealer/data/my-cluster/rootfs/certs
+ image_dir=/var/lib/sealer/data/my-cluster/rootfs/images
+ mkdir -p /var/lib/sealer/data/my-cluster/rootfs/registry
+ load_images
+ for image in "$image_dir"/*
+ '[' -f /var/lib/sealer/data/my-cluster/rootfs/images/registry.tar ']'
+ docker load -q -i /var/lib/sealer/data/my-cluster/rootfs/images/registry.tar
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
2022-09-13 17:05:00 [DEBUG] [sshcmd.go:114] failed to execute command(cd /var/lib/sealer/data/my-cluster/rootfs/scripts && ./init-registry.sh 5000 /var/lib/sealer/data/my-cluster/rootfs/registry sea.hub) on host(xxx): error(failed to execute command(cd /var/lib/sealer/data/my-cluster/rootfs/scripts && ./init-registry.sh 5000 /var/lib/sealer/data/my-cluster/rootfs/registry sea.hub) on host(xxx): error(Process exited with status 1))

Checking the Docker service status shows the following:

root@iZbp14czvx1exbfxgr520cZ:/var/run/docker# systemctl status docker
● docker.service
   Loaded: masked (/dev/null; bad)
   Active: inactive (dead) since Tue 2022-09-13 15:56:09 CST; 1h 26min ago
 Main PID: 8991 (code=exited, status=0/SUCCESS)

Sep 13 15:52:40 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:52:40.178732070+08:00" level=info msg="Attempting next endpoint for pull after error: Get https://sea.hub:5000/v2/: net/http: req
Sep 13 15:52:40 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:52:40.178813138+08:00" level=error msg="Handler for POST /v1.40/images/create returned error: Get https://sea.hub:5000/v2/: net/h
Sep 13 15:52:55 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:52:55.209106031+08:00" level=warning msg="Error getting v2 registry: Get https://sea.hub:5000/v2/: net/http: request canceled whi
Sep 13 15:52:55 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:52:55.209148840+08:00" level=info msg="Attempting next endpoint for pull after error: Get https://sea.hub:5000/v2/: net/http: req
Sep 13 15:52:55 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:52:55.209185375+08:00" level=error msg="Handler for POST /v1.40/images/create returned error: Get https://sea.hub:5000/v2/: net/h
Sep 13 15:56:09 iZbp14czvx1exbfxgr520cZ systemd[1]: Stopping Docker Application Container Engine...
Sep 13 15:56:09 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:56:09.123579156+08:00" level=info msg="Processing signal 'terminated'"
Sep 13 15:56:09 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:56:09.464433975+08:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.Task
Sep 13 15:56:09 iZbp14czvx1exbfxgr520cZ dockerd[8991]: time="2022-09-13T15:56:09.507618852+08:00" level=info msg="Daemon shutdown complete"
Sep 13 15:56:09 iZbp14czvx1exbfxgr520cZ systemd[1]: Stopped Docker Application Container Engine.
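
Note the Loaded: masked (/dev/null; bad) line above: a masked systemd unit refuses to start until it is unmasked. A minimal sketch for clearing that state (assuming docker.service itself is otherwise intact):

# Unmask the unit so systemd is allowed to start it again
systemctl unmask docker
systemctl daemon-reload
systemctl restart docker
# Confirm the daemon answers on /var/run/docker.sock
docker info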

Trying to restart the Docker service also produces an error similar to the one reported earlier in this thread, systemd[5169]: docker.service: Failed at step EXEC spawning /usr/sbin/iptables: No such file or directory. The full status is as follows:

Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ dockerd[4962]: time="2022-09-13T17:23:44.154681261+08:00" level=info msg="API listen on /var/run/docker.sock"
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ systemd[5169]: docker.service: Failed to execute command: No such file or directory
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ systemd[5169]: docker.service: Failed at step EXEC spawning /usr/sbin/iptables: No such file or directory
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ systemd[1]: docker.service: Control process exited, code=exited status=203
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ dockerd[4962]: time="2022-09-13T17:23:44.159999418+08:00" level=info msg="Processing signal 'terminated'"
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ dockerd[4962]: time="2022-09-13T17:23:44.424053918+08:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.Task
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ dockerd[4962]: time="2022-09-13T17:23:44.454106725+08:00" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd name
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ dockerd[4962]: time="2022-09-13T17:23:44.454531797+08:00" level=info msg="Daemon shutdown complete"
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ systemd[1]: docker.service: Failed with result 'exit-code'.
Sep 13 17:23:44 iZbp14czvx1exbfxgr520cZ systemd[1]: Failed to start Docker Application Container Engine.
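
The status=203 and "Failed at step EXEC" lines above mean systemd could not even spawn /usr/sbin/iptables; the binary is simply not at the path the unit tries to execute. A generic diagnostic sketch for locating it:

# Where does the shell actually find iptables?
command -v iptables
# Does the path the unit expects exist?
ls -l /usr/sbin/iptables /sbin/iptables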

Originally posted by @cFireworks in https://github.com/sealerio/sealer/issues/1657#issuecomment-1245073966

kakaZhou719 commented 2 years ago

@cFireworks, please run ln -sf /sbin/iptables /usr/sbin/iptables and run again; we will enhance the logic here in the future.
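
Spelled out as a sketch (assuming iptables lives at /sbin/iptables, the usual location on Ubuntu 18.04):

# Link iptables to the path docker.service tries to spawn
ln -sf /sbin/iptables /usr/sbin/iptables
# Restart Docker and confirm it comes up cleanly
systemctl restart docker
systemctl is-active docker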

cFireworks commented 2 years ago

Thank you, this works, but another problem occurred when I ran it again. It seems the CNI is not ready; what can I do to solve this problem?

2022-09-14 12:12:52 [INFO] [init.go:259] start to init master0...
W0914 12:52:53.049306    6789 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubelet.config.k8s.io", Version:"v1beta1", Kind:"KubeletConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "shutdownGracePeriod"
W0914 12:52:53.116282    6789 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.8
[preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
    [WARNING FileExisting-ebtables]: ebtables not found in system path
    [WARNING FileExisting-socat]: socat not found in system path
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
W0914 12:53:41.278905    6789 kubeconfig.go:242] a kubeconfig file "/etc/kubernetes/controller-manager.conf" exists already but has an unexpected API Server URL: expected: https://47.98.112.245:6443, got: https://apiserver.cluster.local:6443
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
W0914 12:53:41.339108    6789 kubeconfig.go:242] a kubeconfig file "/etc/kubernetes/scheduler.conf" exists already but has an unexpected API Server URL: expected: https://47.98.112.245:6443, got: https://apiserver.cluster.local:6443
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

    Unfortunately, an error has occurred:
        timed out waiting for the condition

    This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.

    Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

2022-09-14 12:12:57 [ERROR] [root.go:70] sealer-v0.8.6: failed to init master0: failed to init master0: [ssh][xx.xx.xx.xx]run command failed [kubeadm init --config=/var/lib/sealer/data/my-cluster/rootfs/etc/kubeadm.yml --upload-certs -v 0 --ignore-preflight-errors=SystemVerification]. Please clean and reinstall

The kubelet service status:

● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Wed 2022-09-14 12:53:41 CST; 8min ago
     Docs: http://kubernetes.io/docs/
  Process: 7309 ExecStartPre=/usr/bin/kubelet-pre-start.sh (code=exited, status=0/SUCCESS)
 Main PID: 7344 (kubelet)
    Tasks: 13 (limit: 2211)
   CGroup: /system.slice/kubelet.service
           └─7344 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=sea.hub:5000/pause:3.2

Sep 14 13:02:39 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:39.863882    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:39 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:39.863905    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:39 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:39.863910    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:39 iZbp14czvx1exbfxgr520cZ kubelet[7344]: E0914 13:02:39.868585    7344 kubelet.go:2134] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Sep 14 13:02:39 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:39.946896    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:40 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:40.061811    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:40 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:40.863958    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:40 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:40.863962    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:40 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:40.946911    7344 kubelet.go:449] kubelet nodes not sync
Sep 14 13:02:41 iZbp14czvx1exbfxgr520cZ kubelet[7344]: I0914 13:02:41.061857    7344 kubelet.go:449] kubelet nodes not sync
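
The repeated "cni config uninitialized" message above means the kubelet cannot find any CNI network configuration, so the node never becomes Ready. Generic first checks (not sealer-specific; the paths are the kubelet defaults):

# The kubelet looks for a CNI network config here by default
ls -l /etc/cni/net.d/
# ...and for CNI plugin binaries here
ls -l /opt/cni/bin/
# If the API server is reachable, check whether the network add-on pods ever started
kubectl -n kube-system get pods -o wide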