kubesphere / kubekey

Install Kubernetes/K3s only, both Kubernetes/K3s and KubeSphere, and related cloud-native add-ons, it supports all-in-one, multi-node, and HA 🔥 ⎈ 🐳
https://kubesphere.io
Apache License 2.0

install k8s 1.23 failed #900

Open slzzz opened 2 years ago

slzzz commented 2 years ago

Which version of KubeKey has the issue?

1.17.3

What is your os environment?

CentOS 7.6

KubeKey config file

No response

A clear and concise description of what happened.

images 404

17:22:00 CST [PullModule] Start to pull images on all nodes
17:22:00 CST message: [node1] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:00 CST message: [node2] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:02 CST message: [node2] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:02 CST retry: [node2]
17:22:02 CST message: [node1] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:02 CST retry: [node1]
17:22:07 CST message: [node2] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:07 CST message: [node1] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:08 CST message: [node2] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:08 CST retry: [node2]
17:22:08 CST message: [node1] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:08 CST retry: [node1]
17:22:13 CST message: [node2] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:13 CST message: [node1] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:14 CST message: [node2] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:14 CST retry: [node2]
17:22:14 CST message: [node1] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:14 CST retry: [node1]
17:22:14 CST failed: [node2]
17:22:14 CST failed: [node1]
error: Pipeline[CreateClusterPipeline] execute failed: Module[PullModule] exec failed:
failed: [node2] [PullImages] exec failed after 3 retires: pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
failed: [node1] [PullImages] exec failed after 3 retires: pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
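
The log above repeats the same failure many times, so it can help to reduce it to "which image failed on which node". The following is a minimal Python sketch of that idea. The regular expression is an assumption inferred from the lines in this one log, not a documented KubeKey log format, and `failed_images` is a hypothetical helper name.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Assumed pattern: inferred from the "pull image failed" lines in this log only;
# it is not a documented KubeKey log format.
FAILED_PULL = re.compile(
    r'\[(?P<node>[^\]]+)\] pull image failed: .*?docker pull (?P<image>[^\s"]+)"'
)


def failed_images(log_text: str) -> Dict[str, Set[str]]:
    """Map each image that failed to pull to the nodes that reported the failure."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for match in FAILED_PULL.finditer(log_text):
        result[match.group("image")].add(match.group("node"))
    return dict(result)


# One failing line copied from the log above.
sample = (
    '17:22:02 CST message: [node2] pull image failed: Failed to exec command: '
    'sudo -E /bin/bash -c "env PATH=$PATH docker pull '
    'registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" '
    'Error response from daemon: manifest for '
    'registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found'
)

for image, nodes in failed_images(sample).items():
    print(image, sorted(nodes))
```

In this log every failure points at the same image, pause:3.6, on both nodes, which is more consistent with the image being absent from the registry than with a problem on a single node.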

Relevant log output

No response

Additional information

No response

pixiake commented 2 years ago

The images have been synchronized; you can try again.

registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver:v1.23.0
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager:v1.23.0
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler:v1.23.0
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy:v1.23.0
slzzz commented 2 years ago

Thanks, it ran successfully. Why is Docker still present in version 1.23?

slzzz commented 2 years ago

Thanks, it ran successfully. Why is Docker still present in version 1.23? The runtime is still Docker:

System Info:
  Machine ID: 24c86563b2ff45a6abafe73bb089e42e
  System UUID: 7C954085-22E5-8044-8539-C376191E5676
  Boot ID: 98f4fe25-e670-412b-8535-1a462b6b41db
  Kernel Version: 3.10.0-1062.el7.x86_64
  OS Image: CentOS Linux 7 (Core)
  Operating System: linux
  Architecture: amd64
  Container Runtime Version: docker://20.10.8

pixiake commented 2 years ago

https://kubernetes.io/blog/2021/11/12/are-you-ready-for-dockershim-removal/

If you want to use containerd as the runtime, you can specify --container-manager containerd.

slzzz commented 2 years ago

Do I need to manually modify the kubelet startup parameters? Will subsequent versions automate this?

24sama commented 2 years ago

  1. Delete the current cluster: ./kk delete cluster -f config.yaml
  2. Download the latest master branch source code and build it. (We fixed a command-line flag bug yesterday.)
  3. Create the new cluster with the runtime you want: ./kk create cluster -f config.yaml --container-manager containerd
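
The delete-then-create sequence above can be sketched as a small Python helper that only builds and prints the two commands, so they can be reviewed before anything is run. This is an illustrative sketch: `recreate_commands` is a hypothetical name, the default `config.yaml` path is taken from the steps above, and the build step (step 2) is deliberately left out because the thread does not give a build command.

```python
import shlex
from typing import List


def recreate_commands(
    config: str = "config.yaml", container_manager: str = "containerd"
) -> List[List[str]]:
    """Return the kk invocations from the steps above, in order, as argv lists."""
    return [
        # Step 1: remove the existing cluster described by the config file.
        ["./kk", "delete", "cluster", "-f", config],
        # Step 3: recreate it, selecting the container runtime explicitly.
        ["./kk", "create", "cluster", "-f", config,
         "--container-manager", container_manager],
    ]


# Print the commands instead of executing them; deleting a cluster is destructive.
for argv in recreate_commands():
    print(shlex.join(argv))
```

Printing rather than executing is a deliberate choice here, since the first command destroys the running cluster.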
slzzz commented 2 years ago

Creating the cluster failed:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
    timed out waiting for the condition

This error is likely caused by:
    - The kubelet is not running
    - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
    - 'systemctl status kubelet'
    - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.

Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
    - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
    Once you have found the failing container, you can inspect its logs with:
    - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1

slzzz commented 2 years ago

22643 server.go:205] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory

24sama commented 2 years ago

It looks like there are some residual files on your host. I tried again on a clean host and everything worked. Could you please install it again on a clean host, as I did?

chaunceyjiang commented 2 years ago

Hi, please check your containerd log. I ran into the same issue. If you find the error '"apparmor_parser": executable file not found in $PATH' in your log file, please install apparmor-parser.
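
A quick way to test for the condition described above is to check whether the binary resolves on PATH. The sketch below uses Python's standard `shutil.which`; `missing_tools` is a hypothetical helper name. It has to run on the affected node, and it assumes the PATH of the shell running it is comparable to the PATH containerd sees, which may not hold for a systemd service.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("apparmor_parser",)) -> List[str]:
    """Return the tools from the given list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("not found on PATH:", ", ".join(missing))
else:
    print("all required tools found")
```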

slzzz commented 2 years ago

Thanks. I cleaned up containerd and reinstalled it, and the installation succeeded.

FeynmanZhou commented 2 years ago

Hi @chaunceyjiang @24sama ,

I just downloaded KubeKey v2.0.0-alpha.3 and tried to install Kubernetes 1.23 with it, but it returns the following error:

22:00:57 CST message: [LocalHost]
No SHA256 found for kubeadm. v1.23 is not supported.
22:00:57 CST retry: [LocalHost]
22:00:57 CST failed: [LocalHost]
error: Pipeline[CreateClusterPipeline] execute failed: Module[NodeBinariesModule] exec failed:
failed: [LocalHost] [DownloadBinaries] exec failed after 1 retires: No SHA256 found for kubeadm. v1.23 is not supported.

How can I deal with this?

FeynmanZhou commented 2 years ago

Resolved: the specified K8s version should be v1.23.0 instead of v1.23.
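
The error message suggests that the version string has to name an exact release, including the patch number. A check like the following Python sketch could catch the short form before running the installer. It is a hypothetical pre-flight helper, not part of KubeKey, and the accepted format (`v<major>.<minor>.<patch>`) is an assumption based only on the two version strings in this thread.

```python
import re

# Assumed format: "v" followed by three dot-separated numbers, e.g. v1.23.0.
FULL_VERSION = re.compile(r"^v\d+\.\d+\.\d+$")


def is_full_version(version: str) -> bool:
    """Return True if the version string includes major, minor, and patch parts."""
    return FULL_VERSION.match(version) is not None


for candidate in ("v1.23", "v1.23.0"):
    status = "ok" if is_full_version(candidate) else "missing patch version"
    print(candidate, "->", status)
```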