oomichi / try-kubernetes

Try k8s v1.14.0 #79

Closed oomichi closed 5 years ago

oomichi commented 5 years ago

Trying out the newly released k8s v1.14.0.

Summary of steps

# apt-get update && apt-get install -y apt-transport-https
# curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
# add-apt-repository ppa:gluster/glusterfs-4.1
# apt-get update
# apt-get install -y docker.io nfs-common glusterfs-client
# apt-get install -y kubelet kubeadm kubectl
# kubeadm init --pod-network-cidr=10.244.0.0/16

oomichi commented 5 years ago

Initialization of the master node failed, as shown below.

# kubeadm init --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
...
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

oomichi commented 5 years ago

The cause is that kubelet is not running.

# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since Thu 2019-03-28 18:22:02 UTC; 970ms ago
     Docs: https://kubernetes.io/docs/home/
  Process: 6492 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
 Main PID: 6492 (code=exited, status=255)

Mar 28 18:22:02 k8s-master systemd[1]: kubelet.service: Unit entered failed state.
Mar 28 18:22:02 k8s-master systemd[1]: kubelet.service: Failed with result 'exit-code'.

oomichi commented 5 years ago

The cause is recorded in syslog: the Docker API is too old.

Mar 28 18:15:29 k8s-master kubelet[5653]: I0328 18:15:29.305622    5653 container_manager_linux.go:286] Creating device plugin manager: true
Mar 28 18:15:29 k8s-master kubelet[5653]: I0328 18:15:29.305723    5653 state_mem.go:36] [cpumanager] initializing new in-memory state store
Mar 28 18:15:29 k8s-master kubelet[5653]: I0328 18:15:29.313132    5653 client.go:75] Connecting to docker on unix:///var/run/docker.sock
Mar 28 18:15:29 k8s-master kubelet[5653]: I0328 18:15:29.313159    5653 client.go:104] Start docker client with request timeout=2m0s
Mar 28 18:15:29 k8s-master kubelet[5653]: F0328 18:15:29.314653    5653 server.go:265] failed to run Kubelet: failed to create kubelet: docker API version is older than 1.26.0
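
The fatal line above is easy to miss among the informational messages. A minimal sketch of a filter for it, assuming the syslog-style prefix shown in this log and the klog convention that fatal messages start with "F" followed by the month and day; the function name kubelet_fatal_lines is hypothetical:

```shell
#!/bin/sh
# Print only kubelet lines logged at klog fatal level, i.e. whose message
# starts with "F" followed by a four-digit MMDD stamp. Reads stdin.
kubelet_fatal_lines() {
    grep -E 'kubelet\[[0-9]+\]: F[0-9]{4} '
}

# Example (assumes a systemd host; not executed here):
#   journalctl -u kubelet --no-pager | kubelet_fatal_lines | tail -n 1
```

The filter only narrows the log; it does not interpret the message, so the last fatal line still has to be read by hand.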

As described in https://github.com/kubernetes/kubernetes/issues/74658#issuecomment-467971101, the base image needs to be switched to Ubuntu 18.04.

oomichi commented 5 years ago

In # echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list, I tried changing xenial to bionic (the codename for 18.04), which produced:

Err:6 https://packages.cloud.google.com/apt kubernetes-bionic Release
  404  Not Found [IP: 172.217.0.46 443]

The update fails because kubernetes-bionic is not yet available under https://packages.cloud.google.com/apt/dists. I will keep xenial and try again.
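
Whether a distribution is published can be checked before editing sources.list. A rough sketch, assuming the standard apt layout of dists/<name>/Release under the repository URL from the 404 message; the helper names release_url and dist_exists are hypothetical:

```shell
#!/bin/sh
# Build the Release-file URL for a distribution name, using the repository
# base URL that appears in the apt error message.
release_url() {
    printf 'https://packages.cloud.google.com/apt/dists/%s/Release\n' "$1"
}

# Succeed only if the repository answers a HEAD request for the Release file.
# -f makes curl return non-zero on HTTP errors such as 404.
dist_exists() {
    curl -fsSI -o /dev/null "$(release_url "$1")"
}

# Example (needs network access; not executed here):
#   dist_exists kubernetes-bionic || echo "kubernetes-bionic is not published"
```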

oomichi commented 5 years ago

apt-get install -y kubelet kubeadm kubectl kubernetes-cni1 failed with an error saying the kubernetes-cni1 package does not exist. I will drop it for now and continue.

oomichi commented 5 years ago

Even after switching to Ubuntu 18.04, the same problem occurs.

Mar 28 19:02:22 k8s-master kubelet[8712]: F0328 19:02:22.604603    8712 server.go:265] failed to run Kubelet: failed to create kubelet: docker API version is older than 1.26.0

Installing the docker.io package raises the docker version. I am not sure why that matters, but I will try installing it before running kubeadm.

# docker -v
Docker version 1.11.2, build b9f10c9
# sudo apt-get install docker.io
...
# docker -v
Docker version 18.09.2, build 6247962

After replacing docker-engine with docker.io, kubeadm started working.
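
The Docker API version can be checked before running kubeadm. A minimal sketch, assuming GNU sort (for -V) is available; the minimum of 1.26 comes from the kubelet error message in this thread, and the names version_ge and MIN_DOCKER_API are hypothetical:

```shell
#!/bin/sh
# Return 0 if dotted version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum taken from "docker API version is older than 1.26.0".
MIN_DOCKER_API=1.26

# Example (requires a running Docker daemon; not executed here):
#   api=$(docker version --format '{{.Server.APIVersion}}')
#   version_ge "$api" "$MIN_DOCKER_API" || echo "Docker API $api is too old"
```

Version sort is used instead of a plain string or float comparison so that, for example, 1.9 is treated as older than 1.26.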

oomichi commented 5 years ago

For now kubeadm ran through on both nodes, but neither becomes Ready.

$ kubectl get nodes
NAME         STATUS     ROLES    AGE   VERSION
k8s-cpu01    NotReady   <none>   29s   v1.14.0
k8s-master   NotReady   master   13m   v1.14.0
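
With more nodes, the same check can be scripted. A small sketch that reads the table printed by kubectl get nodes and lists the nodes that are not Ready, assuming the default column order shown above; the function name not_ready_nodes is hypothetical:

```shell
#!/bin/sh
# Read `kubectl get nodes` table output on stdin and print the names of
# nodes whose STATUS column is not exactly "Ready". The header row is skipped.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example (requires kubectl and a kubeconfig; not executed here):
#   kubectl get nodes | not_ready_nodes
```

Because the match is exact, a combined status such as Ready,SchedulingDisabled would also be listed.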

oomichi commented 5 years ago

The following errors appear in syslog in large numbers.

Mar 28 19:34:47 k8s-master kubelet[10209]: W0328 19:34:47.793078   10209 cni.go:213] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 28 19:34:48 k8s-master kubelet[10209]: E0328 19:34:48.643688   10209 kubelet.go:2170] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

I had forgotten to set up flannel...

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
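
The "No networks found in /etc/cni/net.d" warning can be checked for directly on a node. A minimal sketch, assuming the network plugin writes its config into that directory with one of the extensions CNI loads (.conf, .conflist, .json); the function name cni_configured is hypothetical:

```shell
#!/bin/sh
# Return 0 if the directory holds at least one CNI network config file.
# Defaults to /etc/cni/net.d, the path named in the kubelet warning.
cni_configured() {
    dir=${1:-/etc/cni/net.d}
    for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
        # An unmatched glob stays literal, so test that the path exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example (run on a node; not executed here):
#   cni_configured || echo "no CNI config yet; apply a network add-on first"
```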

oomichi commented 5 years ago

Both nodes are now Ready.

$ kubectl get nodes
NAME         STATUS   ROLES    AGE     VERSION
k8s-cpu01    Ready    <none>   6m51s   v1.14.0
k8s-master   Ready    master   20m     v1.14.0