rancher / rancher

Complete container management platform
http://rancher.com
Apache License 2.0

HA deployment failed #21926

Closed. laizhouzhang closed this issue 5 years ago.

laizhouzhang commented 5 years ago

I used rke up --config rancher-cluster.yml to deploy the cluster.

etcd logs:

2019-08-01 11:17:54.301389 I | embed: rejected connection from "192.168.3.2:60362" (error "tls: failed to verify client's certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kube-ca\")", ServerName "")

2019-08-02 02:51:59.585064 W | rafthttp: health check for peer 534fd57dd2179fd0 could not connect: remote error: tls: bad certificate (prober "ROUND_TRIPPER_SNAPSHOT")
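
This error means the peer presented a certificate that was not signed by the kube-ca this etcd member trusts, which usually points to nodes carrying certificates from different rke up runs. A quick way to compare, assuming the default RKE certificate paths under /etc/kubernetes/ssl (run on each node; the fingerprints should match across all nodes):

# Fingerprint of the CA this node trusts
openssl x509 -noout -fingerprint -sha256 -in /etc/kubernetes/ssl/kube-ca.pem

# Verify the node's own etcd certificate chains back to that CA
# (the file name contains the node's address, e.g. kube-etcd-192-168-3-2.pem)
openssl verify -CAfile /etc/kubernetes/ssl/kube-ca.pem /etc/kubernetes/ssl/kube-etcd-<node-ip>.pem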

webwiebe commented 5 years ago

I'm getting the same error message. I'll try to provide some context:

As I'm new to Rancher, I'm following the HA setup guide. The nodes are provisioned with Ubuntu 16.04, starting from a clean slate.

rke version: rke version v0.2.6

lsb_release -a:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:    16.04
Codename:   xenial

docker --version: Docker version 19.03.1, build 74b1e89e8a

cluster config:

nodes:
  - address: xxx.xxx.xxx.138
    user: docker
    role: [controlplane,worker,etcd]
  - address: xxx.xxx.xxx.143
    user: docker
    role: [controlplane,worker,etcd]
  - address: xxx.xxx.xxx.146
    user: docker
    role: [controlplane,worker,etcd]

services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h

Output of the rke up command:

$ rke up
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [xxx.xxx.xxx.146]
INFO[0000] [dialer] Setup tunnel for host [xxx.xxx.xxx.138]
INFO[0000] [dialer] Setup tunnel for host [xxx.xxx.xxx.143]
INFO[0000] [state] Pulling image [rancher/rke-tools:v0.1.34] on host [xxx.xxx.xxx.138]
INFO[0007] [state] Successfully pulled image [rancher/rke-tools:v0.1.34] on host [xxx.xxx.xxx.138]
INFO[0007] [state] Successfully started [cluster-state-deployer] container on host [xxx.xxx.xxx.138]
INFO[0007] [state] Pulling image [rancher/rke-tools:v0.1.34] on host [xxx.xxx.xxx.143]
INFO[0014] [state] Successfully pulled image [rancher/rke-tools:v0.1.34] on host [xxx.xxx.xxx.143]
INFO[0014] [state] Successfully started [cluster-state-deployer] container on host [xxx.xxx.xxx.143]
INFO[0015] [state] Pulling image [rancher/rke-tools:v0.1.34] on host [xxx.xxx.xxx.146]
INFO[0022] [state] Successfully pulled image [rancher/rke-tools:v0.1.34] on host [xxx.xxx.xxx.146]
INFO[0022] [state] Successfully started [cluster-state-deployer] container on host [xxx.xxx.xxx.146]
INFO[0022] [certificates] Generating CA kubernetes certificates
INFO[0023] [certificates] Generating Kubernetes API server aggregation layer requestheader client CA certificates
INFO[0023] [certificates] Generating Kube Proxy certificates
INFO[0023] [certificates] Generating admin certificates and kubeconfig
INFO[0023] [certificates] Generating Kubernetes API server proxy client certificates
INFO[0023] [certificates] Generating etcd-xxx.xxx.xxx.138 certificate and key
INFO[0023] [certificates] Generating etcd-xxx.xxx.xxx.143 certificate and key
INFO[0023] [certificates] Generating etcd-xxx.xxx.xxx.146 certificate and key
INFO[0023] [certificates] Generating Kubernetes API server certificates
INFO[0024] [certificates] Generating Service account token key
INFO[0024] [certificates] Generating Kube Controller certificates
INFO[0024] [certificates] Generating Kube Scheduler certificates
INFO[0024] [certificates] Generating Node certificate
INFO[0024] Successfully Deployed state file at [./cluster.rkestate]
INFO[0024] Building Kubernetes cluster
INFO[0024] [dialer] Setup tunnel for host [xxx.xxx.xxx.146]
INFO[0024] [dialer] Setup tunnel for host [xxx.xxx.xxx.143]
INFO[0024] [dialer] Setup tunnel for host [xxx.xxx.xxx.138]
INFO[0024] [network] Deploying port listener containers
INFO[0025] [network] Successfully started [rke-etcd-port-listener] container on host [xxx.xxx.xxx.143]
INFO[0025] [network] Successfully started [rke-etcd-port-listener] container on host [xxx.xxx.xxx.146]
INFO[0025] [network] Successfully started [rke-etcd-port-listener] container on host [xxx.xxx.xxx.138]
INFO[0026] [network] Successfully started [rke-cp-port-listener] container on host [xxx.xxx.xxx.146]
INFO[0026] [network] Successfully started [rke-cp-port-listener] container on host [xxx.xxx.xxx.143]
INFO[0026] [network] Successfully started [rke-cp-port-listener] container on host [xxx.xxx.xxx.138]
INFO[0026] [network] Successfully started [rke-worker-port-listener] container on host [xxx.xxx.xxx.146]
INFO[0026] [network] Successfully started [rke-worker-port-listener] container on host [xxx.xxx.xxx.143]
INFO[0027] [network] Successfully started [rke-worker-port-listener] container on host [xxx.xxx.xxx.138]
INFO[0027] [network] Port listener containers deployed successfully
INFO[0027] [network] Running etcd <-> etcd port checks
INFO[0027] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.143]
INFO[0027] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.146]
INFO[0027] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.138]
INFO[0027] [network] Running control plane -> etcd port checks
INFO[0028] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.143]
INFO[0028] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.146]
INFO[0028] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.138]
INFO[0028] [network] Running control plane -> worker port checks
INFO[0029] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.143]
INFO[0029] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.138]
INFO[0029] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.146]
INFO[0029] [network] Running workers -> control plane port checks
INFO[0029] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.143]
INFO[0029] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.146]
INFO[0029] [network] Successfully started [rke-port-checker] container on host [xxx.xxx.xxx.138]
INFO[0030] [network] Checking KubeAPI port Control Plane hosts
INFO[0030] [network] Removing port listener containers
INFO[0030] [remove/rke-etcd-port-listener] Successfully removed container on host [xxx.xxx.xxx.138]
INFO[0030] [remove/rke-etcd-port-listener] Successfully removed container on host [xxx.xxx.xxx.143]
INFO[0030] [remove/rke-etcd-port-listener] Successfully removed container on host [xxx.xxx.xxx.146]
INFO[0030] [remove/rke-cp-port-listener] Successfully removed container on host [xxx.xxx.xxx.143]
INFO[0030] [remove/rke-cp-port-listener] Successfully removed container on host [xxx.xxx.xxx.146]
INFO[0031] [remove/rke-cp-port-listener] Successfully removed container on host [xxx.xxx.xxx.138]
INFO[0031] [remove/rke-worker-port-listener] Successfully removed container on host [xxx.xxx.xxx.146]
INFO[0031] [remove/rke-worker-port-listener] Successfully removed container on host [xxx.xxx.xxx.143]
INFO[0031] [remove/rke-worker-port-listener] Successfully removed container on host [xxx.xxx.xxx.138]
INFO[0031] [network] Port listener containers removed successfully
INFO[0031] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0037] [reconcile] Rebuilding and updating local kube config
INFO[0037] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0037] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0037] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0037] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0037] [reconcile] Reconciling cluster state
INFO[0037] [reconcile] This is newly generated cluster
INFO[0037] Pre-pulling kubernetes images
INFO[0037] [pre-deploy] Pulling image [rancher/hyperkube:v1.14.3-rancher1] on host [xxx.xxx.xxx.143]
INFO[0037] [pre-deploy] Pulling image [rancher/hyperkube:v1.14.3-rancher1] on host [xxx.xxx.xxx.146]
INFO[0037] [pre-deploy] Pulling image [rancher/hyperkube:v1.14.3-rancher1] on host [xxx.xxx.xxx.138]
INFO[0068] [pre-deploy] Successfully pulled image [rancher/hyperkube:v1.14.3-rancher1] on host [xxx.xxx.xxx.143]
INFO[0068] [pre-deploy] Successfully pulled image [rancher/hyperkube:v1.14.3-rancher1] on host [xxx.xxx.xxx.146]
INFO[0070] [pre-deploy] Successfully pulled image [rancher/hyperkube:v1.14.3-rancher1] on host [xxx.xxx.xxx.138]
INFO[0070] Kubernetes images pulled successfully
INFO[0070] [etcd] Building up etcd plane..
INFO[0070] [etcd] Pulling image [rancher/coreos-etcd:v3.3.10-rancher1] on host [xxx.xxx.xxx.138]
INFO[0074] [etcd] Successfully pulled image [rancher/coreos-etcd:v3.3.10-rancher1] on host [xxx.xxx.xxx.138]
INFO[0075] [etcd] Successfully started [etcd] container on host [xxx.xxx.xxx.138]
INFO[0075] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [xxx.xxx.xxx.138]
INFO[0075] [etcd] Successfully started [etcd-rolling-snapshots] container on host [xxx.xxx.xxx.138]
INFO[0081] [certificates] Successfully started [rke-bundle-cert] container on host [xxx.xxx.xxx.138]
INFO[0081] Waiting for [rke-bundle-cert] container to exit on host [xxx.xxx.xxx.138]
INFO[0081] Container [rke-bundle-cert] is still running on host [xxx.xxx.xxx.138]
INFO[0082] Waiting for [rke-bundle-cert] container to exit on host [xxx.xxx.xxx.138]
INFO[0082] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [xxx.xxx.xxx.138]
INFO[0083] [etcd] Successfully started [rke-log-linker] container on host [xxx.xxx.xxx.138]
INFO[0083] [remove/rke-log-linker] Successfully removed container on host [xxx.xxx.xxx.138]
INFO[0083] [etcd] Pulling image [rancher/coreos-etcd:v3.3.10-rancher1] on host [xxx.xxx.xxx.143]
INFO[0087] [etcd] Successfully pulled image [rancher/coreos-etcd:v3.3.10-rancher1] on host [xxx.xxx.xxx.143]
INFO[0087] [etcd] Successfully started [etcd] container on host [xxx.xxx.xxx.143]
INFO[0087] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [xxx.xxx.xxx.143]
INFO[0088] [etcd] Successfully started [etcd-rolling-snapshots] container on host [xxx.xxx.xxx.143]
INFO[0094] [certificates] Successfully started [rke-bundle-cert] container on host [xxx.xxx.xxx.143]
INFO[0094] Waiting for [rke-bundle-cert] container to exit on host [xxx.xxx.xxx.143]
INFO[0094] Container [rke-bundle-cert] is still running on host [xxx.xxx.xxx.143]
INFO[0095] Waiting for [rke-bundle-cert] container to exit on host [xxx.xxx.xxx.143]
INFO[0095] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [xxx.xxx.xxx.143]
INFO[0096] [etcd] Successfully started [rke-log-linker] container on host [xxx.xxx.xxx.143]
INFO[0096] [remove/rke-log-linker] Successfully removed container on host [xxx.xxx.xxx.143]
INFO[0096] [etcd] Pulling image [rancher/coreos-etcd:v3.3.10-rancher1] on host [xxx.xxx.xxx.146]
INFO[0099] [etcd] Successfully pulled image [rancher/coreos-etcd:v3.3.10-rancher1] on host [xxx.xxx.xxx.146]
INFO[0100] [etcd] Successfully started [etcd] container on host [xxx.xxx.xxx.146]
INFO[0100] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [xxx.xxx.xxx.146]
INFO[0101] [etcd] Successfully started [etcd-rolling-snapshots] container on host [xxx.xxx.xxx.146]
INFO[0106] [certificates] Successfully started [rke-bundle-cert] container on host [xxx.xxx.xxx.146]
INFO[0106] Waiting for [rke-bundle-cert] container to exit on host [xxx.xxx.xxx.146]
INFO[0106] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [xxx.xxx.xxx.146]
INFO[0107] [etcd] Successfully started [rke-log-linker] container on host [xxx.xxx.xxx.146]
INFO[0107] [remove/rke-log-linker] Successfully removed container on host [xxx.xxx.xxx.146]
INFO[0107] [etcd] Successfully started etcd plane.. Checking etcd cluster health
FATA[0184] [etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy
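
rke gives up here after its etcd health check times out. It can help to query each member's health endpoint directly; a minimal sketch, assuming curl is available on the nodes and the default RKE certificate paths (substitute the node's own address in the certificate file names):

# Ask the local etcd member for its health status (client-cert-auth is enabled, so a client cert is required)
curl --cacert /etc/kubernetes/ssl/kube-ca.pem \
     --cert /etc/kubernetes/ssl/kube-etcd-<node-ip>.pem \
     --key /etc/kubernetes/ssl/kube-etcd-<node-ip>-key.pem \
     https://127.0.0.1:2379/health

A healthy member answers {"health": "true"}; a member that cannot reach quorum will not.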

cluster.rkestate:

{
  "desiredState": {
    "rkeConfig": {
      "nodes": [
        {
          "address": "xxx.xxx.xxx.138",
          "port": "22",
          "internalAddress": "xxx.xxx.xxx.138",
          "role": [
            "controlplane",
            "worker",
            "etcd"
          ],
          "hostnameOverride": "xxx.xxx.xxx.138",
          "user": "docker",
          "sshKeyPath": "~/.ssh/id_rsa"
        },
        {
          "address": "xxx.xxx.xxx.143",
          "port": "22",
          "internalAddress": "xxx.xxx.xxx.143",
          "role": [
            "controlplane",
            "worker",
            "etcd"
          ],
          "hostnameOverride": "xxx.xxx.xxx.143",
          "user": "docker",
          "sshKeyPath": "~/.ssh/id_rsa"
        },
        {
          "address": "xxx.xxx.xxx.146",
          "port": "22",
          "internalAddress": "xxx.xxx.xxx.146",
          "role": [
            "controlplane",
            "worker",
            "etcd"
          ],
          "hostnameOverride": "xxx.xxx.xxx.146",
          "user": "docker",
          "sshKeyPath": "~/.ssh/id_rsa"
        }
      ],
      "services": {
        "etcd": {
          "image": "rancher/coreos-etcd:v3.3.10-rancher1",
          "extraArgs": {
            "election-timeout": "5000",
            "heartbeat-interval": "500"
          },
          "snapshot": true,
          "retention": "24h",
          "creation": "6h"
        },
        "kubeApi": {
          "image": "rancher/hyperkube:v1.14.3-rancher1",
          "serviceClusterIpRange": "10.43.0.0/16",
          "serviceNodePortRange": "30000-32767"
        },
        "kubeController": {
          "image": "rancher/hyperkube:v1.14.3-rancher1",
          "clusterCidr": "10.42.0.0/16",
          "serviceClusterIpRange": "10.43.0.0/16"
        },
        "scheduler": {
          "image": "rancher/hyperkube:v1.14.3-rancher1"
        },
        "kubelet": {
          "image": "rancher/hyperkube:v1.14.3-rancher1",
          "clusterDomain": "cluster.local",
          "infraContainerImage": "rancher/pause:3.1",
          "clusterDnsServer": "10.43.0.10"
        },
        "kubeproxy": {
          "image": "rancher/hyperkube:v1.14.3-rancher1"
        }
      },
      "network": {
        "plugin": "canal",
        "options": {
          "canal_flannel_backend_type": "vxlan"
        }
      },
      "authentication": {
        "strategy": "x509"
      },
      "systemImages": {
        "etcd": "rancher/coreos-etcd:v3.3.10-rancher1",
        "alpine": "rancher/rke-tools:v0.1.34",
        "nginxProxy": "rancher/rke-tools:v0.1.34",
        "certDownloader": "rancher/rke-tools:v0.1.34",
        "kubernetesServicesSidecar": "rancher/rke-tools:v0.1.34",
        "kubedns": "rancher/k8s-dns-kube-dns:1.15.0",
        "dnsmasq": "rancher/k8s-dns-dnsmasq-nanny:1.15.0",
        "kubednsSidecar": "rancher/k8s-dns-sidecar:1.15.0",
        "kubednsAutoscaler": "rancher/cluster-proportional-autoscaler:1.3.0",
        "coredns": "rancher/coredns-coredns:1.3.1",
        "corednsAutoscaler": "rancher/cluster-proportional-autoscaler:1.3.0",
        "kubernetes": "rancher/hyperkube:v1.14.3-rancher1",
        "flannel": "rancher/coreos-flannel:v0.10.0-rancher1",
        "flannelCni": "rancher/flannel-cni:v0.3.0-rancher1",
        "calicoNode": "rancher/calico-node:v3.4.0",
        "calicoCni": "rancher/calico-cni:v3.4.0",
        "calicoCtl": "rancher/calico-ctl:v2.0.0",
        "canalNode": "rancher/calico-node:v3.4.0",
        "canalCni": "rancher/calico-cni:v3.4.0",
        "canalFlannel": "rancher/coreos-flannel:v0.10.0",
        "weaveNode": "weaveworks/weave-kube:2.5.0",
        "weaveCni": "weaveworks/weave-npc:2.5.0",
        "podInfraContainer": "rancher/pause:3.1",
        "ingress": "rancher/nginx-ingress-controller:0.21.0-rancher3",
        "ingressBackend": "rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1",
        "metricsServer": "rancher/metrics-server:v0.3.1"
      },
      "sshKeyPath": "~/.ssh/id_rsa",
      "sshAgentAuth": false,
      "authorization": {
        "mode": "rbac"
      },
      "ignoreDockerVersion": false,
      "kubernetesVersion": "v1.14.3-rancher1-1",
      "ingress": {
        "provider": "nginx"
      },
      "clusterName": "local",
      "cloudProvider": {},
      "prefixPath": "/",
      "addonJobTimeout": 30,
      "bastionHost": {},
      "monitoring": {
        "provider": "metrics-server"
      },
      "restore": {},
      "dns": {
        "provider": "coredns"
      }
    },
    "certificatesBundle": {
      "kube-admin": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    api-version: v1\n    certificate-authority-data: xxxSNIPxxx\n    server: \"https://xxx.xxx.xxx.138:6443\"\n  name: \"local\"\ncontexts:\n- context:\n    cluster: \"local\"\n    user: \"kube-admin-local\"\n  name: \"local\"\ncurrent-context: \"local\"\nusers:\n- name: \"kube-admin-local\"\n  user:\n    client-certificate-data: xxxSNIPxxx=\n    client-key-data: xxxSNIPxxx=",
        "name": "kube-admin",
        "commonName": "kube-admin",
        "ouName": "system:masters",
        "envName": "KUBE_ADMIN",
        "path": "/etc/kubernetes/ssl/kube-admin.pem",
        "keyEnvName": "KUBE_ADMIN_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-admin-key.pem",
        "configEnvName": "KUBECFG_KUBE_ADMIN",
        "configPath": "./kube_config_cluster.yml"
      },
      "kube-apiserver": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "",
        "name": "kube-apiserver",
        "commonName": "system:kube-apiserver",
        "ouName": "",
        "envName": "KUBE_APISERVER",
        "path": "/etc/kubernetes/ssl/kube-apiserver.pem",
        "keyEnvName": "KUBE_APISERVER_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-apiserver-key.pem",
        "configEnvName": "",
        "configPath": ""
      },
      "kube-apiserver-proxy-client": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    api-version: v1\n    certificate-authority: /etc/kubernetes/ssl/kube-ca.pem\n    server: \"https://127.0.0.1:6443\"\n  name: \"local\"\ncontexts:\n- context:\n    cluster: \"local\"\n    user: \"kube-apiserver-proxy-client-local\"\n  name: \"local\"\ncurrent-context: \"local\"\nusers:\n- name: \"kube-apiserver-proxy-client-local\"\n  user:\n    client-certificate: /etc/kubernetes/ssl/kube-apiserver-proxy-client.pem\n    client-key: /etc/kubernetes/ssl/kube-apiserver-proxy-client-key.pem",
        "name": "kube-apiserver-proxy-client",
        "commonName": "system:kube-apiserver-proxy-client",
        "ouName": "",
        "envName": "KUBE_APISERVER_PROXY_CLIENT",
        "path": "/etc/kubernetes/ssl/kube-apiserver-proxy-client.pem",
        "keyEnvName": "KUBE_APISERVER_PROXY_CLIENT_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-apiserver-proxy-client-key.pem",
        "configEnvName": "KUBECFG_KUBE_APISERVER_PROXY_CLIENT",
        "configPath": "/etc/kubernetes/ssl/kubecfg-kube-apiserver-proxy-client.yaml"
      },
      "kube-apiserver-requestheader-ca": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    api-version: v1\n    certificate-authority: /etc/kubernetes/ssl/kube-ca.pem\n    server: \"https://127.0.0.1:6443\"\n  name: \"local\"\ncontexts:\n- context:\n    cluster: \"local\"\n    user: \"kube-apiserver-requestheader-ca-local\"\n  name: \"local\"\ncurrent-context: \"local\"\nusers:\n- name: \"kube-apiserver-requestheader-ca-local\"\n  user:\n    client-certificate: /etc/kubernetes/ssl/kube-apiserver-requestheader-ca.pem\n    client-key: /etc/kubernetes/ssl/kube-apiserver-requestheader-ca-key.pem",
        "name": "kube-apiserver-requestheader-ca",
        "commonName": "system:kube-apiserver-requestheader-ca",
        "ouName": "",
        "envName": "KUBE_APISERVER_REQUESTHEADER_CA",
        "path": "/etc/kubernetes/ssl/kube-apiserver-requestheader-ca.pem",
        "keyEnvName": "KUBE_APISERVER_REQUESTHEADER_CA_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-apiserver-requestheader-ca-key.pem",
        "configEnvName": "KUBECFG_KUBE_APISERVER_REQUESTHEADER_CA",
        "configPath": "/etc/kubernetes/ssl/kubecfg-kube-apiserver-requestheader-ca.yaml"
      },
      "kube-ca": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "",
        "name": "kube-ca",
        "commonName": "system:kube-ca",
        "ouName": "",
        "envName": "KUBE_CA",
        "path": "/etc/kubernetes/ssl/kube-ca.pem",
        "keyEnvName": "KUBE_CA_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-ca-key.pem",
        "configEnvName": "",
        "configPath": ""
      },
      "kube-controller-manager": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    api-version: v1\n    certificate-authority: /etc/kubernetes/ssl/kube-ca.pem\n    server: \"https://127.0.0.1:6443\"\n  name: \"local\"\ncontexts:\n- context:\n    cluster: \"local\"\n    user: \"kube-controller-manager-local\"\n  name: \"local\"\ncurrent-context: \"local\"\nusers:\n- name: \"kube-controller-manager-local\"\n  user:\n    client-certificate: /etc/kubernetes/ssl/kube-controller-manager.pem\n    client-key: /etc/kubernetes/ssl/kube-controller-manager-key.pem",
        "name": "kube-controller-manager",
        "commonName": "system:kube-controller-manager",
        "ouName": "",
        "envName": "KUBE_CONTROLLER_MANAGER",
        "path": "/etc/kubernetes/ssl/kube-controller-manager.pem",
        "keyEnvName": "KUBE_CONTROLLER_MANAGER_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-controller-manager-key.pem",
        "configEnvName": "KUBECFG_KUBE_CONTROLLER_MANAGER",
        "configPath": "/etc/kubernetes/ssl/kubecfg-kube-controller-manager.yaml"
      },
      "kube-etcd-185-110-173-138": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "",
        "name": "kube-etcd-185-110-173-138",
        "commonName": "system:kube-etcd-185-110-173-138",
        "ouName": "",
        "envName": "KUBE_ETCD_185_110_173_138",
        "path": "/etc/kubernetes/ssl/kube-etcd-185-110-173-138.pem",
        "keyEnvName": "KUBE_ETCD_185_110_173_138_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-etcd-185-110-173-138-key.pem",
        "configEnvName": "",
        "configPath": ""
      },
      "kube-etcd-185-110-173-143": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "",
        "name": "kube-etcd-185-110-173-143",
        "commonName": "system:kube-etcd-185-110-173-143",
        "ouName": "",
        "envName": "KUBE_ETCD_185_110_173_143",
        "path": "/etc/kubernetes/ssl/kube-etcd-185-110-173-143.pem",
        "keyEnvName": "KUBE_ETCD_185_110_173_143_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-etcd-185-110-173-143-key.pem",
        "configEnvName": "",
        "configPath": ""
      },
      "kube-etcd-185-110-173-146": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "",
        "name": "kube-etcd-185-110-173-146",
        "commonName": "system:kube-etcd-185-110-173-146",
        "ouName": "",
        "envName": "KUBE_ETCD_185_110_173_146",
        "path": "/etc/kubernetes/ssl/kube-etcd-185-110-173-146.pem",
        "keyEnvName": "KUBE_ETCD_185_110_173_146_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-etcd-185-110-173-146-key.pem",
        "configEnvName": "",
        "configPath": ""
      },
      "kube-node": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    api-version: v1\n    certificate-authority: /etc/kubernetes/ssl/kube-ca.pem\n    server: \"https://127.0.0.1:6443\"\n  name: \"local\"\ncontexts:\n- context:\n    cluster: \"local\"\n    user: \"kube-node-local\"\n  name: \"local\"\ncurrent-context: \"local\"\nusers:\n- name: \"kube-node-local\"\n  user:\n    client-certificate: /etc/kubernetes/ssl/kube-node.pem\n    client-key: /etc/kubernetes/ssl/kube-node-key.pem",
        "name": "kube-node",
        "commonName": "system:node",
        "ouName": "system:nodes",
        "envName": "KUBE_NODE",
        "path": "/etc/kubernetes/ssl/kube-node.pem",
        "keyEnvName": "KUBE_NODE_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-node-key.pem",
        "configEnvName": "KUBECFG_KUBE_NODE",
        "configPath": "/etc/kubernetes/ssl/kubecfg-kube-node.yaml"
      },
      "kube-proxy": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    api-version: v1\n    certificate-authority: /etc/kubernetes/ssl/kube-ca.pem\n    server: \"https://127.0.0.1:6443\"\n  name: \"local\"\ncontexts:\n- context:\n    cluster: \"local\"\n    user: \"kube-proxy-local\"\n  name: \"local\"\ncurrent-context: \"local\"\nusers:\n- name: \"kube-proxy-local\"\n  user:\n    client-certificate: /etc/kubernetes/ssl/kube-proxy.pem\n    client-key: /etc/kubernetes/ssl/kube-proxy-key.pem",
        "name": "kube-proxy",
        "commonName": "system:kube-proxy",
        "ouName": "",
        "envName": "KUBE_PROXY",
        "path": "/etc/kubernetes/ssl/kube-proxy.pem",
        "keyEnvName": "KUBE_PROXY_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-proxy-key.pem",
        "configEnvName": "KUBECFG_KUBE_PROXY",
        "configPath": "/etc/kubernetes/ssl/kubecfg-kube-proxy.yaml"
      },
      "kube-scheduler": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    api-version: v1\n    certificate-authority: /etc/kubernetes/ssl/kube-ca.pem\n    server: \"https://127.0.0.1:6443\"\n  name: \"local\"\ncontexts:\n- context:\n    cluster: \"local\"\n    user: \"kube-scheduler-local\"\n  name: \"local\"\ncurrent-context: \"local\"\nusers:\n- name: \"kube-scheduler-local\"\n  user:\n    client-certificate: /etc/kubernetes/ssl/kube-scheduler.pem\n    client-key: /etc/kubernetes/ssl/kube-scheduler-key.pem",
        "name": "kube-scheduler",
        "commonName": "system:kube-scheduler",
        "ouName": "",
        "envName": "KUBE_SCHEDULER",
        "path": "/etc/kubernetes/ssl/kube-scheduler.pem",
        "keyEnvName": "KUBE_SCHEDULER_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-scheduler-key.pem",
        "configEnvName": "KUBECFG_KUBE_SCHEDULER",
        "configPath": "/etc/kubernetes/ssl/kubecfg-kube-scheduler.yaml"
      },
      "kube-service-account-token": {
        "certificatePEM": "-----BEGIN CERTIFICATE-----xxxSNIPxxx---",
        "keyPEM": "-----BEGIN RSA PRIVATE KEY-----xxxSNIPxxx---",
        "config": "",
        "name": "kube-service-account-token",
        "commonName": "kube-service-account-token",
        "ouName": "",
        "envName": "KUBE_SERVICE_ACCOUNT_TOKEN",
        "path": "/etc/kubernetes/ssl/kube-service-account-token.pem",
        "keyEnvName": "KUBE_SERVICE_ACCOUNT_TOKEN_KEY",
        "keyPath": "/etc/kubernetes/ssl/kube-service-account-token-key.pem",
        "configEnvName": "",
        "configPath": ""
      }
    }
  },
  "currentState": {}
}
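
Note that desiredState holds a freshly generated kube-ca while currentState is empty, so rke expects a clean cluster. If a node still has certificates from an earlier run under /etc/kubernetes/ssl, they will not match this CA. A quick comparison, assuming jq and openssl are available (a sketch against the state file shown above):

# Fingerprint of the CA rke just generated and stored in the state file
jq -r '.desiredState.certificatesBundle["kube-ca"].certificatePEM' cluster.rkestate | openssl x509 -noout -fingerprint -sha256

# Fingerprint of the CA actually deployed on a node
openssl x509 -noout -fingerprint -sha256 -in /etc/kubernetes/ssl/kube-ca.pem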

etcd container logs for xxx.xxx.xxx.138:


2019-08-02 08:11:01.765719 W | pkg/flags: unrecognized environment variable ETCD_UNSUPPORTED_ARCH=x86_64
2019-08-02 08:11:01.765861 I | etcdmain: etcd Version: 3.3.10
2019-08-02 08:11:01.765871 I | etcdmain: Git SHA: 27fc7e2
2019-08-02 08:11:01.765876 I | etcdmain: Go Version: go1.10.4
2019-08-02 08:11:01.765880 I | etcdmain: Go OS/Arch: linux/amd64
2019-08-02 08:11:01.765886 I | etcdmain: setting maximum number of CPUs to 3, total number of available CPUs is 3
2019-08-02 08:11:01.766011 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2019-08-02 08:11:01.766063 I | embed: peerTLS: cert = /etc/kubernetes/ssl/kube-etcd-185-110-173-138.pem, key = /etc/kubernetes/ssl/kube-etcd-185-110-173-138-key.pem, ca = , trusted-ca = /etc/kubernetes/ssl/kube-ca.pem, client-cert-auth = true, crl-file =
2019-08-02 08:11:01.767299 I | embed: listening for peers on https://0.0.0.0:2380
2019-08-02 08:11:01.767360 I | embed: listening for client requests on 0.0.0.0:2379
2019-08-02 08:11:01.769053 I | etcdserver: name = etcd-xxx.xxx.xxx.138
2019-08-02 08:11:01.769076 I | etcdserver: data dir = /var/lib/rancher/etcd/
2019-08-02 08:11:01.769088 I | etcdserver: member dir = /var/lib/rancher/etcd/member
2019-08-02 08:11:01.769094 I | etcdserver: heartbeat = 500ms
2019-08-02 08:11:01.769099 I | etcdserver: election = 5000ms
2019-08-02 08:11:01.769104 I | etcdserver: snapshot count = 100000
2019-08-02 08:11:01.769117 I | etcdserver: advertise client URLs = https://xxx.xxx.xxx.138:2379,https://xxx.xxx.xxx.138:4001
2019-08-02 08:11:01.770118 I | etcdserver: restarting member 5c0a849577d100ed in cluster 7d426ced5096f303 at commit index 3
2019-08-02 08:11:01.770210 I | raft: 5c0a849577d100ed became follower at term 55
2019-08-02 08:11:01.770303 I | raft: newRaft 5c0a849577d100ed [peers: [], term: 55, commit: 3, applied: 0, lastindex: 3, lastterm: 1]
2019-08-02 08:11:01.775966 W | auth: simple token is not cryptographically signed
2019-08-02 08:11:01.779238 I | etcdserver: starting server... [version: 3.3.10, cluster version: to_be_decided]
2019-08-02 08:11:01.782220 I | embed: ClientTLS: cert = /etc/kubernetes/ssl/kube-etcd-185-110-173-138.pem, key = /etc/kubernetes/ssl/kube-etcd-185-110-173-138-key.pem, ca = , trusted-ca = /etc/kubernetes/ssl/kube-ca.pem, client-cert-auth = true, crl-file =
2019-08-02 08:11:01.782928 I | etcdserver/membership: added member 54df41104aeddfae [https://xxx.xxx.xxx.143:2380] to cluster 7d426ced5096f303
2019-08-02 08:11:01.782959 I | rafthttp: starting peer 54df41104aeddfae...
2019-08-02 08:11:01.783060 I | rafthttp: started HTTP pipelining with peer 54df41104aeddfae
2019-08-02 08:11:01.784427 I | rafthttp: started streaming with peer 54df41104aeddfae (writer)
2019-08-02 08:11:01.784719 I | rafthttp: started streaming with peer 54df41104aeddfae (writer)
2019-08-02 08:11:01.785262 I | rafthttp: started peer 54df41104aeddfae
2019-08-02 08:11:01.785286 I | rafthttp: added peer 54df41104aeddfae
2019-08-02 08:11:01.785313 I | rafthttp: started streaming with peer 54df41104aeddfae (stream Message reader)
2019-08-02 08:11:01.785333 I | rafthttp: started streaming with peer 54df41104aeddfae (stream MsgApp v2 reader)
2019-08-02 08:11:01.785663 I | etcdserver/membership: added member 5c0a849577d100ed [https://xxx.xxx.xxx.138:2380] to cluster 7d426ced5096f303
2019-08-02 08:11:01.786094 I | etcdserver/membership: added member 76267e9078cb09f8 [https://xxx.xxx.xxx.146:2380] to cluster 7d426ced5096f303
2019-08-02 08:11:01.786114 I | rafthttp: starting peer 76267e9078cb09f8...
2019-08-02 08:11:01.786138 I | rafthttp: started HTTP pipelining with peer 76267e9078cb09f8
2019-08-02 08:11:01.787437 I | rafthttp: started peer 76267e9078cb09f8
2019-08-02 08:11:01.787539 I | rafthttp: added peer 76267e9078cb09f8
2019-08-02 08:11:01.788012 I | rafthttp: started streaming with peer 76267e9078cb09f8 (stream Message reader)
2019-08-02 08:11:01.788288 I | rafthttp: started streaming with peer 76267e9078cb09f8 (writer)
2019-08-02 08:11:01.788382 I | rafthttp: started streaming with peer 76267e9078cb09f8 (stream MsgApp v2 reader)
2019-08-02 08:11:01.788548 I | rafthttp: started streaming with peer 76267e9078cb09f8 (writer)
2019-08-02 08:11:06.785565 W | rafthttp: health check for peer 54df41104aeddfae could not connect: dial tcp xxx.xxx.xxx.143:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:06.785901 W | rafthttp: health check for peer 54df41104aeddfae could not connect: dial tcp xxx.xxx.xxx.143:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:06.788208 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:06.788236 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:09.770966 I | raft: 5c0a849577d100ed is starting a new election at term 55
2019-08-02 08:11:09.771092 I | raft: 5c0a849577d100ed became candidate at term 56
2019-08-02 08:11:09.771161 I | raft: 5c0a849577d100ed received MsgVoteResp from 5c0a849577d100ed at term 56
2019-08-02 08:11:09.771192 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 54df41104aeddfae at term 56
2019-08-02 08:11:09.771212 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 76267e9078cb09f8 at term 56
2019-08-02 08:11:11.786141 W | rafthttp: health check for peer 54df41104aeddfae could not connect: dial tcp xxx.xxx.xxx.143:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:11.786277 W | rafthttp: health check for peer 54df41104aeddfae could not connect: dial tcp xxx.xxx.xxx.143:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:11.788347 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:11.788548 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:16.270831 I | raft: 5c0a849577d100ed is starting a new election at term 56
2019-08-02 08:11:16.270936 I | raft: 5c0a849577d100ed became candidate at term 57
2019-08-02 08:11:16.270973 I | raft: 5c0a849577d100ed received MsgVoteResp from 5c0a849577d100ed at term 57
2019-08-02 08:11:16.270996 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 76267e9078cb09f8 at term 57
2019-08-02 08:11:16.271011 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 54df41104aeddfae at term 57
2019-08-02 08:11:16.782896 E | etcdserver: publish error: etcdserver: request timed out
2019-08-02 08:11:16.786496 W | rafthttp: health check for peer 54df41104aeddfae could not connect: dial tcp xxx.xxx.xxx.143:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:16.786733 W | rafthttp: health check for peer 54df41104aeddfae could not connect: dial tcp xxx.xxx.xxx.143:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:16.788705 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:16.789022 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:21.787006 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:21.787182 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:21.789063 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:21.789248 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:24.270793 I | raft: 5c0a849577d100ed is starting a new election at term 57
2019-08-02 08:11:24.271205 I | raft: 5c0a849577d100ed became candidate at term 58
2019-08-02 08:11:24.271325 I | raft: 5c0a849577d100ed received MsgVoteResp from 5c0a849577d100ed at term 58
2019-08-02 08:11:24.271467 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 54df41104aeddfae at term 58
2019-08-02 08:11:24.271634 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 76267e9078cb09f8 at term 58
2019-08-02 08:11:26.787306 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:26.787797 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:26.789281 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:26.789641 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: dial tcp xxx.xxx.xxx.146:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:29.770767 I | raft: 5c0a849577d100ed is starting a new election at term 58
2019-08-02 08:11:29.770851 I | raft: 5c0a849577d100ed became candidate at term 59
2019-08-02 08:11:29.770888 I | raft: 5c0a849577d100ed received MsgVoteResp from 5c0a849577d100ed at term 59
2019-08-02 08:11:29.771235 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 54df41104aeddfae at term 59
2019-08-02 08:11:29.771263 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 76267e9078cb09f8 at term 59
2019-08-02 08:11:31.783221 E | etcdserver: publish error: etcdserver: request timed out
2019-08-02 08:11:31.788224 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:31.788480 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:31.789507 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:31.789863 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:36.788823 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:36.789045 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:36.790158 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:36.790510 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:38.270756 I | raft: 5c0a849577d100ed is starting a new election at term 59
2019-08-02 08:11:38.270862 I | raft: 5c0a849577d100ed became candidate at term 60
2019-08-02 08:11:38.270934 I | raft: 5c0a849577d100ed received MsgVoteResp from 5c0a849577d100ed at term 60
2019-08-02 08:11:38.270967 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 76267e9078cb09f8 at term 60
2019-08-02 08:11:38.270994 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 54df41104aeddfae at term 60
2019-08-02 08:11:41.789125 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:41.789461 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:41.790354 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:41.790863 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:45.270924 I | raft: 5c0a849577d100ed is starting a new election at term 60
2019-08-02 08:11:45.271448 I | raft: 5c0a849577d100ed became candidate at term 61
2019-08-02 08:11:45.271613 I | raft: 5c0a849577d100ed received MsgVoteResp from 5c0a849577d100ed at term 61
2019-08-02 08:11:45.271826 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 54df41104aeddfae at term 61
2019-08-02 08:11:45.271940 I | raft: 5c0a849577d100ed [logterm: 1, index: 3] sent MsgVote request to 76267e9078cb09f8 at term 61
2019-08-02 08:11:46.783981 E | etcdserver: publish error: etcdserver: request timed out
2019-08-02 08:11:46.789499 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:46.790119 W | rafthttp: health check for peer 54df41104aeddfae could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:46.790551 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2019-08-02 08:11:46.791039 W | rafthttp: health check for peer 76267e9078cb09f8 could not connect: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca") (prober "ROUND_TRIPPER_SNAPSHOT")
2019-08-02 08:11:50.770777 I | raft: 5c0a849577d100ed is starting a new election at term 61

And the output from the other two etcd containers:

2019-08-02 08:11:26.619054 W | pkg/flags: unrecognized environment variable ETCD_UNSUPPORTED_ARCH=x86_64
2019-08-02 08:11:26.619149 I | etcdmain: etcd Version: 3.3.10
2019-08-02 08:11:26.619157 I | etcdmain: Git SHA: 27fc7e2
2019-08-02 08:11:26.619162 I | etcdmain: Go Version: go1.10.4
2019-08-02 08:11:26.619165 I | etcdmain: Go OS/Arch: linux/amd64
2019-08-02 08:11:26.619169 I | etcdmain: setting maximum number of CPUs to 3, total number of available CPUs is 3
2019-08-02 08:11:26.619234 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2019-08-02 08:11:26.619276 I | embed: peerTLS: cert = /etc/kubernetes/ssl/kube-etcd-185-110-173-146.pem, key = /etc/kubernetes/ssl/kube-etcd-185-110-173-146-key.pem, ca = , trusted-ca = /etc/kubernetes/ssl/kube-ca.pem, client-cert-auth = true, crl-file =
2019-08-02 08:11:26.619940 I | embed: listening for peers on https://0.0.0.0:2380
2019-08-02 08:11:26.619977 I | embed: listening for client requests on 0.0.0.0:2379
2019-08-02 08:11:26.622726 I | etcdserver: recovered store from snapshot at index 100001
2019-08-02 08:11:26.623334 I | mvcc: restore compact to 96920
2019-08-02 08:11:26.626829 I | etcdserver: name = etcd-xxx.xxx.xxx.146
2019-08-02 08:11:26.626872 I | etcdserver: data dir = /var/lib/rancher/etcd/
2019-08-02 08:11:26.626883 I | etcdserver: member dir = /var/lib/rancher/etcd/member
2019-08-02 08:11:26.626911 I | etcdserver: heartbeat = 500ms
2019-08-02 08:11:26.626917 I | etcdserver: election = 5000ms
2019-08-02 08:11:26.626923 I | etcdserver: snapshot count = 100000
2019-08-02 08:11:26.626941 I | etcdserver: advertise client URLs = https://xxx.xxx.xxx.146:2379,https://xxx.xxx.xxx.146:4001
2019-08-02 08:11:26.663850 I | embed: rejected connection from "xxx.xxx.xxx.138:60104" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:26.664600 I | embed: rejected connection from "xxx.xxx.xxx.138:60106" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:26.762876 I | embed: rejected connection from "xxx.xxx.xxx.138:60113" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:26.765844 I | embed: rejected connection from "xxx.xxx.xxx.138:60112" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:26.863583 I | embed: rejected connection from "xxx.xxx.xxx.138:60122" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:26.866506 I | embed: rejected connection from "xxx.xxx.xxx.138:60120" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:26.962392 I | embed: rejected connection from "xxx.xxx.xxx.138:60130" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:26.963041 I | embed: rejected connection from "xxx.xxx.xxx.138:60128" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.010458 I | etcdserver: restarting member 11ff66a5ffe8e8b0 in cluster 699d825044deed9e at commit index 131166
2019-08-02 08:11:27.011878 I | raft: 11ff66a5ffe8e8b0 became follower at term 8
2019-08-02 08:11:27.011907 I | raft: newRaft 11ff66a5ffe8e8b0 [peers: [11ff66a5ffe8e8b0,54df41104aeddfae], term: 8, commit: 131166, applied: 100001, lastindex: 131166, lastterm: 8]
2019-08-02 08:11:27.012163 I | etcdserver/api: enabled capabilities for version 3.3
2019-08-02 08:11:27.012262 I | etcdserver/membership: added member 54df41104aeddfae [https://xxx.xxx.xxx.143:2380] to cluster 699d825044deed9e from store
2019-08-02 08:11:27.012297 I | etcdserver/membership: added member 11ff66a5ffe8e8b0 [https://xxx.xxx.xxx.146:2380] to cluster 699d825044deed9e from store
2019-08-02 08:11:27.012377 I | etcdserver/membership: set the cluster version to 3.3 from store
2019-08-02 08:11:27.013647 I | mvcc: restore compact to 96920
2019-08-02 08:11:27.016019 W | auth: simple token is not cryptographically signed
2019-08-02 08:11:27.018094 I | rafthttp: starting peer 54df41104aeddfae...
2019-08-02 08:11:27.018292 I | rafthttp: started HTTP pipelining with peer 54df41104aeddfae
2019-08-02 08:11:27.019141 I | rafthttp: started streaming with peer 54df41104aeddfae (writer)
2019-08-02 08:11:27.020011 I | rafthttp: started streaming with peer 54df41104aeddfae (writer)
2019-08-02 08:11:27.023467 I | rafthttp: started peer 54df41104aeddfae
2019-08-02 08:11:27.023534 I | rafthttp: added peer 54df41104aeddfae
2019-08-02 08:11:27.023577 I | etcdserver: starting server... [version: 3.3.10, cluster version: 3.3]
2019-08-02 08:11:27.024196 I | rafthttp: started streaming with peer 54df41104aeddfae (stream MsgApp v2 reader)
2019-08-02 08:11:27.024453 I | rafthttp: started streaming with peer 54df41104aeddfae (stream Message reader)
2019-08-02 08:11:27.026232 I | embed: ClientTLS: cert = /etc/kubernetes/ssl/kube-etcd-185-110-173-146.pem, key = /etc/kubernetes/ssl/kube-etcd-185-110-173-146-key.pem, ca = , trusted-ca = /etc/kubernetes/ssl/kube-ca.pem, client-cert-auth = true, crl-file =
2019-08-02 08:11:27.026480 I | rafthttp: peer 54df41104aeddfae became active
2019-08-02 08:11:27.026514 I | rafthttp: established a TCP streaming connection with peer 54df41104aeddfae (stream Message writer)
2019-08-02 08:11:27.026779 I | rafthttp: established a TCP streaming connection with peer 54df41104aeddfae (stream MsgApp v2 writer)
2019-08-02 08:11:27.041054 I | rafthttp: established a TCP streaming connection with peer 54df41104aeddfae (stream Message reader)
2019-08-02 08:11:27.044108 I | rafthttp: established a TCP streaming connection with peer 54df41104aeddfae (stream MsgApp v2 reader)
2019-08-02 08:11:27.065741 I | embed: rejected connection from "xxx.xxx.xxx.138:60138" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.065792 I | embed: rejected connection from "xxx.xxx.xxx.138:60136" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.170029 I | embed: rejected connection from "xxx.xxx.xxx.138:60146" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.172552 I | embed: rejected connection from "xxx.xxx.xxx.138:60144" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.265176 I | embed: rejected connection from "xxx.xxx.xxx.138:60154" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.265522 I | embed: rejected connection from "xxx.xxx.xxx.138:60152" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.362449 I | embed: rejected connection from "xxx.xxx.xxx.138:60160" (error "remote error: tls: bad certificate", ServerName "")
2019-08-02 08:11:27.364429 I | embed: rejected connection from "xxx.xxx.xxx.138:60162" (error "remote error: tls: bad certificate", ServerName "")
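
The logs show the stale state directly: the .138 member restarts into cluster 7d426ced5096f303 with an empty peer list, while the .146 member above recovers a snapshot and rejoins cluster 699d825044deed9e, so the etcd containers no longer belong to the same cluster. On each node you can check for data left over from an earlier deployment; a sketch, assuming RKE's default bind mount of the host's /var/lib/etcd (the same directory the cleanup below removes):

# Stale raft/WAL data from a previous cluster, if present
ls -l /var/lib/etcd/member
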
webwiebe commented 5 years ago

Fixed my own problem by running the following on the cluster nodes:

docker stop $(docker ps -aq)
docker system prune -f
docker volume rm $(docker volume ls -q)
docker image rm $(docker image ls -q)
rm -rf /etc/ceph \
       /etc/cni \
       /etc/kubernetes \
       /opt/cni \
       /opt/rke \
       /run/secrets/kubernetes.io \
       /run/calico \
       /run/flannel \
       /var/lib/calico \
       /var/lib/etcd \
       /var/lib/cni \
       /var/lib/kubelet \
       /var/lib/rancher/rke/log \
       /var/log/containers \
       /var/log/pods \
       /var/run/calico

I then rebooted the nodes and ran the rke up command again.

krumware commented 5 years ago

I had an issue setting up new nodes via the docker run command for the rancher-agent supplied by the Rancher 2 UI, and running the above helped me as well.

superseb commented 5 years ago

Old state on the nodes will block bringing up clusters. Either rke remove or the steps on https://rancher.com/docs/rancher/v2.x/en/cluster-admin/cleaning-cluster-nodes/ should be followed, and the issue should be resolved after that.
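
For reference, the rke remove path is a single command run from the directory holding the cluster config and state (a sketch, using the config file name from this issue):

# Tears down the cluster components that rke deployed on the nodes
rke remove --config rancher-cluster.yml

It still pays to verify that the directories listed in the cleanup above are actually gone before running rke up again.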

saurabhprakash commented 4 years ago

> Fixed my own problem by running the following on the cluster nodes [...] I then rebooted the nodes and ran the rke up command again.

Running this worked. Earlier I tried changing rke versions, but nothing worked.

aperture147 commented 4 years ago

I'm also having this issue while creating a new Amazon EC2 cluster via Rancher UI 2.3.2.

killerquitsche commented 4 years ago

We had the same issue as @aperture147. Is there a solution?

patrickacioli commented 4 years ago

> Fixed my own problem by running the following on the cluster nodes [...] I then rebooted the nodes and ran the rke up command again.

Worked for me. Thanks.

G-Pappas commented 1 year ago

First of all, thanks, this solved my issue. Do you know why this occurred in the first place?

> Fixed my own problem by running the following on the cluster nodes [...] I then rebooted the nodes and ran the rke up command again.

aperture147 commented 1 year ago

> First of all, thanks, this solved my issue. Do you know why this occurred in the first place?

I had set up the AWS security groups and firewall rules incorrectly. If you set up Rancher across multiple AWS accounts or multiple cloud providers, you have to carefully allow a lot of ports. If the HA deployment fails, remove everything, check the ports again, and redo it.
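
For anyone hitting this on a cloud provider: besides etcd's 2379/2380 and the apiserver's 6443, RKE needs a number of other ports open between the nodes. A minimal reachability check between nodes, assuming netcat is installed (replace <other-node-ip> with each peer's address):

# TCP reachability for the etcd client/peer ports and the apiserver port
for port in 2379 2380 6443; do
  nc -zvw3 <other-node-ip> $port
done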