kubesphere / kubekey

Install Kubernetes/K3s only, both Kubernetes/K3s and KubeSphere, and related cloud-native add-ons. It supports all-in-one, multi-node, and HA deployments 🔥 ⎈ 🐳
https://kubesphere.io
Apache License 2.0

Offline deploy kubesphere failed, with x509: certificate problem #1124

lostrain closed this issue 2 years ago

lostrain commented 2 years ago

Which version of KubeKey has the issue?

version.BuildInfo{Version:"1.2.1", GitCommit:"94aa580", GitTreeState:"", GoVersion:"go1.16.12"}

What is your OS environment?

CentOS Linux release 7.6.1810 (Core)

KubeKey config file

apiVersion: kubekey.kubesphere.io/v1alpha1
kind: Cluster
metadata:
  name: kubesphere-dev
spec:
  hosts:
    - {name: master, address: 10.19.132.191, internalAddress: 10.19.132.191, privateKeyPath: "~/.ssh/id_rsa"}
    - {name: node1, address: 10.19.132.193, internalAddress: 10.19.132.193, privateKeyPath: "~/.ssh/id_rsa"}
    - {name: node2, address: 10.19.132.194, internalAddress: 10.19.132.194, privateKeyPath: "~/.ssh/id_rsa"}
  roleGroups:
    etcd:
    - master
    master:
    - master
    worker:
    - master
    - node1
    - node2
  controlPlaneEndpoint:
    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.21.5
    imageRepo: kubesphere
    clusterName: cluster.local
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
  registry:
    registryMirrors: []
    insecureRegistries: []
    privateRegistry: dockerhub.kubekey.local  # Add the private image registry address here. 
  addons: []

---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.2.1
spec:
  persistence:
    storageClass: ""
  authentication:
    jwtSecret: ""
  zone: ""
  local_registry: ""
  etcd:
    monitoring: false
    endpointIps: localhost
    port: 2379
    tlsEnable: true
  common:
    redis:
      enabled: false
    redisVolumSize: 2Gi
    openldap:
      enabled: false
    openldapVolumeSize: 2Gi
    minioVolumeSize: 20Gi
    monitoring:
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
    es:
      elasticsearchMasterVolumeSize: 4Gi
      elasticsearchDataVolumeSize: 20Gi
      logMaxAge: 7
      elkPrefix: logstash
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchUrl: ""
      externalElasticsearchPort: ""
  console:
    enableMultiLogin: true
    port: 30880
  alerting:
    enabled: false
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:
    enabled: false
  devops:
    enabled: false
    jenkinsMemoryLim: 2Gi
    jenkinsMemoryReq: 1500Mi
    jenkinsVolumeSize: 8Gi
    jenkinsJavaOpts_Xms: 512m
    jenkinsJavaOpts_Xmx: 512m
    jenkinsJavaOpts_MaxRAM: 2g
  events:
    enabled: false
    ruler:
      enabled: true
      replicas: 2
  logging:
    enabled: false
    logsidecar:
      enabled: true
      replicas: 2
  metrics_server:
    enabled: false
  monitoring:
    storageClass: ""
    prometheusMemoryRequest: 400Mi
    prometheusVolumeSize: 20Gi
  multicluster:
    clusterRole: none
  network:
    networkpolicy:
      enabled: false
    ippool:
      type: none
    topology:
      type: none
  notification:
    enabled: false
  openpitrix:
    store:
      enabled: false
  servicemesh:
    enabled: false
  kubeedge:
    enabled: false
    cloudCore:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      cloudhubPort: "10000"
      cloudhubQuicPort: "10001"
      cloudhubHttpsPort: "10002"
      cloudstreamPort: "10003"
      tunnelPort: "10004"
      cloudHub:
        advertiseAddress:
          - ""
        nodeLimit: "100"
      service:
        cloudhubNodePort: "30000"
        cloudhubQuicNodePort: "30001"
        cloudhubHttpsNodePort: "30002"
        cloudstreamNodePort: "30003"
        tunnelNodePort: "30004"
    edgeWatcher:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      edgeWatcherAgent:
        nodeSelector: {"node-role.kubernetes.io/worker": ""}
        tolerations: []

A clear and concise description of what happened.

I want to deploy a KubeSphere cluster on 3 VMs. I followed this guide: I downloaded the binary files and images and pushed the images to a local registry. Then, after running ./kk create cluster -f config-sample.yaml, the 2 worker nodes automatically installed docker 20.10.8 and could not pull images from the master's registry. This may be caused by the Docker version: I installed docker 19.03.12 on the master and it can pull images from its own registry, while 20.10.8 on the master does not work. My questions are: does kk 1.2.1 pin docker 20.10.8, and is that why it cannot pull from the registry? Can I choose which Docker version is installed on my worker nodes?

Relevant log output

➜  ~ ./kk create cluster -f config-sample.yaml
+--------+------+------+---------+----------+-------+-------+-----------+----------+------------+-------------+------------------+--------------+
| name   | sudo | curl | openssl | ebtables | socat | ipset | conntrack | docker   | nfs client | ceph client | glusterfs client | time         |
+--------+------+------+---------+----------+-------+-------+-----------+----------+------------+-------------+------------------+--------------+
| master | y    | y    | y       | y        | y     | y     | y         | 19.03.12 |            |             |                  | CST 19:00:10 |
| node2  | y    | y    | y       | y        | y     | y     | y         |          |            |             |                  | CST 19:00:10 |
| node1  | y    | y    | y       | y        | y     | y     | y         |          |            |             |                  | CST 19:00:10 |
+--------+------+------+---------+----------+-------+-------+-----------+----------+------------+-------------+------------------+--------------+

This is a simple check of your environment.
Before installation, you should ensure that your machines meet all requirements specified at
https://github.com/kubesphere/kubekey#requirements-and-recommendations

Continue this installation? [yes/no]: yes
INFO[19:00:11 CST] Downloading Installation Files               
INFO[19:00:11 CST] Downloading kubeadm ...                      
INFO[19:00:12 CST] Downloading kubelet ...                      
INFO[19:00:12 CST] Downloading kubectl ...                      
INFO[19:00:13 CST] Downloading helm ...                         
INFO[19:00:13 CST] Downloading kubecni ...                      
INFO[19:00:13 CST] Downloading etcd ...                         
INFO[19:00:13 CST] Downloading docker ...                       
INFO[19:00:14 CST] Downloading crictl ...                       
INFO[19:00:14 CST] Configuring operating system ...             
[master 10.19.132.191] MSG:
net.ipv4.tcp_fin_timeout = 3
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
[node2 10.19.132.194] MSG:
net.ipv4.tcp_fin_timeout = 3
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
[node1 10.19.132.193] MSG:
net.ipv4.tcp_fin_timeout = 3
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
INFO[19:00:17 CST] Get cluster status                           
INFO[19:00:17 CST] Installing Container Runtime ...             
Push /root/kubekey/v1.21.5/amd64/docker-20.10.8.tgz to 10.19.132.193:/tmp/kubekey/docker-20.10.8.tgz   Done
Push /root/kubekey/v1.21.5/amd64/docker-20.10.8.tgz to 10.19.132.194:/tmp/kubekey/docker-20.10.8.tgz   Done
INFO[19:00:22 CST] Start to download images on all nodes        
[node2] Downloading image: dockerhub.kubekey.local/kubesphere/pause:3.4.1
[master] Downloading image: dockerhub.kubekey.local/kubesphere/pause:3.4.1
[node1] Downloading image: dockerhub.kubekey.local/kubesphere/pause:3.4.1
[master] Downloading image: dockerhub.kubekey.local/kubesphere/kube-apiserver:v1.21.5
[master] Downloading image: dockerhub.kubekey.local/kubesphere/kube-controller-manager:v1.21.5
[master] Downloading image: dockerhub.kubekey.local/kubesphere/kube-scheduler:v1.21.5
[master] Downloading image: dockerhub.kubekey.local/kubesphere/kube-proxy:v1.21.5
[master] Downloading image: dockerhub.kubekey.local/coredns/coredns:1.8.0
[master] Downloading image: dockerhub.kubekey.local/kubesphere/k8s-dns-node-cache:1.15.12
[master] Downloading image: dockerhub.kubekey.local/calico/kube-controllers:v3.20.0
[master] Downloading image: dockerhub.kubekey.local/calico/cni:v3.20.0
[master] Downloading image: dockerhub.kubekey.local/calico/node:v3.20.0
[master] Downloading image: dockerhub.kubekey.local/calico/pod2daemon-flexvol:v3.20.0
ERRO[19:00:54 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0: Process exited with status 1  node=10.19.132.194
ERRO[19:00:54 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0: Process exited with status 1  node=10.19.132.193
WARN[19:00:54 CST] Task failed ...                              
WARN[19:00:54 CST] error: interrupted by error                  
Error: Failed to pre-pull images: interrupted by error

Additional information

No response

pixiake commented 2 years ago

You can configure this registry under insecure-registries in /etc/docker/daemon.json. Example:

{
  "log-opts": {
    "max-size": "5m",
    "max-file":"3"
  },
  "exec-opts": ["native.cgroupdriver=systemd"],
  "insecure-registries": ["dockerhub.kubekey.local:5000"]
}
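
After editing /etc/docker/daemon.json, Docker must be restarted on each node for the change to take effect. A minimal sketch, assuming a systemd-managed Docker as on CentOS 7:

# Reload unit files and restart the Docker daemon
sudo systemctl daemon-reload
sudo systemctl restart docker

# Confirm the registry now appears under "Insecure Registries"
docker info | grep -A 2 "Insecure Registries"

Note that the insecure-registries entry has to match the registry address exactly as it appears in the image names: the config at the top of this issue uses dockerhub.kubekey.local with no port, so in that case the entry should omit the ":5000" (as the follow-up below confirms).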

BTW, KubeKey v2.0.0 supports custom offline installation packages, which should avoid this kind of problem. You can try it. Docs: English: https://github.com/kubesphere/kubekey/blob/master/docs/manifest_and_artifact.md Chinese: https://mp.weixin.qq.com/s/hjtNfSRVYH1O2o_dj6ET4A
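
For reference, the v2.0.0 offline flow described in the manifest_and_artifact.md doc looks roughly like this (a sketch; exact flags may vary between releases):

# On a machine with internet access: declare the images/binaries to bundle,
# then export everything into a single artifact
./kk create manifest
./kk artifact export -m manifest-sample.yaml -o kubekey-artifact.tar.gz

# In the offline environment: create the cluster from the artifact
./kk create cluster -f config-sample.yaml -a kubekey-artifact.tar.gz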

lostrain commented 2 years ago

Thanks! After adding "insecure-registries": ["dockerhub.kubekey.local"] on the worker nodes, it worked! I have read the docs for kk v2.0.0; it is more convenient, and I will try it next time.
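
For anyone who would rather keep TLS verification enabled than list the registry as insecure: the x509 error appears because Go 1.15+ (and therefore newer Docker builds) rejects certificates that only set the legacy Common Name. Reissuing the registry's self-signed certificate with a Subject Alternative Name avoids it. A minimal sketch, assuming OpenSSL 1.1.1+ for the -addext option:

# Issue a self-signed certificate whose SAN matches the registry hostname
openssl req -x509 -newkey rsa:4096 -sha256 -days 3650 -nodes \
  -keyout dockerhub.kubekey.local.key -out dockerhub.kubekey.local.crt \
  -subj "/CN=dockerhub.kubekey.local" \
  -addext "subjectAltName=DNS:dockerhub.kubekey.local"

# Trust the certificate on every node that pulls from the registry
sudo mkdir -p /etc/docker/certs.d/dockerhub.kubekey.local
sudo cp dockerhub.kubekey.local.crt /etc/docker/certs.d/dockerhub.kubekey.local/ca.crt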

javaXiaoHan commented 2 years ago

Same problem. I have configured "insecure-registries": ["dockerhub.kubekey.local"] on the master and all nodes, but it does not work. Do you know why?

javaXiaoHan commented 2 years ago
Relevant log output

ERRO[14:35:23 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": dial tcp: lookup dockerhub.kubekey.local on 100.100.2.136:53: no such host: Process exited with status 1  node=116.62.177.162
ERRO[14:35:23 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": dial tcp: lookup dockerhub.kubekey.local on 100.100.2.136:53: no such host: Process exited with status 1  node=120.26.192.149
ERRO[14:35:23 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": dial tcp: lookup dockerhub.kubekey.local on 100.100.2.136:53: no such host: Process exited with status 1  node=114.55.247.153
ERRO[14:35:23 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0: Process exited with status 1  node=121.41.226.37
ERRO[14:35:23 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": dial tcp: lookup dockerhub.kubekey.local: no such host: Process exited with status 1  node=101.37.89.27
ERRO[14:35:23 CST] Failed to download image: dockerhub.kubekey.local/kubesphere/pause:3.4.1: Failed to exec command: sudo env PATH=$PATH docker pull dockerhub.kubekey.local/kubesphere/pause:3.4.1 
Error response from daemon: Get "https://dockerhub.kubekey.local/v2/": dial tcp: lookup dockerhub.kubekey.local on 100.100.2.136:53: no such host: Process exited with status 1  node=118.178.91.229
WARN[14:35:23 CST] Task failed ...
WARN[14:35:23 CST] error: interrupted by error
Error: Failed to pre-pull images: interrupted by error

lostrain commented 2 years ago

@javaXiaoHan Maybe you need to configure /etc/hosts on your worker nodes so that dockerhub.kubekey.local resolves to your master node.
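
For example, appending an entry like this to /etc/hosts on each node (assuming the registry runs on the master at 10.19.132.191, as in the config at the top of this issue):

# /etc/hosts on every node that pulls from the registry
10.19.132.191   dockerhub.kubekey.local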