kubesphere / kubekey

Install Kubernetes/K3s only, both Kubernetes/K3s and KubeSphere, and related cloud-native add-ons. It supports all-in-one, multi-node, and HA 🔥 ⎈ 🐳
https://kubesphere.io
Apache License 2.0
2.31k stars, 545 forks

kubekey 2.2.0 fail to install kubesphere 3.2.1 #1336

Open epiphyllum opened 2 years ago

epiphyllum commented 2 years ago

What version of KubeKey has the issue?

2.2.0

What is your os environment?

Linux hadoop100 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

KubeKey config file

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: hadoop100, address: 192.168.76.100, internalAddress: 191.168.76.100, user: root, password: "jessie"}
  - {name: hadoop101, address: 192.168.76.101, internalAddress: 191.168.76.101, user: root, password: "jessie"}
  - {name: hadoop102, address: 192.168.76.102, internalAddress: 191.168.76.102, user: root, password: "jessie"}
  - {name: hadoop103, address: 192.168.76.103, internalAddress: 191.168.76.103, user: root, password: "jessie"}
  - {name: hadoop104, address: 192.168.76.104, internalAddress: 191.168.76.104, user: root, password: "jessie"}
  roleGroups:
    etcd:
    - hadoop100
    control-plane:
    - hadoop100
    worker:
    - hadoop101
    - hadoop102
    - hadoop103
    - hadoop104
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    # internalLoadbalancer: haproxy

    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.21.13
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: docker
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []

---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.2.1
spec:
  persistence:
    storageClass: ""
  authentication:
    jwtSecret: ""
  zone: ""
  local_registry: ""
  namespace_override: ""
  # dev_tag: ""
  etcd:
    monitoring: true
    endpointIps: localhost
    port: 2379
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true
        port: 30880
        type: NodePort
    # apiserver:
    #  resources: {}
    # controllerManager:
    #  resources: {}
    redis:
      enabled: false
      volumeSize: 2Gi
    openldap:
      enabled: false
      volumeSize: 2Gi
    minio:
      volumeSize: 20Gi
    monitoring:
      # type: external
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
      GPUMonitoring:
        enabled: false
    gpu:
      kinds:
      - resourceName: "nvidia.com/gpu"
        resourceType: "GPU"
        default: true
    es:
      # master:
      #   volumeSize: 4Gi
      #   replicas: 1
      #   resources: {}
      # data:
      #   volumeSize: 20Gi
      #   replicas: 1
      #   resources: {}
      logMaxAge: 7
      elkPrefix: logstash
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
  alerting:
    enabled: false
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:
    enabled: false
    # operator:
    #   resources: {}
    # webhook:
    #   resources: {}
  devops:
    enabled: false
    jenkinsMemoryLim: 2Gi
    jenkinsMemoryReq: 1500Mi
    jenkinsVolumeSize: 8Gi
    jenkinsJavaOpts_Xms: 512m
    jenkinsJavaOpts_Xmx: 512m
    jenkinsJavaOpts_MaxRAM: 2g
  events:
    enabled: false
    # operator:
    #   resources: {}
    # exporter:
    #   resources: {}
    # ruler:
    #   enabled: true
    #   replicas: 2
    #   resources: {}
  logging:
    enabled: false
    containerruntime: docker
    logsidecar:
      enabled: true
      replicas: 2
      # resources: {}
  metrics_server:
    enabled: false
  monitoring:
    storageClass: ""
    # kube_rbac_proxy:
    #   resources: {}
    # kube_state_metrics:
    #   resources: {}
    # prometheus:
    #   replicas: 1
    #   volumeSize: 20Gi
    #   resources: {}
    #   operator:
    #     resources: {}
    #   adapter:
    #     resources: {}
    # node_exporter:
    #   resources: {}
    # alertmanager:
    #   replicas: 1
    #   resources: {}
    # notification_manager:
    #   resources: {}
    #   operator:
    #     resources: {}
    #   proxy:
    #     resources: {}
    gpu:
      nvidia_dcgm_exporter:
        enabled: false
        # resources: {}
  multicluster:
    clusterRole: none
  network:
    networkpolicy:
      enabled: false
    ippool:
      type: none
    topology:
      type: none
  openpitrix:
    store:
      enabled: false
  servicemesh:
    enabled: false
  kubeedge:
    enabled: false
    cloudCore:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      cloudhubPort: "10000"
      cloudhubQuicPort: "10001"
      cloudhubHttpsPort: "10002"
      cloudstreamPort: "10003"
      tunnelPort: "10004"
      cloudHub:
        advertiseAddress:
          - ""
        nodeLimit: "100"
      service:
        cloudhubNodePort: "30000"
        cloudhubQuicNodePort: "30001"
        cloudhubHttpsNodePort: "30002"
        cloudstreamNodePort: "30003"
        tunnelNodePort: "30004"
    edgeWatcher:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      edgeWatcherAgent:
        nodeSelector: {"node-role.kubernetes.io/worker": ""}
        tolerations: []
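
One detail in the hosts list above may be worth a second look: every `address` begins with 192.168, while every `internalAddress` begins with 191.168, which is outside the private ranges. The report does not say whether this is intentional, so treat it only as something to verify. The sketch below uses only Python's standard library to flag host entries whose `internalAddress` is not a private address. The function name `suspicious_hosts` and the regex-based parsing of the inline `{...}` host entries are assumptions made for illustration; they are not part of KubeKey.

```python
import ipaddress
import re

# Matches the inline host mappings used in the config above, e.g.
#   - {name: n1, address: 10.0.0.1, internalAddress: 10.0.0.1, user: root}
# (hypothetical helper pattern, not a KubeKey API)
HOST_RE = re.compile(
    r"name:\s*(?P<name>[^,\s}]+).*?"
    r"address:\s*(?P<address>[\d.]+).*?"
    r"internalAddress:\s*(?P<internal>[\d.]+)"
)


def suspicious_hosts(config_text):
    """Return (name, reason) pairs for hosts whose internalAddress is not private."""
    findings = []
    for line in config_text.splitlines():
        match = HOST_RE.search(line)
        if match is None:
            continue
        internal = ipaddress.ip_address(match["internal"])
        if not internal.is_private:
            findings.append(
                (match["name"], f"internalAddress {internal} is not a private address")
            )
    return findings


# First entry is taken from the report; the second is a hypothetical
# entry with matching addresses, included only for contrast.
sample = """\
  - {name: hadoop100, address: 192.168.76.100, internalAddress: 191.168.76.100, user: root}
  - {name: example-ok, address: 192.168.76.101, internalAddress: 192.168.76.101, user: root}
"""

for name, reason in suspicious_hosts(sample):
    print(f"{name}: {reason}")
```

A public `internalAddress` can be legitimate in some setups, so a finding here is a prompt to double-check the value rather than proof of a misconfiguration.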

A clear and concise description of what happened.

[root@hadoop100 ~]# ./kk create cluster -f config-sample.yaml


[KubeKey ASCII-art banner]

16:22:59 CST [GreetingsModule] Greetings
16:23:29 CST failed: [hadoop104]
16:23:29 CST failed: [hadoop101]
16:23:29 CST failed: [hadoop100]
16:23:29 CST failed: [hadoop102]
16:23:29 CST failed: [hadoop103]
error: Pipeline[CreateClusterPipeline] execute failed: Module[GreetingsModule] exec failed:
failed: [hadoop104] failed to connect to 192.168.76.104: could not establish connection to 192.168.76.104:22: dial tcp 192.168.76.104:22: i/o timeout
failed: [hadoop101] failed to connect to 192.168.76.101: could not establish connection to 192.168.76.101:22: dial tcp 192.168.76.101:22: i/o timeout
failed: [hadoop100] execute task timeout, Timeout=30000000000
failed: [hadoop102] execute task timeout, Timeout=30000000000
failed: [hadoop103] execute task timeout, Timeout=30000000000
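
The output above mixes two kinds of failure: an SSH dial that timed out (hadoop104, hadoop101) and a task that hit its time limit (hadoop100, hadoop102, hadoop103). As a reading aid, the sketch below groups the `failed: [host] ...` lines by kind. It assumes that `Timeout=30000000000` is a duration in nanoseconds, which would be 30 s and matches the gap between the 16:22:59 and 16:23:29 timestamps. The function name `summarize` and the category labels are hypothetical and not part of KubeKey.

```python
import re

# "failed: [host] reason" lines at the start of a line; timestamped
# "16:23:29 CST failed: [host]" lines carry no reason and are skipped.
FAILED_RE = re.compile(r"failed: \[(?P<host>[^\]]+)\] (?P<reason>.+)")
TIMEOUT_RE = re.compile(r"Timeout=(\d+)")


def summarize(log_lines):
    """Map each failed host to a short description of its failure."""
    summary = {}
    for line in log_lines:
        match = FAILED_RE.match(line.strip())
        if match is None:
            continue
        reason = match["reason"]
        limit = TIMEOUT_RE.search(reason)
        if limit:
            # Assumption: the value is in nanoseconds.
            seconds = int(limit.group(1)) / 1e9
            kind = f"task timeout after {seconds:g} s"
        elif "i/o timeout" in reason:
            kind = "SSH dial timed out"
        elif "connection refused" in reason:
            kind = "SSH connection refused"
        else:
            kind = "other"
        summary[match["host"]] = kind
    return summary


# Two lines copied from the output in this report, plus one timestamped
# line to show that it is ignored.
sample_log = [
    "16:23:29 CST failed: [hadoop104]",
    "failed: [hadoop104] failed to connect to 192.168.76.104: could not establish "
    "connection to 192.168.76.104:22: dial tcp 192.168.76.104:22: i/o timeout",
    "failed: [hadoop100] execute task timeout, Timeout=30000000000",
]

for host, kind in summarize(sample_log).items():
    print(f"{host}: {kind}")
```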

Relevant log output

No response

Additional information

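These 5 servers can all SSH to one another. The firewall settings were also configured strictly according to the documentation. Everything is fine there.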
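
Since the errors are raised while dialing port 22, one way to cross-check the claim above is to probe that port from the machine where `kk` runs, using the exact `address` values from the config. The following is a minimal sketch using Python's standard `socket` module. The function name `probe_ssh` and the result labels are assumptions for illustration. A "timeout" result usually points to an unreachable or filtered address, while "refused" usually means the host answered but nothing is listening on that port.

```python
import socket


def probe_ssh(host, port=22, timeout=5.0):
    """Try a TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Covers cases such as "no route to host" or name resolution failures.
        return f"error: {exc}"
```

For example, calling `probe_ssh("192.168.76.104")` on hadoop100 tests the same endpoint that the first failure message names. This only checks TCP reachability; it does not verify SSH credentials.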

24sama commented 2 years ago

Same as #1288

gitanbu commented 2 years ago

Could someone please advise me on what to do?

./kk create cluster --with-kubernetes v1.24.0 --with-kubesphere v3.2.0


[KubeKey ASCII-art banner]

19:07:35 IST [GreetingsModule] Greetings
19:07:36 IST failed: [amannath.pnq.com]
error: Pipeline[CreateClusterPipeline] execute failed: Module[GreetingsModule] exec failed:
failed: [amannath.pnq.com] failed to connect to 192.168.1.8: could not establish connection to 192.168.1.8:22: dial tcp 192.168.1.8:22: connect: connection refused