kubesphere / kubekey

Install Kubernetes/K3s only, both Kubernetes/K3s and KubeSphere, and related cloud-native add-ons, it supports all-in-one, multi-node, and HA 🔥 ⎈ 🐳
https://kubesphere.io
Apache License 2.0

k3s installation fails because of an option in the service configuration file #1766

Open omiku opened 1 year ago

omiku commented 1 year ago

Which version of KubeKey has the issue?

kk version: &version.Info{Major:"3", Minor:"0", GitVersion:"v3.0.7", GitCommit:"e755baf67198d565689d7207378174f429b508ba", GitTreeState:"clean", BuildDate:"2023-01-18T01:57:24Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

What is your os environment?

Rocky Linux release 9.1 (Blue Onyx)

KubeKey config file

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: node1, address: 172.16.0.3, internalAddress: 172.16.0.3, user: root, password: "2ZHoXJT3wbfmEb"}
  roleGroups:
    etcd:
    - node1
    control-plane: 
    - node1
    worker:
    - node1
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers 
    # internalLoadbalancer: haproxy

    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.21.4-k3s
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: containerd
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []

---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.3.2
spec:
  persistence:
    storageClass: ""
  authentication:
    jwtSecret: ""
  zone: ""
  local_registry: ""
  namespace_override: ""
  # dev_tag: ""
  etcd:
    monitoring: false
    endpointIps: localhost
    port: 2379
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true
        port: 30880
        type: NodePort
    # apiserver:
    #  resources: {}
    # controllerManager:
    #  resources: {}
    redis:
      enabled: false
      volumeSize: 2Gi
    openldap:
      enabled: false
      volumeSize: 2Gi
    minio:
      volumeSize: 20Gi
    monitoring:
      # type: external
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
      GPUMonitoring:
        enabled: false
    gpu:
      kinds:
      - resourceName: "nvidia.com/gpu"
        resourceType: "GPU"
        default: true
    es:
      # master:
      #   volumeSize: 4Gi
      #   replicas: 1
      #   resources: {}
      # data:
      #   volumeSize: 20Gi
      #   replicas: 1
      #   resources: {}
      logMaxAge: 7
      elkPrefix: logstash
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
  alerting:
    enabled: false
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:
    enabled: false
    # operator:
    #   resources: {}
    # webhook:
    #   resources: {}
  devops:
    enabled: false
    # resources: {}
    jenkinsMemoryLim: 8Gi
    jenkinsMemoryReq: 4Gi
    jenkinsVolumeSize: 8Gi
  events:
    enabled: false
    # operator:
    #   resources: {}
    # exporter:
    #   resources: {}
    # ruler:
    #   enabled: true
    #   replicas: 2
    #   resources: {}
  logging:
    enabled: false
    logsidecar:
      enabled: true
      replicas: 2
      # resources: {}
  metrics_server:
    enabled: false
  monitoring:
    storageClass: ""
    node_exporter:
      port: 9100
      # resources: {}
    # kube_rbac_proxy:
    #   resources: {}
    # kube_state_metrics:
    #   resources: {}
    # prometheus:
    #   replicas: 1
    #   volumeSize: 20Gi
    #   resources: {}
    #   operator:
    #     resources: {}
    # alertmanager:
    #   replicas: 1
    #   resources: {}
    # notification_manager:
    #   resources: {}
    #   operator:
    #     resources: {}
    #   proxy:
    #     resources: {}
    gpu:
      nvidia_dcgm_exporter:
        enabled: false
        # resources: {}
  multicluster:
    clusterRole: none
  network:
    networkpolicy:
      enabled: false
    ippool:
      type: none
    topology:
      type: none
  openpitrix:
    store:
      enabled: false
  servicemesh:
    enabled: false
    istio:
      components:
        ingressGateways:
        - name: istio-ingressgateway
          enabled: false
        cni:
          enabled: false
  edgeruntime:
    enabled: false
    kubeedge:
      enabled: false
      cloudCore:
        cloudHub:
          advertiseAddress:
            - ""
        service:
          cloudhubNodePort: "30000"
          cloudhubQuicNodePort: "30001"
          cloudhubHttpsNodePort: "30002"
          cloudstreamNodePort: "30003"
          tunnelNodePort: "30004"
        # resources: {}
        # hostNetWork: false
      iptables-manager:
        enabled: true
        mode: "external"
        # resources: {}
      # edgeService:
      #   resources: {}
  terminal:
    timeout: 600

A clear and concise description of what happened.

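When installing version v1.21.4-k3s, the [K3sInitClusterModule] Enable k3s service step fails. The error message is: enable k3s failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl enable --now k3s"

I checked the service log with the following command: journalctl -xeu k3s.service. The gist of the log is that ///run/containerd/containerd.sock does not exist; the details are in the log output below.

Looking at the /etc/systemd/system/k3s.service configuration file, I found that its Environment setting specifies a --container-runtime-endpoint value. However, kubekey does not install any additional CRI implementation, so k3s fails to start. After I manually removed that option and reran the create command, the cluster was created successfully.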
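The manual workaround described above (removing the --container-runtime-endpoint option from the unit file) can be sketched in Python. This is only an illustration under stated assumptions: the sample Environment= line and the variable name K3S_EXTRA_ARGS are made up for the example and are not copied from the file KubeKey generates, and the helper strip_runtime_endpoint is hypothetical. It handles both the --flag=value and the --flag value spelling.

```python
import re

# Matches "--container-runtime-endpoint=VALUE" or "--container-runtime-endpoint VALUE",
# together with any whitespace directly in front of it.
# VALUE ends at the next whitespace character or quote.
_ENDPOINT_FLAG = re.compile(r"\s*--container-runtime-endpoint(?:=|\s+)[^\s\"']+")


def strip_runtime_endpoint(unit_text: str) -> str:
    """Return unit_text with every --container-runtime-endpoint option removed."""
    return "\n".join(_ENDPOINT_FLAG.sub("", line) for line in unit_text.split("\n"))


if __name__ == "__main__":
    # Assumed example line; the real file generated by KubeKey may look different.
    sample = (
        'Environment="K3S_EXTRA_ARGS=--node-name=node1 '
        '--container-runtime-endpoint=unix:///run/containerd/containerd.sock"'
    )
    print(strip_runtime_endpoint(sample))
```

On a real node the result would still have to be written back to /etc/systemd/system/k3s.service, followed by systemctl daemon-reload and a rerun of the create command, as described in this report.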

Relevant log output

Mar 07 00:57:02 node1 k3s[30656]: I0307 00:57:02.517798   30656 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Mar 07 00:57:02 node1 k3s[30656]: E0307 00:57:02.518267   30656 remote_runtime.go:86] "Version from runtime service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix ///run/containerd/containerd.sock: connect: no such file or directory\""
Mar 07 00:57:02 node1 k3s[30656]: E0307 00:57:02.518384   30656 kuberuntime_manager.go:208] "Get runtime version failed" err="get remote runtime typed version failed: rpc error: code = Unavailable desc = connection err: desc = \"transport: Error while dialing dial unix ///run/containerd/containerd.sock: connect: no such file or directory\""
Mar 07 00:57:02 node1 k3s[30656]: E0307 00:57:02.518483   30656 server.go:288] "Failed to run kubelet" err="failed to run Kubelet: failed to create kubelet: get remote runtime typed version failed: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix ///run/containerd/containerd.sock: connect: no such file or directory\""
Mar 07 00:57:02 node1 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
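To confirm from a log like the one above which socket k3s could not reach, a short Python sketch can pull the path out of the "dial unix ...: connect: no such file or directory" messages. The function name missing_sockets is hypothetical, and the pattern is based only on the log lines quoted in this issue, so other k3s versions may word the error differently.

```python
import re
from typing import List

# Pattern taken from the log lines in this issue, for example:
#   dial unix ///run/containerd/containerd.sock: connect: no such file or directory
_DIAL_ERROR = re.compile(r"dial unix (\S+?): connect: no such file or directory")


def missing_sockets(log_text: str) -> List[str]:
    """Return the distinct socket paths that could not be dialed, in order of appearance."""
    seen: List[str] = []
    for path in _DIAL_ERROR.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


if __name__ == "__main__":
    # Shortened copy of one log line from this issue.
    line = (
        'k3s[30656]: E0307 00:57:02.518267 "Version from runtime service failed" '
        'err="... dial unix ///run/containerd/containerd.sock: '
        'connect: no such file or directory"'
    )
    for path in missing_sockets(line):
        print(path)
```

If the reported path is the one passed through --container-runtime-endpoint and no external containerd is installed on the node, that matches the situation described in this issue.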

Additional information

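The following is from the k3s documentation:

Agent Runtime

Flag | Default | Description
-- | -- | --
--docker | N/A | Use docker instead of containerd
--container-runtime-endpoint value | N/A | Disable the embedded containerd and use an alternative CRI implementation

xiaods commented 1 year ago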

You can use k8e with kubekey to install a k8s cluster. https://getk8e.com/docs/install/210-kubekey/

omiku commented 1 year ago

> You can use k8e with kubekey to install a k8s cluster. https://getk8e.com/docs/install/210-kubekey/

Is there an installation mirror available for China? Also, is the k3s version no longer maintained?

xiaods commented 1 year ago

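You can install offline. k3s is maintained by the kubekey community. I am the maintainer of k8e, which I built on the k3s codebase with optimizations for enterprise features. I saw that your installation failed, so I recommended installing with k8e. The two tools are drop-in replacements for each other, so the choice is yours.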

omiku commented 1 year ago

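> You can install offline. k3s is maintained by the kubekey community. I am the maintainer of k8e, which I built on the k3s codebase with optimizations for enterprise features. I saw that your installation failed, so I recommended installing with k8e. The two tools are drop-in replacements for each other, so the choice is yours.

I also tried installing k8e with kubekey, and that failed as well. The error was: failed: [LocalHost] [DownloadBinaries] exec failed after 1 retires: Failed to download k8e binary: curl -L -o /root/kubekey/kube/v1.25.7/amd64/k8e https://github.com/xiaods/k8e/releases/download/v1.25.7+k8e2/k8e error: No SHA256 found for k8e. v1.25.7 is not supported.

It looks like the download URL is wrong. When will this be fixed?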

xiaods commented 1 year ago

That is because this PR has not been released yet: https://github.com/kubesphere/kubekey/pull/1740

In the meantime you can install version 1.21:

./kk create cluster --with-kubernetes v1.21.14-k8e

export PATH=/usr/local/bin:$PATH
kubectl get pod -A