apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] 1.0 MySQL cluster creation failed: Back-off restarting failed container kbagent-worker in pod #8519

Closed tianyue86 closed 4 days ago

tianyue86 commented 4 days ago

Describe the env

Kubernetes: v1.31.1-aliyun.1
KubeBlocks: 1.0.0-beta.5
kbcli: 1.0.0-beta.3

To Reproduce

  1. Apply the following YAML to create a MySQL cluster:
    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    metadata:
      name: mysqlc02
      labels:
        helm.sh/chart: mysql-cluster-1.0.0-alpha.0
        app.kubernetes.io/version: "8.0.33"
        app.kubernetes.io/instance: mysqlc02
      namespace: default
      annotations:
    spec:
      terminationPolicy: Delete
      componentSpecs:
        - name: mysql
          componentDef: mysql-8.0
          disableExporter: true
          replicas: 1
          resources:
            limits:
              cpu: "0.5"
              memory: "0.5Gi"
            requests:
              cpu: "0.5"
              memory: "0.5Gi"
          volumeClaimTemplates:
            - name: data # ref clusterDefinition components.containers.volumeMounts.name
              spec:
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 20Gi
  2. Check the cluster status: Failed
    k get cluster -A
    NAMESPACE   NAME           CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS    AGE
    default     mysqlc02                            Delete               Failed    2m46s
  3. Check the PVC: Bound
    k get pvc
    NAME                    STATUS   VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS    VOLUMEATTRIBUTESCLASS   AGE
    data-mysql3c2-mysql-0   Bound    d-8vb88aki0yj1qr8tmxos   20Gi       RWO            kb-default-sc   <unset>                 55m
    data-mysqlc02-mysql-0   Bound    d-8vb9c7agv97fjozfg8ul   20Gi       RWO            kb-default-sc   <unset>                 14m
  4. Check the pod: CrashLoopBackOff
    k get pod
    NAME               READY   STATUS                  RESTARTS        AGE
    mysqlc02-mysql-0   0/4     Init:CrashLoopBackOff   6 (4m22s ago)   10m
  5. Describe the pod: Back-off restarting failed container kbagent-worker in pod
    Events:
    Type     Reason                  Age                  From                     Message
    ----     ------                  ----                 ----                     -------
    Normal   Scheduled               11m                  default-scheduler        Successfully assigned default/mysqlc02-mysql-0 to cn-zhangjiakou.10.0.0.141
    Normal   SuccessfulAttachVolume  11m                  attachdetach-controller  AttachVolume.Attach succeeded for volume "d-8vb9c7agv97fjozfg8ul"
    Normal   AllocIPSucceed          11m                  terway-daemon            Alloc IP 10.0.0.187/24 took 30.800037ms
    Normal   Pulled                  11m                  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/mysql_audit_log:8.0.33" already present on machine
    Normal   Created                 11m                  kubelet                  Created container init-data
    Normal   Started                 11m                  kubelet                  Started container init-data
    Normal   Pulling                 11m                  kubelet                  Pulling image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/xtrabackup:8.0"
    Normal   Pulled                  10m                  kubelet                  Successfully pulled image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/xtrabackup:8.0" in 25.596s (25.596s including waiting). Image size: 484944832 bytes.
    Normal   Created                 10m                  kubelet                  Created container init-xtrabackup
    Normal   Started                 10m                  kubelet                  Started container init-xtrabackup
    Normal   Pulled                  10m                  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/jemalloc:5.3.0" already present on machine
    Normal   Created                 10m                  kubelet                  Created container init-jemalloc
    Normal   Started                 10m                  kubelet                  Started container init-jemalloc
    Normal   Pulled                  10m                  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/syncer:0.3.3" already present on machine
    Normal   Created                 10m                  kubelet                  Created container init-syncer
    Normal   Started                 10m                  kubelet                  Started container init-syncer
    Normal   Pulled                  10m                  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl:0.1.5" already present on machine
    Normal   Created                 10m                  kubelet                  Created container init-dbctl
    Normal   Started                 10m                  kubelet                  Started container init-dbctl
    Normal   Started                 10m (x2 over 10m)    kubelet                  Started container kbagent-worker
    Normal   Pulled                  9m57s (x3 over 10m)  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.0-beta.5" already present on machine
    Normal   Created                 9m57s (x3 over 10m)  kubelet                  Created container kbagent-worker
    Warning  BackOff                 52s (x43 over 10m)   kubelet                  Back-off restarting failed container kbagent-worker in pod mysqlc02-mysql-0_default(ecc1d39f-b544-4762-b28f-00879faa9688)
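
The events above only show that kbagent-worker keeps restarting, not why. A minimal sketch of how the failure reason could be pulled from the pod, assuming kbagent-worker runs as an init container (as the Init:CrashLoopBackOff status suggests). The helper name show_init_container_failure is hypothetical and not part of the original report; it only wraps standard kubectl calls.

```shell
# Hypothetical helper (not from the original report): print the logs and the
# last termination state of a crashing init container.
show_init_container_failure() {
  pod="$1"
  namespace="$2"
  container="$3"

  # Logs from the previous (crashed) run of the container.
  kubectl logs "$pod" -n "$namespace" -c "$container" --previous

  # Last terminated state (exit code, reason) recorded for that init container.
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath="{.status.initContainerStatuses[?(@.name==\"$container\")].lastState.terminated}"
}
```

For the pod in this report, the call would be: show_init_container_failure mysqlc02-mysql-0 default kbagent-worker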

leon-inf commented 4 days ago

This has been fixed by commit 62f1be087a9c7bdc93708ced9764a2024e60bd90.