apecloud / kubeblocks

KubeBlocks is open-source control plane software that runs and manages databases, message queues, and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] pg cluster is always in Creating status due to "failed to bootstrap from leader" #8359

Closed: tianyue86 closed this issue 2 days ago

tianyue86 commented 2 days ago

Describe the bug

tianyue@192 kbcli % kbcli version
Kubernetes: v1.30.4-eks-a737599
KubeBlocks: 1.0.0-alpha.11
kbcli: 1.0.0-alpha.0

To Reproduce

Steps to reproduce the behavior:

  1. Create a pg cluster with the following YAML:

    apiVersion: apps.kubeblocks.io/v1
    kind: Cluster
    metadata:
      name: postgres-rmlqaf
      namespace: default
    spec:
      terminationPolicy: Delete
      clusterDef: postgresql
      topology: replication
      componentSpecs:
        - name: postgresql
          replicas: 2
          serviceAccountName:
          disableExporter: true
          resources:
            limits:
              cpu: 100m
              memory: 0.5Gi
            requests:
              cpu: 100m
              memory: 0.5Gi
          volumeClaimTemplates:
            - name: data
              spec:
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 3Gi
  2. Check the cluster status:

    NAMESPACE   NAME              CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS     AGE
    default     postgres-rmlqaf                                  Delete               Creating   83m
  3. Describe the cluster (screenshot omitted)

  4. Check the pod status:

    tianyue@192 kbcli % k get pod | grep postgresql
    postgres-rmlqaf-postgresql-0                             3/4     Running   0          84m
    postgres-rmlqaf-postgresql-1                             4/4     Running   0          84m
  5. Check the logs of pod postgres-rmlqaf-postgresql-0:

    2024-10-30 06:55:44,178 INFO: Lock owner: postgres-rmlqaf-postgresql-1; I am postgres-rmlqaf-postgresql-0
    2024-10-30 06:55:44,178 INFO: trying to bootstrap from leader 'postgres-rmlqaf-postgresql-1'
    2024-10-30 06:55:44,179 ERROR: failed to bootstrap from leader 'postgres-rmlqaf-postgresql-1'
    2024-10-30 06:55:44,179 INFO: Removing data directory: /home/postgres/pgdata/pgroot/data
    2024-10-30 06:55:44,179 INFO: Lock owner: postgres-rmlqaf-postgresql-1; I am postgres-rmlqaf-postgresql-0
    2024-10-30 06:55:44,179 INFO: bootstrap from leader 'postgres-rmlqaf-postgresql-1' in progress
    2024-10-30 06:55:54,176 INFO: Lock owner: postgres-rmlqaf-postgresql-1; I am postgres-rmlqaf-postgresql-0
    2024-10-30 06:55:54,176 INFO: trying to bootstrap from leader 'postgres-rmlqaf-postgresql-1'
    2024-10-30 06:55:54,177 ERROR: failed to bootstrap from leader 'postgres-rmlqaf-postgresql-1'
    2024-10-30 06:55:54,177 INFO: Removing data directory: /home/postgres/pgdata/pgroot/data
  6. Check the PVCs (screenshot omitted)

Further testing: the following YAML, which uses the standalone topology with a single replica, creates a pg cluster successfully:

apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: pgcluster06
  namespace: default
spec:
  terminationPolicy: Delete
  clusterDef: postgresql
  topology: standalone
  componentSpecs:
    - name: postgresql
      replicas: 1
      serviceAccountName:
      disableExporter: true
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi


Y-Rookie commented 2 days ago

    - name: {{ include "postgresql-cluster.component-name" . }}
      labels:
        {{- include "postgresql-cluster.patroni-scope-label" . | indent 8 }}

Add the label rendered by postgresql-cluster.patroni-scope-label to the postgresql component in the cluster YAML, as the Helm template above does.
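
For a hand-written manifest, the reproduction YAML with the label filled in might look like the sketch below. The label key `apps.kubeblocks.postgres.patroni/scope` and the `<cluster name>-postgresql` value are assumptions about what the helper renders and are not confirmed in this thread. Check `_helpers.tpl` in the postgresql-cluster chart for the actual key and value before relying on it.

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: postgres-rmlqaf
  namespace: default
spec:
  terminationPolicy: Delete
  clusterDef: postgresql
  topology: replication
  componentSpecs:
    - name: postgresql
      # ASSUMPTION: key and value below are a guess at what
      # "postgresql-cluster.patroni-scope-label" renders to;
      # verify against the chart's _helpers.tpl.
      labels:
        apps.kubeblocks.postgres.patroni/scope: postgres-rmlqaf-postgresql
      replicas: 2
      disableExporter: true
      resources:
        limits:
          cpu: 100m
          memory: 0.5Gi
        requests:
          cpu: 100m
          memory: 0.5Gi
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 3Gi
```

The only change from the failing manifest is the `labels` entry on the component (the empty `serviceAccountName` field is left out here for brevity).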

tianyue86 commented 2 days ago

The cluster can be created successfully after adding this label. Thanks all for helping investigate!