apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] patroni postgresql config --> stop --> start cluster is Failed #2512

Closed JashBook closed 1 year ago

JashBook commented 1 year ago

**Describe the bug**
After a Patroni PostgreSQL cluster is reconfigured, stopped, and then started again, the cluster ends up in the Failed status.

    kbcli version
    Kubernetes: v1.25.6-eks-48e63af
    KubeBlocks: 0.5.0-alpha.6
    kbcli: 0.5.0-alpha.6

**To Reproduce**
Steps to reproduce the behavior:

  1. Create the PostgreSQL cluster
    kbcli cluster create pg-cluster --termination-policy=WipeOut --cluster-definition=postgresql --set replicas=2,cpu=300m,memory=500Mi,storage=1Gi
  2. Reconfigure it
    echo yes | kbcli cluster configure pg-cluster \
    --component postgresql \
    --config-spec postgresql-configuration \
    --config-file postgresql.conf \
    --set max_connections=300,shared_buffers=512MB
  3. Stop, then start the cluster
    kbcli cluster stop pg-cluster
    kbcli cluster start pg-cluster
  4. See error

    kbcli cluster describe pg-cluster
    Name: pg-cluster    Created Time: Apr 11,2023 15:50 UTC+0800
    NAMESPACE   CLUSTER-DEFINITION   VERSION             STATUS   TERMINATION-POLICY
    default     postgresql           postgresql-15.2.0   Failed   WipeOut

    Endpoints:
    COMPONENT    MODE        INTERNAL                                               EXTERNAL
    postgresql   ReadWrite   pg-cluster-postgresql.default.svc.cluster.local:5432
                             pg-cluster-postgresql.default.svc.cluster.local:9187

    Topology:
    COMPONENT    INSTANCE                    ROLE        STATUS    AZ                NODE                                                           CREATED-TIME
    postgresql   pg-cluster-postgresql-0     primary     Running   cn-northwest-1a   ip-172-31-13-48.cn-northwest-1.compute.internal/172.31.13.48   Apr 11,2023 16:30 UTC+0800
    postgresql   pg-cluster-postgresql-1-0   secondary   Running   cn-northwest-1c   ip-172-31-44-8.cn-northwest-1.compute.internal/172.31.44.8     Apr 11,2023 16:30 UTC+0800

    Resources Allocation:
    COMPONENT    DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
    postgresql   false       300m / 300m          500Mi / 500Mi           data:1Gi       ebs-sc

    Images:
    COMPONENT    TYPE         IMAGE
    postgresql   postgresql   registry.cn-hangzhou.aliyuncs.com/apecloud/spilo:15.2.0

    Events(last 5 warnings, see more:kbcli cluster list-events -n default pg-cluster):
    TIME                         TYPE      REASON                 OBJECT                               MESSAGE
    Apr 11,2023 16:29 UTC+0800   Warning   ApplyResourcesFailed   Cluster/pg-cluster                   Operation cannot be fulfilled on statefulsets.apps "pg-cluster-postgresql": StorageError: invalid object, Code: 4, Key: /registry/statefulsets/default/pg-cluster-postgresql, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 84df5a84-ee75-446c-be7e-85dbc440991e, UID in object meta:
    Apr 11,2023 16:30 UTC+0800   Warning   Unhealthy              Cluster/pg-cluster                   Pod pg-cluster-postgresql-1-0: Readiness probe failed: 127.0.0.1:5432 - no response
    Apr 11,2023 16:31 UTC+0800   Warning   Unhealthy              Cluster/pg-cluster                   Pod pg-cluster-postgresql-0: Readiness probe failed: 127.0.0.1:5432 - no response
    Apr 11,2023 16:35 UTC+0800   Warning   Unhealthy              Instance/pg-cluster-postgresql-1-0   Readiness probe failed: 127.0.0.1:5432 - no response
    Apr 11,2023 16:35 UTC+0800   Warning   Unhealthy              Instance/pg-cluster-postgresql-0     Readiness probe failed: 127.0.0.1:5432 - no response

    kubectl get pod,ops,cm -l app.kubernetes.io/instance=pg-cluster
    NAME                            READY   STATUS    RESTARTS   AGE
    pod/pg-cluster-postgresql-0     3/4     Running   0          5m24s
    pod/pg-cluster-postgresql-1-0   3/4     Running   0          5m24s

    NAME                                                          TYPE            CLUSTER      STATUS    PROGRESS   AGE
    opsrequest.apps.kubeblocks.io/pg-cluster-reconfiguring-pmr6n   Reconfiguring   pg-cluster   Succeed   2/2        40m
    opsrequest.apps.kubeblocks.io/pg-cluster-start-bbbdg           Start           pg-cluster   Failed    2/2        5m25s
    opsrequest.apps.kubeblocks.io/pg-cluster-stop-xjx4f            Stop            pg-cluster   Succeed   2/2        6m10s

    NAME                                                       DATA   AGE
    configmap/pg-cluster-postgresql-env                        5      5m24s
    configmap/pg-cluster-postgresql-patroni-config             0      44m
    configmap/pg-cluster-postgresql-patroni-leader             0      44m
    configmap/pg-cluster-postgresql-postgresql-configuration   3      5m24s
    configmap/pg-cluster-postgresql-postgresql-custom-metrics  1      5m24s
    configmap/pg-cluster-postgresql-postgresql-scripts         3      5m24s
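
Both pods above report `3/4` in the READY column, meaning one of the four containers in each pod is not ready. As a minimal sketch for spotting this in longer listings, the helper below filters `kubectl get pod` table output for rows whose ready count is below the total. The function name `not_ready_pods` is hypothetical, and the sketch assumes READY is the second column, as in the default table layout shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "name ready/total" for every pod row whose
# READY count is lower than its container total.
# Reads `kubectl get pod` table output on stdin; assumes READY is column 2.
not_ready_pods() {
  awk '
    {
      n = split($2, r, "/")
      # Only rows whose second field looks like "<digits>/<digits>" are pods.
      if (n == 2 && r[1] ~ /^[0-9]+$/ && r[2] ~ /^[0-9]+$/ && r[1] + 0 < r[2] + 0)
        print $1, $2
    }
  '
}
```

For example, `kubectl get pod -l app.kubernetes.io/instance=pg-cluster | not_ready_pods` would list the affected pods. Rows for OpsRequests and ConfigMaps are skipped because their second column is not a `ready/total` pair.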


Pod logs:

    kubectl logs pg-cluster-postgresql-0
    Defaulted container "postgresql" out of: postgresql, metrics, kb-checkrole, config-manager, pg-init-container (init)

    kubectl logs pg-cluster-postgresql-1-0
    Defaulted container "postgresql" out of: postgresql, metrics, kb-checkrole, config-manager, pg-init-container (init)
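
The log commands above only cover the default `postgresql` container. Since each pod runs several containers, it may help to pull logs from each one with `kubectl logs <pod> -c <container>`. The sketch below extracts the non-init container names from the "Defaulted container" notice so they can be looped over. The function name `containers_from_notice` is hypothetical, and the parsing assumes the notice keeps the exact wording shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read kubectl's "Defaulted container ..." notice on
# stdin and print one non-init container name per line.
containers_from_notice() {
  sed -n 's/^.*Defaulted container "[^"]*" out of: //p' |
    tr ',' '\n' |
    sed -e 's/^ *//' -e '/(init)$/d'
}
```

A possible use, assuming the notice is written to stderr (hence the `2>&1`): `kubectl logs pg-cluster-postgresql-0 2>&1 | containers_from_notice | while read -r c; do kubectl logs pg-cluster-postgresql-0 -c "$c" --tail=50; done`.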



**Expected behavior**
After config --> stop --> start, the Patroni PostgreSQL cluster starts successfully and does not end up in the Failed status.
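
The reproduction steps and the expected outcome can be sketched as one script. The `kbcli` commands are copied from the steps in this report; everything else (`run`, `cluster_status`, `wait_for_status`, `reproduce`, the `DRY_RUN` and `CLUSTER` variables, the polling interval) is a hypothetical helper added for illustration. The status names `Running` and `Stopped` and the layout of the `kbcli cluster describe` header row are assumptions based on the output shown in this issue.

```bash
#!/usr/bin/env bash
# Sketch only: replays config --> stop --> start and checks the final status.
CLUSTER="${CLUSTER:-pg-cluster}"

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Read `kbcli cluster describe` output on stdin and print the STATUS value.
# Assumes the value row directly follows the NAMESPACE ... STATUS header row.
cluster_status() {
  awk '
    !found {
      start = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "NAMESPACE") start = i
        if ($i == "STATUS" && start) { col = i - start + 1; found = 1 }
      }
      next
    }
    { print $col; exit }
  '
}

# Poll until the cluster reports the wanted status (default: 60 tries, 5 s apart).
wait_for_status() {
  local want="$1" tries="${2:-60}" got=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ wait for status $want"
    return 0
  fi
  while [ "$tries" -gt 0 ]; do
    got="$(kbcli cluster describe "$CLUSTER" | cluster_status)"
    [ "$got" = "$want" ] && return 0
    sleep 5
    tries=$((tries - 1))
  done
  echo "cluster $CLUSTER is '$got', expected '$want'" >&2
  return 1
}

reproduce() {
  run kbcli cluster create "$CLUSTER" --termination-policy=WipeOut \
    --cluster-definition=postgresql \
    --set replicas=2,cpu=300m,memory=500Mi,storage=1Gi || return 1
  wait_for_status Running || return 1

  echo yes | run kbcli cluster configure "$CLUSTER" \
    --component postgresql \
    --config-spec postgresql-configuration \
    --config-file postgresql.conf \
    --set max_connections=300,shared_buffers=512MB || return 1
  wait_for_status Running || return 1

  run kbcli cluster stop "$CLUSTER" || return 1
  wait_for_status Stopped || return 1

  run kbcli cluster start "$CLUSTER" || return 1
  # Expected behavior: the cluster comes back as Running, not Failed.
  wait_for_status Running
}
```

Calling `reproduce` with `DRY_RUN=1` set only prints the commands, which is a safe way to review them first. Without it, a non-zero return from the final `wait_for_status Running` would correspond to the failure described in this report.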

sophon-zt commented 1 year ago

related issue: #2523