apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] redis-cluster: after restart ops, all pod roles are primary #7589

Open JashBook opened 1 week ago

JashBook commented 1 week ago

Describe the bug After running a restart ops on a redis-cluster, both replicas of a shard are reported as primary, and the shard members no longer see each other in `cluster nodes`.

To Reproduce Steps to reproduce the behavior:

  1. create redis cluster
    kubectl apply -f - <<EOF
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: Cluster
    metadata:
      name: rcluster-cluster
      namespace: default
    spec:
      terminationPolicy: Delete
      shardingSpecs:
      - name: shard
        shards: 3
        template:
          name: redis
          componentDef: redis-cluster-7
          replicas: 2
          switchPolicy:
            type: Noop
          resources:
            limits:
              cpu: 100m
              memory: 0.5Gi
            requests:
              cpu: 100m
              memory: 0.5Gi
          volumeClaimTemplates:
          - name: data
            spec:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
    EOF
    
    kubectl get cluster    
    NAME               CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS    AGE
    rcluster-cluster                                  Delete               Running   52s

kbcli cluster list-instances rcluster-cluster
NAME   NAMESPACE   CLUSTER   COMPONENT   STATUS   ROLE   ACCESSMODE   AZ   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE   NODE   CREATED-TIME
rcluster-cluster-shard-42p-0 default rcluster-cluster shard-42p Running primary us-central1-a 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-7l88/10.128.0.44 Jun 21,2024 12:52 UTC+0800
rcluster-cluster-shard-42p-1 default rcluster-cluster shard-42p Running secondary us-central1-b 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8/10.128.0.55 Jun 21,2024 12:52 UTC+0800
rcluster-cluster-shard-jfs-0 default rcluster-cluster shard-jfs Running primary us-central1-b 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8/10.128.0.55 Jun 21,2024 12:52 UTC+0800
rcluster-cluster-shard-jfs-1 default rcluster-cluster shard-jfs Running secondary us-central1-a 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-7l88/10.128.0.44 Jun 21,2024 12:52 UTC+0800
rcluster-cluster-shard-qqs-0 default rcluster-cluster shard-qqs Running primary us-central1-b 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8/10.128.0.55 Jun 21,2024 12:52 UTC+0800
rcluster-cluster-shard-qqs-1 default rcluster-cluster shard-qqs Running secondary us-central1-a 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-7l88/10.128.0.44 Jun 21,2024 12:52 UTC+0800

kubectl get pod -l app.kubernetes.io/instance=rcluster-cluster -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP             NODE
rcluster-cluster-shard-42p-0   3/3     Running   0          78s   10.116.44.7    gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-7l88
rcluster-cluster-shard-42p-1   3/3     Running   0          78s   10.116.45.28   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8
rcluster-cluster-shard-jfs-0   3/3     Running   0          76s   10.116.45.29   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8
rcluster-cluster-shard-jfs-1   3/3     Running   0          76s   10.116.44.9    gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-7l88
rcluster-cluster-shard-qqs-0   3/3     Running   0          78s   10.116.45.30   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8
rcluster-cluster-shard-qqs-1   3/3     Running   0          78s   10.116.44.8    gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-7l88
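The role KubeBlocks records for each replica can also be read straight off the pod labels; a minimal sketch, assuming the `kubeblocks.io/role` pod label that KubeBlocks maintains for role-labeled workloads:

# Show the KubeBlocks role label next to each pod (label name assumed)
kubectl get pod -l app.kubernetes.io/instance=rcluster-cluster -L kubeblocks.io/role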

2. restart cluster

kbcli cluster restart rcluster-cluster --auto-approve
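kbcli submits a Restart OpsRequest under the hood; a roughly equivalent manifest is sketched below (v1alpha1 field names; using the shardingSpec name as the component name is an assumption here):

kubectl create -f - <<EOF
apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  generateName: rcluster-cluster-restart-
  namespace: default
spec:
  clusterRef: rcluster-cluster
  type: Restart
  restart:
  - componentName: shard   # sharding name from shardingSpecs (assumed)
EOF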

kubectl get ops
NAME                             TYPE      CLUSTER            STATUS    PROGRESS   AGE
rcluster-cluster-restart-6kg9g   Restart   rcluster-cluster   Succeed   6/6        8m47s

kubectl get cluster
NAME               CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS    AGE
rcluster-cluster                                  Delete               Running   10m

kubectl get pod -l app.kubernetes.io/instance=rcluster-cluster -o wide
NAME                           READY   STATUS    RESTARTS     AGE     IP             NODE
rcluster-cluster-shard-42p-0   3/3     Running   0            2m46s   10.116.8.46    gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-xx58
rcluster-cluster-shard-42p-1   3/3     Running   0            9m17s   10.116.45.36   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8
rcluster-cluster-shard-jfs-0   3/3     Running   0            8m54s   10.116.45.38   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8
rcluster-cluster-shard-jfs-1   2/3     Running   1 (6s ago)   2m46s   10.116.25.66   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-9w0d
rcluster-cluster-shard-qqs-0   3/3     Running   0            8m55s   10.116.45.37   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8
rcluster-cluster-shard-qqs-1   2/3     Running   1 (2s ago)   2m47s   10.116.25.67   gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-9w0d


3. See the error: all pod roles are primary

kubectl get pod
NAME                           READY   STATUS      RESTARTS   AGE
rcluster-cluster-shard-42p-0   0/3     Completed   0          5m45s
rcluster-cluster-shard-42p-1   3/3     Running     0          6m9s
rcluster-cluster-shard-jfs-0   3/3     Running     0          5m46s
rcluster-cluster-shard-jfs-1   0/3     Completed   1          6m7s
rcluster-cluster-shard-qqs-0   3/3     Running     0          5m47s
rcluster-cluster-shard-qqs-1   0/3     Completed   1          6m9s

kubectl get pod
NAME                           READY   STATUS    RESTARTS   AGE
rcluster-cluster-shard-42p-0   3/3     Running   0          93s
rcluster-cluster-shard-42p-1   3/3     Running   0          8m4s
rcluster-cluster-shard-jfs-0   3/3     Running   0          7m41s
rcluster-cluster-shard-jfs-1   3/3     Running   0          93s
rcluster-cluster-shard-qqs-0   3/3     Running   0          7m42s
rcluster-cluster-shard-qqs-1   3/3     Running   0          94s

kbcli cluster list-instances rcluster-cluster
NAME   NAMESPACE   CLUSTER   COMPONENT   STATUS   ROLE   ACCESSMODE   AZ   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE   NODE   CREATED-TIME
rcluster-cluster-shard-42p-0 default rcluster-cluster shard-42p Running primary us-central1-a 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-xx58/10.128.15.204 Jun 21,2024 13:00 UTC+0800
rcluster-cluster-shard-42p-1 default rcluster-cluster shard-42p Running secondary us-central1-b 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8/10.128.0.55 Jun 21,2024 12:54 UTC+0800
rcluster-cluster-shard-jfs-0 default rcluster-cluster shard-jfs Running primary us-central1-b 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8/10.128.0.55 Jun 21,2024 12:54 UTC+0800
rcluster-cluster-shard-jfs-1 default rcluster-cluster shard-jfs Running primary us-central1-a 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-9w0d/10.128.0.77 Jun 21,2024 13:00 UTC+0800
rcluster-cluster-shard-qqs-0 default rcluster-cluster shard-qqs Running primary us-central1-b 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-219d1488-cwn8/10.128.0.55 Jun 21,2024 12:54 UTC+0800
rcluster-cluster-shard-qqs-1 default rcluster-cluster shard-qqs Running primary us-central1-a 100m / 100m 512Mi / 512Mi gke-cicd-gke-q3zypql-cicd-gke-q3zypql-771d13bc-9w0d/10.128.0.77 Jun 21,2024 13:00 UTC+0800
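To separate what Redis itself thinks from what the KubeBlocks labels report, each pod can be asked for its role directly; a quick cross-check sketch (container name and password taken from this report):

# Print the role Redis reports ("master"/"slave") for every pod
for pod in $(kubectl get pod -l app.kubernetes.io/instance=rcluster-cluster -o name); do
  echo "== $pod"
  kubectl exec "$pod" -c redis-cluster -- redis-cli -a O3605v7HsS role | head -1
done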

logs pod-0 redis-cluster

kubectl logs rcluster-cluster-shard-jfs-0 redis-cluster

logs pod-1 redis-cluster

kubectl logs rcluster-cluster-shard-jfs-1 redis-cluster

logs pod-0 lorry

kubectl logs rcluster-cluster-shard-jfs-0 lorry
2024-06-21T04:54:36Z INFO Initialize DB manager
2024-06-21T04:54:36Z INFO KB_WORKLOAD_TYPE ENV not set
2024-06-21T04:54:36Z INFO Volume-Protection succeed to init volume protection {"pod": "rcluster-cluster-shard-jfs-0", "spec": {"highWatermark":"0","volumes":[]}}
2024-06-21T04:54:36Z INFO HTTPServer Starting HTTP Server
2024-06-21T04:54:36Z INFO HTTPServer API route path {"method": "POST", "path": ["/v1.0/checkrunning", "/v1.0/rebuild", "/v1.0/grantuserrole", "/v1.0/unlockinstance", "/v1.0/switchover", "/v1.0/revokeuserrole", "/v1.0/exec", "/v1.0/deleteuser", "/v1.0/volumeprotection", "/v1.0/getlag", "/v1.0/leavemember", "/v1.0/joinmember", "/v1.0/createuser", "/v1.0/lockinstance", "/v1.0/postprovision", "/v1.0/preterminate", "/v1.0/datadump", "/v1.0/dataload"]}
2024-06-21T04:54:36Z INFO HTTPServer API route path {"method": "GET", "path": ["/v1.0/query", "/v1.0/describeuser", "/v1.0/listsystemaccounts", "/v1.0/checkrole", "/v1.0/getrole", "/v1.0/healthycheck", "/v1.0/listusers"]}
2024-06-21T04:54:36Z INFO cronjobs env is not set {"env": "KB_CRON_JOBS"}
2024-06-21T04:54:46Z INFO Redis DB startup ready
2024-06-21T04:54:46Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-cluster,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-jfs
2024-06-21T04:54:46Z INFO DCS-K8S podlist: 2
2024-06-21T04:54:46Z INFO DCS-K8S Leader configmap is not found {"configmap": "rcluster-cluster-shard-jfs-leader"}
2024-06-21T04:54:46Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-cluster,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-jfs
2024-06-21T04:54:46Z INFO DCS-K8S podlist: 2
2024-06-21T04:54:46Z DEBUG checkrole check member {"member": "rcluster-cluster-shard-jfs-0", "role": ""}
2024-06-21T04:54:46Z DEBUG checkrole check member {"member": "rcluster-cluster-shard-jfs-1", "role": "primary"}
2024-06-21T04:54:46Z INFO checkrole there is a another leader {"member": "rcluster-cluster-shard-jfs-1"}
2024-06-21T04:54:46Z INFO checkrole another leader's lorry is online, just ignore {"member": "rcluster-cluster-shard-jfs-1"}
2024-06-21T04:54:46Z INFO event send event: map[event:Success operation:checkRole originalRole:waitForStart role:{"term":"1718945686592801","PodRoleNamePairs":[{"podName":"rcluster-cluster-shard-jfs-0","roleName":"primary","podUid":"200c9c9a-f17f-41a8-9223-9ee87e73b490"}]}]
2024-06-21T04:54:46Z INFO event send event success {"message": "{\"event\":\"Success\",\"operation\":\"checkRole\",\"originalRole\":\"waitForStart\",\"role\":\"{\\"term\\":\\"1718945686592801\\",\\"PodRoleNamePairs\\":[{\\"podName\\":\\"rcluster-cluster-shard-jfs-0\\",\\"roleName\\":\\"primary\\",\\"podUid\\":\\"200c9c9a-f17f-41a8-9223-9ee87e73b490\\"}]}\"}"}

logs pod-1 lorry

kubectl logs rcluster-cluster-shard-jfs-1 lorry
2024-06-21T05:01:04Z INFO Initialize DB manager
2024-06-21T05:01:04Z INFO KB_WORKLOAD_TYPE ENV not set
2024-06-21T05:01:04Z INFO Volume-Protection succeed to init volume protection {"pod": "rcluster-cluster-shard-jfs-1", "spec": {"highWatermark":"0","volumes":[]}}
2024-06-21T05:01:04Z INFO HTTPServer Starting HTTP Server
2024-06-21T05:01:04Z INFO HTTPServer API route path {"method": "POST", "path": ["/v1.0/leavemember", "/v1.0/exec", "/v1.0/volumeprotection", "/v1.0/switchover", "/v1.0/lockinstance", "/v1.0/postprovision", "/v1.0/dataload", "/v1.0/preterminate", "/v1.0/unlockinstance", "/v1.0/checkrunning", "/v1.0/joinmember", "/v1.0/rebuild", "/v1.0/revokeuserrole", "/v1.0/datadump", "/v1.0/getlag", "/v1.0/deleteuser", "/v1.0/grantuserrole", "/v1.0/createuser"]}
2024-06-21T05:01:04Z INFO HTTPServer API route path {"method": "GET", "path": ["/v1.0/listsystemaccounts", "/v1.0/checkrole", "/v1.0/describeuser", "/v1.0/getrole", "/v1.0/healthycheck", "/v1.0/query", "/v1.0/listusers"]}
2024-06-21T05:01:04Z INFO cronjobs env is not set {"env": "KB_CRON_JOBS"}
2024-06-21T05:01:12Z INFO Redis DB startup ready
2024-06-21T05:01:12Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-cluster,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-jfs
2024-06-21T05:01:12Z INFO DCS-K8S podlist: 2
2024-06-21T05:01:12Z INFO DCS-K8S Leader configmap is not found {"configmap": "rcluster-cluster-shard-jfs-leader"}
2024-06-21T05:01:12Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-cluster,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-jfs
2024-06-21T05:01:12Z INFO DCS-K8S podlist: 2
2024-06-21T05:01:12Z DEBUG checkrole check member {"member": "rcluster-cluster-shard-jfs-0", "role": "primary"}
2024-06-21T05:01:12Z INFO checkrole there is a another leader {"member": "rcluster-cluster-shard-jfs-0"}
2024-06-21T05:01:12Z INFO checkrole another leader's lorry is online, just ignore {"member": "rcluster-cluster-shard-jfs-0"}
2024-06-21T05:01:12Z DEBUG checkrole check member {"member": "rcluster-cluster-shard-jfs-1", "role": ""}
2024-06-21T05:01:12Z INFO event send event: map[event:Success operation:checkRole originalRole:waitForStart role:{"term":"1718946072918100","PodRoleNamePairs":[{"podName":"rcluster-cluster-shard-jfs-1","roleName":"primary","podUid":"545d1043-b09a-4df5-9702-d57a9a0a6ac7"}]}]
2024-06-21T05:01:12Z INFO event send event success {"message": "{\"event\":\"Success\",\"operation\":\"checkRole\",\"originalRole\":\"waitForStart\",\"role\":\"{\\"term\\":\\"1718946072918100\\",\\"PodRoleNamePairs\\":[{\\"podName\\":\\"rcluster-cluster-shard-jfs-1\\",\\"roleName\\":\\"primary\\",\\"podUid\\":\\"545d1043-b09a-4df5-9702-d57a9a0a6ac7\\"}]}\"}"}
2024-06-21T05:03:19Z ERROR Redis Role query error {"error": "dial tcp 127.0.0.1:6379: connect: connection refused"}
github.com/apecloud/kubeblocks/pkg/lorry/engines/redis.(*Manager).GetReplicaRole
    /src/pkg/lorry/engines/redis/get_replica_role.go:44
github.com/apecloud/kubeblocks/pkg/lorry/operations/replica.(*CheckRole).Do
    /src/pkg/lorry/operations/replica/checkrole.go:144
github.com/apecloud/kubeblocks/pkg/lorry/httpserver.(*api).RegisterOperations.OperationWrapper.func1
    /src/pkg/lorry/httpserver/apis.go:119
github.com/fasthttp/router.(*Router).Handler
    /go/pkg/mod/github.com/fasthttp/router@v1.4.20/router.go:420
github.com/apecloud/kubeblocks/pkg/lorry/httpserver.(*server).StartNonBlocking.(*server).apiLogger.func2
    /src/pkg/lorry/httpserver/server.go:120
github.com/valyala/fasthttp.(*Server).serveConn
    /go/pkg/mod/github.com/valyala/fasthttp@v1.50.0/server.go:2359
github.com/valyala/fasthttp.(*workerPool).workerFunc
    /go/pkg/mod/github.com/valyala/fasthttp@v1.50.0/workerpool.go:224
github.com/valyala/fasthttp.(*workerPool).getCh.func1
    /go/pkg/mod/github.com/valyala/fasthttp@v1.50.0/workerpool.go:196
2024-06-21T05:03:19Z INFO checkrole executing checkRole error {"error": "dial tcp 127.0.0.1:6379: connect: connection refused"}
2024-06-21T05:03:19Z INFO checkrole role checks failed continuously {"times": 0}
2024-06-21T05:03:19Z INFO event send event: map[operation:checkRole originalRole:primary]
2024-06-21T05:03:19Z INFO event send event success {"message": "{\"operation\":\"checkRole\",\"originalRole\":\"primary\"}"}
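Both replicas log `Leader configmap is not found` right after their restart and each then reports itself as primary, so the leader record appears to have been lost across the restart. The configmap named in the log can be inspected directly (a sketch; its exact content layout is not assumed):

# Leader configmap consulted by lorry, per the log line above
kubectl get configmap rcluster-cluster-shard-jfs-leader -n default -o yaml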

kubectl exec -it rcluster-cluster-shard-jfs-0 bash

root@rcluster-cluster-shard-jfs-0:/# redis-cli -a O3605v7HsS

127.0.0.1:6379> cluster nodes
ec893952e27517db9ee0094815fb0f0ef4633820 10.116.45.30:6379@16379,rcluster-cluster-shard-qqs-0.rcluster-cluster-shard-qqs-headless.default.svc master,fail? - 1718945677694 1718945676686 1 connected 0-5460
6dedd0fbec884b96bc6c650f339b3fa59bd51c0b 10.116.44.7:6379@16379,rcluster-cluster-shard-42p-0.rcluster-cluster-shard-42p-headless.default.svc master,fail? - 1718945678698 1718945676686 2 connected 5461-10922
b81837847cdf48c1f6c33befcf95c3979d2293ad 10.116.45.38:6379@16379,rcluster-cluster-shard-jfs-0.rcluster-cluster-shard-jfs-headless.default.svc myself,master - 0 1718945676686 3 connected 10923-16383

kubectl exec -it rcluster-cluster-shard-jfs-1 bash
root@rcluster-cluster-shard-jfs-1:/# redis-cli -a O3605v7HsS
127.0.0.1:6379> cluster nodes
542fd78cd8d275d3a9e8e6f60b95279c91f5875c 10.116.25.66:6379@16379,rcluster-cluster-shard-jfs-1.rcluster-cluster-shard-jfs-headless.default.svc myself,master - 0 1718946199239 0 connected
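jfs-1 above sees only itself as a slotless master while jfs-0 still owns slots 10923-16383, i.e. the shard has split into two one-node views. Checking cluster_state and the known-node count on every pod makes this visible at a glance (a sketch reusing the password from above):

for pod in $(kubectl get pod -l app.kubernetes.io/instance=rcluster-cluster -o name); do
  echo "== $pod"
  kubectl exec "$pod" -c redis-cluster -- redis-cli -a O3605v7HsS cluster info \
    | grep -E 'cluster_state|cluster_known_nodes'
done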


**Expected behavior**
After the restart ops completes, each shard should again have exactly one primary and one secondary, and every node should remain a member of the Redis Cluster.


**Environment:**

kbcli version
Kubernetes: v1.27.13-gke.1070000
KubeBlocks: 0.9.0-beta.34
kbcli: 0.9.0-beta.27



JashBook commented 6 days ago

The issue reappears on kubeblocks 0.9.0-beta.39:

  1. create redis cluster
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: Cluster
    metadata:
      name: rcluster-dwdzfb
      namespace: default
    spec:
      terminationPolicy: Delete
      shardingSpecs:
      - name: shard
        shards: 3
        template:
          name: redis
          componentDef: redis-cluster-7
          replicas: 2
          switchPolicy:
            type: Noop
          resources:
            limits:
              cpu: 100m
              memory: 0.5Gi
            requests:
              cpu: 100m
              memory: 0.5Gi
          volumeClaimTemplates:
          - name: data
            spec:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
  2. restart
    kbcli cluster restart rcluster-dwdzfb  --auto-approve 
  3. see error
    kbcli cluster list-instances rcluster-dwdzfb
    NAME                          NAMESPACE   CLUSTER           COMPONENT   STATUS    ROLE        ACCESSMODE   AZ              CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE   NODE                                                             CREATED-TIME                 
    rcluster-dwdzfb-shard-dn8-0   default     rcluster-dwdzfb   shard-dn8   Running   primary     <none>       us-central1-f   100m / 100m          512Mi / 512Mi           <none>    gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic/10.10.0.70   Jun 27,2024 19:36 UTC+0800   
    rcluster-dwdzfb-shard-dn8-1   default     rcluster-dwdzfb   shard-dn8   Running   primary     <none>       us-central1-f   100m / 100m          512Mi / 512Mi           <none>    gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic/10.10.0.70   Jun 27,2024 19:36 UTC+0800   
    rcluster-dwdzfb-shard-jq7-0   default     rcluster-dwdzfb   shard-jq7   Running   primary     <none>       us-central1-f   100m / 100m          512Mi / 512Mi           <none>    gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic/10.10.0.70   Jun 27,2024 19:36 UTC+0800   
    rcluster-dwdzfb-shard-jq7-1   default     rcluster-dwdzfb   shard-jq7   Running   secondary   <none>       us-central1-f   100m / 100m          512Mi / 512Mi           <none>    gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic/10.10.0.70   Jun 27,2024 19:36 UTC+0800   
    rcluster-dwdzfb-shard-jw4-0   default     rcluster-dwdzfb   shard-jw4   Running   primary     <none>       us-central1-f   100m / 100m          512Mi / 512Mi           <none>    gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic/10.10.0.70   Jun 27,2024 19:36 UTC+0800   
    rcluster-dwdzfb-shard-jw4-1   default     rcluster-dwdzfb   shard-jw4   Running   primary     <none>       us-central1-f   100m / 100m          512Mi / 512Mi           <none>    gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic/10.10.0.70   Jun 27,2024 19:36 UTC+0800  
  4. cluster nodes
    
    ➜  ~ kubectl exec -it rcluster-dwdzfb-shard-dn8-0 bash
    root@rcluster-dwdzfb-shard-dn8-0:/# redis-cli -a O3605v7HsS
    127.0.0.1:6379> cluster nodes
    8e1663f1de19c42104f42a3757ed34eb4b26fb19 10.128.2.252:6379@16379,rcluster-dwdzfb-shard-dn8-0.rcluster-dwdzfb-shard-dn8-headless.default.svc myself,master - 0 1719488510000 2 connected 5461-10922
    a8b4c9269fdc2cc680129c85dae0c04137a2ac12 10.128.2.251:6379@16379,rcluster-dwdzfb-shard-jw4-0.rcluster-dwdzfb-shard-jw4-headless.default.svc master - 0 1719488509681 1 connected 0-5460
    a5b5eebe1a1685c363bb8465f60fbdda9678db87 10.128.2.253:6379@16379,rcluster-dwdzfb-shard-jq7-0.rcluster-dwdzfb-shard-jq7-headless.default.svc master - 0 1719488511000 3 connected 10923-16383
    c89c7db97324ae484d77a0eebb7df304687d89b4 10.128.2.250:6379@16379,rcluster-dwdzfb-shard-jq7-1.rcluster-dwdzfb-shard-jq7-headless.default.svc slave a5b5eebe1a1685c363bb8465f60fbdda9678db87 0 1719488511692 3 connected
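
    shard-dn8-1 and shard-jw4-1 are missing from dn8-0's view entirely, which suggests they never rejoined the cluster after the restart; comparing each pod's own node ID with the members listed above would confirm that (a sketch):

    # Print every pod's own cluster node ID for comparison with `cluster nodes`
    for pod in $(kubectl get pod -l app.kubernetes.io/instance=rcluster-dwdzfb -o name); do
      echo "== $pod"
      kubectl exec "$pod" -c redis-cluster -- redis-cli -a O3605v7HsS cluster myid
    done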
5. logs pod lorry

kubectl logs rcluster-dwdzfb-shard-dn8-0 lorry
2024-06-27T11:36:52Z INFO Initialize DB manager
2024-06-27T11:36:52Z INFO KB_WORKLOAD_TYPE ENV not set
2024-06-27T11:36:52Z INFO Volume-Protection succeed to init volume protection {"pod": "rcluster-dwdzfb-shard-dn8-0", "spec": {"highWatermark":"0","volumes":[]}}
2024-06-27T11:36:52Z INFO HTTPServer Starting HTTP Server
2024-06-27T11:36:52Z INFO HTTPServer API route path {"method": "GET", "path": ["/v1.0/describeuser", "/v1.0/query", "/v1.0/listusers", "/v1.0/healthycheck", "/v1.0/getrole", "/v1.0/checkrole", "/v1.0/listsystemaccounts"]}
2024-06-27T11:36:52Z INFO HTTPServer API route path {"method": "POST", "path": ["/v1.0/exec", "/v1.0/createuser", "/v1.0/lockinstance", "/v1.0/getlag", "/v1.0/leavemember", "/v1.0/dataload", "/v1.0/deleteuser", "/v1.0/datadump", "/v1.0/revokeuserrole", "/v1.0/switchover", "/v1.0/unlockinstance", "/v1.0/preterminate", "/v1.0/rebuild", "/v1.0/grantuserrole", "/v1.0/volumeprotection", "/v1.0/postprovision", "/v1.0/checkrunning", "/v1.0/joinmember"]}
2024-06-27T11:36:52Z INFO cronjobs env is not set {"env": "KB_CRON_JOBS"}
2024-06-27T11:37:00Z INFO Redis DB startup ready
2024-06-27T11:37:00Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-dwdzfb,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-dn8
2024-06-27T11:37:00Z INFO DCS-K8S podlist: 2
2024-06-27T11:37:00Z INFO DCS-K8S Leader configmap is not found {"configmap": "rcluster-dwdzfb-shard-dn8-leader"}
2024-06-27T11:37:00Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-dwdzfb,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-dn8
2024-06-27T11:37:00Z INFO DCS-K8S podlist: 2
2024-06-27T11:37:00Z DEBUG checkrole check member {"member": "rcluster-dwdzfb-shard-dn8-0", "role": ""}
2024-06-27T11:37:00Z DEBUG checkrole check member {"member": "rcluster-dwdzfb-shard-dn8-1", "role": "primary"}
2024-06-27T11:37:00Z INFO checkrole there is a another leader {"member": "rcluster-dwdzfb-shard-dn8-1"}
2024-06-27T11:37:00Z INFO checkrole another leader's lorry is online, just ignore {"member": "rcluster-dwdzfb-shard-dn8-1"}
2024-06-27T11:37:00Z INFO event send event: map[event:Success operation:checkRole originalRole:waitForStart role:{"term":"1719488220892607","PodRoleNamePairs":[{"podName":"rcluster-dwdzfb-shard-dn8-0","roleName":"primary","podUid":"39f7728d-bc8b-4efc-85c3-5f6b49509b99"}]}]
2024-06-27T11:37:00Z INFO event send event success {"message": "{\"event\":\"Success\",\"operation\":\"checkRole\",\"originalRole\":\"waitForStart\",\"role\":\"{\\"term\\":\\"1719488220892607\\",\\"PodRoleNamePairs\\":[{\\"podName\\":\\"rcluster-dwdzfb-shard-dn8-0\\",\\"roleName\\":\\"primary\\",\\"podUid\\":\\"39f7728d-bc8b-4efc-85c3-5f6b49509b99\\"}]}\"}"}

kubectl logs rcluster-dwdzfb-shard-dn8-1 lorry
2024-06-27T11:36:38Z INFO Initialize DB manager
2024-06-27T11:36:38Z INFO KB_WORKLOAD_TYPE ENV not set
2024-06-27T11:36:38Z INFO Volume-Protection succeed to init volume protection {"pod": "rcluster-dwdzfb-shard-dn8-1", "spec": {"highWatermark":"0","volumes":[]}}
2024-06-27T11:36:38Z INFO HTTPServer Starting HTTP Server
2024-06-27T11:36:38Z INFO HTTPServer API route path {"method": "POST", "path": ["/v1.0/joinmember", "/v1.0/preterminate", "/v1.0/checkrunning", "/v1.0/revokeuserrole", "/v1.0/volumeprotection", "/v1.0/createuser", "/v1.0/getlag", "/v1.0/unlockinstance", "/v1.0/switchover", "/v1.0/exec", "/v1.0/deleteuser", "/v1.0/lockinstance", "/v1.0/datadump", "/v1.0/leavemember", "/v1.0/postprovision", "/v1.0/dataload", "/v1.0/rebuild", "/v1.0/grantuserrole"]}
2024-06-27T11:36:38Z INFO HTTPServer API route path {"method": "GET", "path": ["/v1.0/describeuser", "/v1.0/listusers", "/v1.0/listsystemaccounts", "/v1.0/checkrole", "/v1.0/query", "/v1.0/healthycheck", "/v1.0/getrole"]}
2024-06-27T11:36:38Z INFO cronjobs env is not set {"env": "KB_CRON_JOBS"}
2024-06-27T11:36:46Z INFO Redis DB startup ready
2024-06-27T11:36:46Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-dwdzfb,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-dn8
2024-06-27T11:36:46Z INFO DCS-K8S podlist: 2
2024-06-27T11:36:46Z INFO DCS-K8S Leader configmap is not found {"configmap": "rcluster-dwdzfb-shard-dn8-leader"}
2024-06-27T11:36:46Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=rcluster-dwdzfb,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=shard-dn8
2024-06-27T11:36:46Z INFO DCS-K8S podlist: 2
2024-06-27T11:36:46Z DEBUG checkrole check member {"member": "rcluster-dwdzfb-shard-dn8-0", "role": "primary"}
2024-06-27T11:36:46Z INFO checkrole there is a another leader {"member": "rcluster-dwdzfb-shard-dn8-0"}
2024-06-27T11:36:46Z INFO checkrole another leader's lorry is online, just ignore {"member": "rcluster-dwdzfb-shard-dn8-0"}
2024-06-27T11:36:46Z DEBUG checkrole check member {"member": "rcluster-dwdzfb-shard-dn8-1", "role": ""}
2024-06-27T11:36:46Z INFO event send event: map[event:Success operation:checkRole originalRole:waitForStart role:{"term":"1719488206484389","PodRoleNamePairs":[{"podName":"rcluster-dwdzfb-shard-dn8-1","roleName":"primary","podUid":"cc4f62b3-9a38-441d-8922-96ce0b9f985f"}]}]
2024-06-27T11:36:46Z INFO event send event success {"message": "{\"event\":\"Success\",\"operation\":\"checkRole\",\"originalRole\":\"waitForStart\",\"role\":\"{\\"term\\":\\"1719488206484389\\",\\"PodRoleNamePairs\\":[{\\"podName\\":\\"rcluster-dwdzfb-shard-dn8-1\\",\\"roleName\\":\\"primary\\",\\"podUid\\":\\"cc4f62b3-9a38-441d-8922-96ce0b9f985f\\"}]}\"}"}