apecloud / kubeblocks

KubeBlocks is open-source control plane software that runs and manages databases, message queues, and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] redis cluster sentinel pod crash after hscale in and restart #6538

Closed JashBook closed 3 months ago

JashBook commented 7 months ago

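Describe the bug After the redis component is scaled in (hscale from 4 to 2 replicas) and the cluster is restarted, the pod redis-imvtol-redis-sentinel-2 goes into CrashLoopBackOff. Its log reports a fatal config file error: "Duplicate hostname and port for replica."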

To Reproduce Steps to reproduce the behavior:

  1. create cluster
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: Cluster
    metadata:
      name: redis-imvtol
      namespace: default
    spec:
      clusterDefinitionRef: redis
      clusterVersionRef: redis-7.0.6
      terminationPolicy: Delete
      componentSpecs:
      - name: redis
        componentDef: redis
        replicas: 2
        resources:
          requests:
            cpu: 100m
            memory: 0.5Gi
          limits:
            cpu: 100m
            memory: 0.5Gi
        switchPolicy:
          type: Noop
        volumeClaimTemplates:
          - name: data
            spec:
              storageClassName:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
      - name: redis-sentinel
        componentDef: redis-sentinel
        replicas: 3
        resources:
          requests:
            cpu: 100m
            memory: 0.5Gi
          limits:
            cpu: 100m
            memory: 0.5Gi
        volumeClaimTemplates:
          - name: data
            spec:
              storageClassName:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
  2. ops reconfigure
    kbcli cluster configure redis-imvtol --auto-approve --set maxclients=10001 --components redis --config-spec redis-replication-config --namespace default

    test failover

    
    kubectl delete pod redis-imvtol-redis-0  --namespace default

kbcli cluster list-instances redis-imvtol --namespace default

| NAME | NAMESPACE | CLUSTER | COMPONENT | STATUS | ROLE | ACCESSMODE | AZ | CPU(REQUEST/LIMIT) | MEMORY(REQUEST/LIMIT) | STORAGE | NODE | CREATED-TIME |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| redis-imvtol-redis-0 | default | redis-imvtol | redis | Running | secondary | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 16:00 UTC+0800 |
| redis-imvtol-redis-1 | default | redis-imvtol | redis | Running | primary | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 15:57 UTC+0800 |
| redis-imvtol-redis-sentinel-0 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 15:57 UTC+0800 |
| redis-imvtol-redis-sentinel-1 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 15:57 UTC+0800 |
| redis-imvtol-redis-sentinel-2 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 15:57 UTC+0800 |

volume expand

kbcli cluster volume-expand redis-imvtol --auto-approve --components redis --volume-claim-templates data --storage 5Gi --namespace default

vscale

kbcli cluster vscale redis-imvtol --auto-approve --components redis --cpu 200m --memory 0.6Gi --namespace default

hscale out 4
  `kbcli cluster hscale redis-imvtol --auto-approve --components redis --replicas 4 --namespace default`

  `kbcli cluster list-instances redis-imvtol --namespace default`

| NAME | NAMESPACE | CLUSTER | COMPONENT | STATUS | ROLE | ACCESSMODE | AZ | CPU(REQUEST/LIMIT) | MEMORY(REQUEST/LIMIT) | STORAGE | NODE | CREATED-TIME |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| redis-imvtol-redis-0 | default | redis-imvtol | redis | Running | secondary | | | 200m / 200m | 644245094400m / 644245094400m | data:5Gi | minikube/192.168.49.2 | Jan 25,2024 16:01 UTC+0800 |
| redis-imvtol-redis-1 | default | redis-imvtol | redis | Running | primary | | | 200m / 200m | 644245094400m / 644245094400m | data:5Gi | minikube/192.168.49.2 | Jan 25,2024 16:02 UTC+0800 |
| redis-imvtol-redis-2 | default | redis-imvtol | redis | Running | secondary | | | 200m / 200m | 644245094400m / 644245094400m | data:5Gi | minikube/192.168.49.2 | Jan 25,2024 16:02 UTC+0800 |
| redis-imvtol-redis-3 | default | redis-imvtol | redis | Running | secondary | | | 200m / 200m | 644245094400m / 644245094400m | data:5Gi | minikube/192.168.49.2 | Jan 25,2024 16:02 UTC+0800 |
| redis-imvtol-redis-sentinel-0 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 15:57 UTC+0800 |
| redis-imvtol-redis-sentinel-1 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 15:57 UTC+0800 |
| redis-imvtol-redis-sentinel-2 | default | redis-imvtol | redis-sentinel | Running |
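
The `644245094400m` shown in the MEMORY column after the vscale step is most likely the same quantity as the requested `--memory 0.6Gi`: 0.6 Gi is not a whole number of bytes, so it appears to be rendered in thousandths of a byte (suffix `m`). A minimal sketch of that arithmetic follows; `gi_to_milli_bytes` is a hypothetical helper written for this illustration, not part of kbcli or Kubernetes.

```python
from fractions import Fraction

GI = 2 ** 30  # bytes in one Gi (binary prefix)


def gi_to_milli_bytes(gi: str) -> int:
    """Convert a decimal Gi amount such as "0.6" to thousandths of a byte.

    Hypothetical helper for illustration only. It refuses values that are
    not exact at milli-byte precision instead of guessing a rounding rule.
    """
    milli = Fraction(gi) * GI * 1000
    if milli.denominator != 1:
        raise ValueError(f"{gi}Gi is not exact at milli-byte precision")
    return int(milli)


if __name__ == "__main__":
    # 0.6Gi is the value passed to `kbcli cluster vscale --memory` above.
    print(f"{gi_to_milli_bytes('0.6')}m")
```

Under this reading the memory values in the table are correct and only displayed in an unfriendly unit, so they are probably unrelated to the sentinel crash.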

stop start

kbcli cluster stop redis-imvtol --auto-approve --namespace default

kbcli cluster start redis-imvtol --namespace default

redis bench

redis-benchmark -h redis-imvtol-redis-redis.default.svc -p 6379 -a "4QlO6f328n" -n 1000 -c 2 -q

hscale in 2

kbcli cluster hscale redis-imvtol --auto-approve --components redis --replicas 2 --namespace default

kbcli cluster list-instances redis-imvtol --namespace default

| NAME | NAMESPACE | CLUSTER | COMPONENT | STATUS | ROLE | ACCESSMODE | AZ | CPU(REQUEST/LIMIT) | MEMORY(REQUEST/LIMIT) | STORAGE | NODE | CREATED-TIME |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| redis-imvtol-redis-0 | default | redis-imvtol | redis | Running | primary | | | 200m / 200m | 644245094400m / 644245094400m | data:5Gi | minikube/192.168.49.2 | Jan 25,2024 16:03 UTC+0800 |
| redis-imvtol-redis-1 | default | redis-imvtol | redis | Running | secondary | | | 200m / 200m | 644245094400m / 644245094400m | data:5Gi | minikube/192.168.49.2 | Jan 25,2024 16:03 UTC+0800 |
| redis-imvtol-redis-sentinel-0 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 16:03 UTC+0800 |
| redis-imvtol-redis-sentinel-1 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 16:03 UTC+0800 |
| redis-imvtol-redis-sentinel-2 | default | redis-imvtol | redis-sentinel | Running | | | | 100m / 100m | 512Mi / 512Mi | data:1Gi | minikube/192.168.49.2 | Jan 25,2024 16:03 UTC+0800 |

restart (the sentinel pod crashes)

kbcli cluster restart redis-imvtol --auto-approve --namespace default
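
For anyone re-running the reproduction, the operations above can be collected into one ordered list. This is only a sketch: the command lines are copied from the steps in this report, the `redis-benchmark` step is left out, and `repro_commands` is a hypothetical helper name. It prints the commands rather than executing them, and waiting for each operation to finish before starting the next is left to the reader.

```python
import shlex
from typing import List


def repro_commands(cluster: str = "redis-imvtol",
                   namespace: str = "default") -> List[List[str]]:
    """Return the reproduction steps from this report as argv lists, in order."""
    ns = ["--namespace", namespace]
    kb = ["kbcli", "cluster"]
    return [
        # 2. ops reconfigure
        kb + ["configure", cluster, "--auto-approve", "--set", "maxclients=10001",
              "--components", "redis", "--config-spec", "redis-replication-config"] + ns,
        # test failover: delete the first redis pod
        ["kubectl", "delete", "pod", f"{cluster}-redis-0"] + ns,
        # volume expand
        kb + ["volume-expand", cluster, "--auto-approve", "--components", "redis",
              "--volume-claim-templates", "data", "--storage", "5Gi"] + ns,
        # vscale
        kb + ["vscale", cluster, "--auto-approve", "--components", "redis",
              "--cpu", "200m", "--memory", "0.6Gi"] + ns,
        # hscale out to 4 replicas
        kb + ["hscale", cluster, "--auto-approve", "--components", "redis",
              "--replicas", "4"] + ns,
        # stop, then start
        kb + ["stop", cluster, "--auto-approve"] + ns,
        kb + ["start", cluster] + ns,
        # hscale in to 2 replicas
        kb + ["hscale", cluster, "--auto-approve", "--components", "redis",
              "--replicas", "2"] + ns,
        # restart: the step after which the sentinel pod crashes
        kb + ["restart", cluster, "--auto-approve"] + ns,
    ]


if __name__ == "__main__":
    for argv in repro_commands():
        # Swap print for subprocess.run(argv, check=True) to actually execute.
        print(shlex.join(argv))
```

The report does not establish which of these steps are required to trigger the crash, so the list keeps the full sequence rather than a reduced one.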


4. See error

kubectl get cluster redis-imvtol

| NAME | CLUSTER-DEFINITION | VERSION | TERMINATION-POLICY | STATUS | AGE |
| --- | --- | --- | --- | --- | --- |
| redis-imvtol | redis | redis-7.0.6 | Delete | Abnormal | 33m |

kubectl get pod,ops -l app.kubernetes.io/instance=redis-imvtol

| NAME | READY | STATUS | RESTARTS | AGE |
| --- | --- | --- | --- | --- |
| pod/redis-imvtol-redis-0 | 3/3 | Running | 0 | 26m |
| pod/redis-imvtol-redis-1 | 3/3 | Running | 0 | 26m |
| pod/redis-imvtol-redis-sentinel-0 | 1/1 | Running | 0 | 26m |
| pod/redis-imvtol-redis-sentinel-1 | 1/1 | Running | 0 | 26m |
| pod/redis-imvtol-redis-sentinel-2 | 0/1 | CrashLoopBackOff | 10 (30s ago) | 26m |

| NAME | TYPE | CLUSTER | STATUS | PROGRESS | AGE |
| --- | --- | --- | --- | --- | --- |
| opsrequest.apps.kubeblocks.io/redis-imvtol-restart-jm5bj | Restart | redis-imvtol | Failed | 0/0 | 26m |
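
When repeating this check in a script, the failing pod can be picked out of the plain-text `kubectl get pod` listing. The sketch below assumes the usual whitespace-separated columns (NAME, READY, STATUS first) and uses the pod rows from this report as sample input; `unready_pods` is a hypothetical helper, not a kubectl or kbcli feature.

```python
from typing import List, Tuple

# Pod rows copied from the listing in this report.
SAMPLE_LISTING = """\
NAME READY STATUS RESTARTS AGE
pod/redis-imvtol-redis-0 3/3 Running 0 26m
pod/redis-imvtol-redis-1 3/3 Running 0 26m
pod/redis-imvtol-redis-sentinel-0 1/1 Running 0 26m
pod/redis-imvtol-redis-sentinel-1 1/1 Running 0 26m
pod/redis-imvtol-redis-sentinel-2 0/1 CrashLoopBackOff 10 (30s ago) 26m
"""


def unready_pods(listing: str) -> List[Tuple[str, str]]:
    """Return (name, status) for pods that are not fully ready or not Running."""
    result = []
    for line in listing.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed row
        name, ready, status = fields[:3]
        ready_count, _, total = ready.partition("/")
        if ready_count != total or status != "Running":
            result.append((name, status))
    return result


if __name__ == "__main__":
    for name, status in unready_pods(SAMPLE_LISTING):
        print(name, status)
```

Only the first three columns are read, so the spaces inside the RESTARTS value (`10 (30s ago)`) do not affect the parse.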


logs crash pod 

kubectl logs redis-imvtol-redis-sentinel-2 --previous

    FATAL CONFIG FILE ERROR (Redis 7.0.9)
    Reading the configuration file, at line 30
    'sentinel known-replica redis-imvtol-redis redis-imvtol-redis-1.redis-imvtol-redis-headless.default.svc 6379'
    Duplicate hostname and port for replica.
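
The error says the sentinel configuration file lists the same replica (hostname and port) more than once for one master. A small sketch for spotting such entries in a sentinel config follows. The issue does not include the actual config file, so `SAMPLE_CONF` is made-up illustrative input built from the hostname in the log line, and `duplicate_known_replicas` is a hypothetical helper.

```python
from collections import Counter
from typing import List, Tuple

# Illustrative input only: the real sentinel config is not shown in this issue.
SAMPLE_CONF = """\
sentinel monitor redis-imvtol-redis redis-imvtol-redis-0.redis-imvtol-redis-headless.default.svc 6379 2
sentinel known-replica redis-imvtol-redis redis-imvtol-redis-1.redis-imvtol-redis-headless.default.svc 6379
sentinel known-replica redis-imvtol-redis redis-imvtol-redis-1.redis-imvtol-redis-headless.default.svc 6379
"""


def duplicate_known_replicas(conf_text: str) -> List[Tuple[str, str, str]]:
    """Return (master, host, port) triples that appear in more than one
    `sentinel known-replica <master> <host> <port>` line."""
    seen = Counter()
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) == 5 and parts[:2] == ["sentinel", "known-replica"]:
            master, host, port = parts[2:]
            seen[(master, host, port)] += 1
    return [entry for entry, count in seen.items() if count > 1]


if __name__ == "__main__":
    for master, host, port in duplicate_known_replicas(SAMPLE_CONF):
        print(f"duplicate replica for {master}: {host}:{port}")
```

Running this against the config file of the crashing pod would show whether line 30 repeats an earlier `known-replica` entry, which is what the log message suggests.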

Expected behavior The Redis cluster should restart successfully after hscale in, with all sentinel pods running.

github-actions[bot] commented 6 months ago

This issue has been marked as stale because it has been open for 30 days with no activity