apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] redis pod crashes after restart: Duplicate hostname and port for replica #7598

Open ahjing99 opened 2 weeks ago

ahjing99 commented 2 weeks ago

    ➜ ~ kbcli version
    Kubernetes: v1.27.11-gke.1062004
    KubeBlocks: 0.9.0-beta.35
    kbcli: 0.9.0-beta.27

1. Create cluster

    
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: Cluster
    metadata:
      name: redis-jzenfb
      namespace: default
    spec:
      terminationPolicy: Halt
      componentSpecs:
      - name: redis
        componentDef: redis-7
        replicas: 2
        resources:
          requests:
            cpu: 100m
            memory: 0.5Gi
          limits:
            cpu: 100m
            memory: 0.5Gi
        switchPolicy:
          type: Noop
        volumeClaimTemplates:
          - name: data
            spec:
              storageClassName:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
      - name: redis-sentinel
        componentDef: redis-sentinel-7
        replicas: 3
        resources:
          requests:
            cpu: 100m
            memory: 0.5Gi
          limits:
            cpu: 100m
            memory: 0.5Gi
        volumeClaimTemplates:
          - name: data
            spec:
              storageClassName:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
      - name: redis-twemproxy
        componentDef: redis-twemproxy-0.5
        replicas: 3
        resources:
          requests:
            cpu: 100m
            memory: 0.5Gi
          limits:
            cpu: 100m
            memory: 0.5Gi
        volumeClaimTemplates:
          - name: data
            spec:
              storageClassName:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
    
`kubectl apply -f test_create_redis-jzenfb.yaml`

cluster.apps.kubeblocks.io/redis-jzenfb created

2. Run some ops

kbcli cluster expose redis-jzenfb --auto-approve --force=true --type internet --enable true --components redis --namespace default

OpsRequest redis-jzenfb-expose-ltxpk created successfully, you can view the progress: kbcli cluster describe-ops redis-jzenfb-expose-ltxpk -n default

kbcli cluster configure redis-jzenfb --auto-approve --force=true --set maxclients=10001 --components redis --config-spec redis-replication-config --config-file redis.conf --namespace default

    Will updated configure file meta:
      ConfigSpec: redis-replication-config ConfigFile: redis.conf ComponentName: redis ClusterName: redis-jzenfb
    OpsRequest redis-jzenfb-reconfiguring-tqxcr created successfully, you can view the progress:
      kbcli cluster describe-ops redis-jzenfb-reconfiguring-tqxcr -n default

kubectl apply -f benchtest-redis-jzenfb.yaml

    pod/benchtest-redis-jzenfb created
    apply benchtest-redis-jzenfb.yaml Success

kbcli cluster volume-expand redis-jzenfb --auto-approve --force=true --components redis --volume-claim-templates data --storage 3Gi --namespace default

OpsRequest redis-jzenfb-volumeexpansion-mjgx5 created successfully, you can view the progress: kbcli cluster describe-ops redis-jzenfb-volumeexpansion-mjgx5 -n default


3. Stop, then start

kbcli cluster stop redis-jzenfb --auto-approve --force=true --namespace default

OpsRequest redis-jzenfb-stop-2r4pj created successfully, you can view the progress: kbcli cluster describe-ops redis-jzenfb-stop-2r4pj -n default

kbcli cluster start redis-jzenfb --force=true --namespace default

OpsRequest redis-jzenfb-start-wnb7h created successfully, you can view the progress: kbcli cluster describe-ops redis-jzenfb-start-wnb7h -n default


4. Pod crash

    ➜ ~ k get pod | grep redis
    redis-jzenfb-redis-0             3/3   Running            0               12m
    redis-jzenfb-redis-1             2/3   CrashLoopBackOff   6 (106s ago)    12m
    redis-jzenfb-redis-sentinel-0    0/1   CrashLoopBackOff   7 (38s ago)     11m
    redis-jzenfb-redis-sentinel-1    1/1   Running            0               12m
    redis-jzenfb-redis-sentinel-2    1/1   Running            0               12m
    redis-jzenfb-redis-twemproxy-0   1/1   Running            0               10m
    redis-jzenfb-redis-twemproxy-1   1/1   Running            0               11m
    redis-jzenfb-redis-twemproxy-2   1/1   Running            0               12m

➜ ~ k logs redis-jzenfb-redis-sentinel-0

    FATAL CONFIG FILE ERROR (Redis 7.2.4)
    Reading the configuration file, at line 28
    'sentinel known-replica redis-jzenfb-redis redis-jzenfb-redis-1.redis-jzenfb-redis-headless.default.svc 6379'
    Duplicate hostname and port for replica.

    ➜ ~ k logs redis-jzenfb-redis-1
    Defaulted container "redis" out of: redis, metrics, lorry, init-lorry (init)

ahjing99 commented 1 week ago

Lowering the severity since I cannot reproduce this.

Y-Rookie commented 1 week ago

This issue might be caused by the Redis and Redis Sentinel components restarting simultaneously, which leads to some kind of anomaly in the Sentinel. As a workaround, it is recommended to restart each component individually. In future versions, the restart operation will be made sequential.
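A rough sketch of the "restart each component individually" workaround follows. It assumes that `kbcli cluster restart` accepts the same `--components`, `--auto-approve` and `--namespace` flags as the ops commands used earlier in this issue; please check `kbcli cluster restart --help` for your version. The function name, the `KBCLI` and `WAIT` variables, and the component order are illustrative choices and are not prescribed anywhere in this thread.

```shell
#!/usr/bin/env bash
# Restart the components of a cluster one at a time instead of all at once.
# KBCLI defaults to "echo kbcli", so by default the commands are only printed
# (dry run). Set KBCLI=kbcli to actually run them.
KBCLI="${KBCLI:-echo kbcli}"

# Usage: restart_sequentially <cluster> <namespace> <component>...
restart_sequentially() {
  local cluster="$1" namespace="$2"
  shift 2
  local component
  for component in "$@"; do
    # Assumption: "cluster restart" supports these flags like the other ops.
    # $KBCLI is intentionally unquoted so "echo kbcli" splits into two words.
    $KBCLI cluster restart "$cluster" --components "$component" \
      --auto-approve --namespace "$namespace" || return 1
    # With WAIT=1, pause until the operator confirms the restart has finished
    # (for example by checking "kbcli cluster describe-ops" for the OpsRequest).
    if [ "${WAIT:-0}" = "1" ]; then
      read -r -p "Press Enter once the restart of '$component' has finished... " _
    fi
  done
}
```

For example, `restart_sequentially redis-jzenfb default redis-sentinel redis` prints one restart command per component in the given order. Running it with `KBCLI=kbcli WAIT=1` issues the restarts for real and pauses between them.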