Describe the bug
Redis cluster pods crash (CrashLoopBackOff) after upgrading KubeBlocks from 0.6.2 to 0.8.
To Reproduce
Steps to reproduce the behavior:
Install KubeBlocks (kb) 0.6.2.
Create a Redis cluster.
Upgrade KubeBlocks from 0.6.2 to 0.7.2, then to 0.8.
Run these ops in order: hscale out to 4 replicas --> stop --> start --> hscale in to 2 replicas --> reconfig.
See error
kubectl get cluster
NAME CLUSTER-DEFINITION VERSION TERMINATION-POLICY STATUS AGE
redis-upkb678 redis redis-7.0.6 WipeOut Abnormal 4h56m
kubectl get pod
NAME READY STATUS RESTARTS AGE
redis-upkb678-redis-0 2/3 CrashLoopBackOff 12 (3m53s ago) 42m
redis-upkb678-redis-1 2/3 CrashLoopBackOff 12 (2m51s ago) 40m
Normal AllReplicasReady 54m (x6 over 115m) cluster-controller all pods of components are ready, waiting for the probe detection successful
Normal Running 54m (x5 over 103m) cluster-controller Cluster: redis-upkb678 is ready, current phase is Running
Normal ClusterReady 54m (x5 over 103m) cluster-controller Cluster: redis-upkb678 is ready, current phase is Running
Normal ComponentPhaseTransition 54m (x10 over 115m) cluster-controller component is Running
Normal ComponentPhaseTransition 53m (x9 over 104m) cluster-controller component is Updating
Normal ComponentPhaseTransition 38m cluster-controller component is Failed
Warning ComponentsNotReady 38m (x7 over 115m) cluster-controller pods are unavailable in Components: [redis], refer to related component message in Cluster.status.components
Warning ReplicasNotReady 38m (x7 over 115m) cluster-controller pods are not ready in Components: [redis], refer to related component message in Cluster.status.components
Normal HorizontalScale 33m (x3 over 102m) component-controller start horizontal scale component redis of cluster redis-upkb678 from 2 to 4
Normal PreCheckSucceed 28m (x8 over 104m) cluster-controller The operator has started the provisioning of Cluster: redis-upkb678
Warning Unhealthy 10m (x2 over 27m) event-controller Pod redis-upkb678-redis-sentinel-1: Readiness probe failed: Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
timeout: the monitored command dumped core
Warning Unhealthy 10m (x2 over 23m) event-controller Pod redis-upkb678-redis-sentinel-2: Readiness probe errored: command "sh -c /scripts/redis-sentinel-ping.sh 1" timed out
Warning BackOff 5m3s (x8 over 37m) event-controller Pod redis-upkb678-redis-1: Back-off restarting failed container redis in pod redis-upkb678-redis-1_default(7746e76c-95c5-4502-b417-27c0ffc4dd43)
Warning Unhealthy 2m34s (x10 over 27m) event-controller Pod redis-upkb678-redis-sentinel-1: Readiness probe errored: command "sh -c /scripts/redis-sentinel-ping.sh 1" timed out
Warning BackOff 62s (x10 over 38m) event-controller Pod redis-upkb678-redis-0: Back-off restarting failed container redis in pod redis-upkb678-redis-0_default(75d063f9-e39c-4854-a14e-4f05c5b52aa5)
retry redis-cli -h 127.0.0.1 -p 6379 -a qkf4zpll ping
local max_attempts=20
local attempt=1
redis-cli -h 127.0.0.1 -p 6379 -a qkf4zpll ping
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Command 'redis-cli -h 127.0.0.1 -p 6379 -a qkf4zpll ping' failed. Attempt 1 of 20. Retrying in 5 seconds...
[ 1 -eq 20 ]
echo Command 'redis-cli -h 127.0.0.1 -p 6379 -a qkf4zpll ping' failed. Attempt 1 of 20. Retrying in 5 seconds...
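The trace above looks like it comes from a retry wrapper around `redis-cli ... ping`. The actual script is not included in this report, so the following is only a hypothetical reconstruction, assuming a function named `retry` with 20 attempts and a 5-second wait, as the trace suggests. `RETRY_DELAY` is an assumption added here so the sketch can be run without waiting; it does not appear in the trace.

```shell
#!/bin/sh
# Hypothetical reconstruction of the retry helper implied by the trace.
# Not the actual script shipped in the Redis container image.
retry() {
  local max_attempts=20
  local attempt=1
  # RETRY_DELAY is an assumed override; the trace only shows a 5-second wait.
  local delay="${RETRY_DELAY:-5}"

  # Re-run the given command until it exits successfully.
  until "$@"; do
    # Stop once the attempt budget is used up.
    if [ "$attempt" -eq "$max_attempts" ]; then
      echo "Command '$*' failed after $max_attempts attempts." >&2
      return 1
    fi
    echo "Command '$*' failed. Attempt $attempt of $max_attempts. Retrying in $delay seconds..."
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

If the real helper behaves like this, a call such as `retry redis-cli -h 127.0.0.1 -p 6379 -a <password> ping` keeps printing "Connection refused" until the redis container is restarted, which would match the back-off events: the server on port 6379 never comes up, so the ping can never succeed.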
kubectl describe cluster redis-upkb678
Name: redis-upkb678
Namespace: default
Labels: app.kubernetes.io/instance=redis-upkb678
        clusterdefinition.kubeblocks.io/name=redis
        clusterversion.kubeblocks.io/name=redis-7.0.6
Annotations: kubeblocks.io/ops-request: [{"name":"redis-upkb678-reconfiguring-r5gt6","type":"Reconfiguring"}]
             kubeblocks.io/reconcile: 2024-01-11T11:24:47.736690603Z
API Version: apps.kubeblocks.io/v1alpha1
Kind: Cluster
Metadata:
  Creation Timestamp: 2024-01-11T06:29:02Z
  Finalizers: cluster.kubeblocks.io/finalizer
  Generation: 9
  Managed Fields:
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1: f:spec: .: f:affinity: .: f:podAntiAffinity: f:tenancy: f:clusterDefinitionRef: f:clusterVersionRef: f:componentSpecs: .: k:{"name":"redis"}: .: f:componentDefRef: f:name: f:noCreatePDB: f:resources: .: f:limits: .: f:cpu: f:memory: f:requests: .: f:cpu: f:memory: f:serviceAccountName: f:switchPolicy: .: f:type: k:{"name":"redis-sentinel"}: .: f:componentDefRef: f:name: f:noCreatePDB: f:resources: .: f:limits: .: f:cpu: f:memory: f:requests: .: f:cpu: f:memory: f:serviceAccountName: f:volumeClaimTemplates: f:terminationPolicy:
    Manager: kbcli_0.6.2
    Operation: Update
    Time: 2024-01-11T06:29:02Z
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1: f:metadata: f:labels: f:app.kubernetes.io/instance: f:spec: f:componentSpecs: k:{"name":"redis"}: f:monitor: k:{"name":"redis-sentinel"}: f:monitor:
    Manager: kbcli
    Operation: Update
    Time: 2024-01-11T10:33:25Z
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1: f:spec: f:componentSpecs: k:{"name":"redis"}: f:replicas:
    Manager: kubectl-edit
    Operation: Update
    Time: 2024-01-11T10:57:26Z
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1: f:status: .: f:clusterDefGeneration: f:components: .: f:redis: .: f:membersStatus: f:message: .: f:Pod/redis-upkb678-redis-0: f:Pod/redis-upkb678-redis-1: f:phase: f:podsReady: f:podsReadyTime: f:replicationSetStatus: .: f:primary: .: f:pod: f:secondaries: f:redis-sentinel: .: f:phase: f:podsReady: f:podsReadyTime: f:conditions: f:observedGeneration: f:phase:
    Manager: manager
    Operation: Update
    Subresource: status
    Time: 2024-01-11T10:57:29Z
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1: f:metadata: f:annotations: .: f:kubeblocks.io/ops-request: f:kubeblocks.io/reconcile: f:finalizers: .: v:"cluster.kubeblocks.io/finalizer": f:labels: .: f:clusterdefinition.kubeblocks.io/name: f:clusterversion.kubeblocks.io/name: f:spec: f:componentSpecs: k:{"name":"redis"}: f:volumeClaimTemplates: k:{"name":"redis-sentinel"}: f:replicas: f:monitor: f:resources: .: f:cpu: f:memory: f:storage: .: f:size:
    Manager: manager
    Operation: Update
    Time: 2024-01-11T11:24:47Z
  Resource Version: 214793820
  UID: d506a601-1c14-45a8-a7e2-8a10f1c86614
Spec:
  Affinity:
    Pod Anti Affinity: Preferred
    Tenancy: SharedNode
  Cluster Definition Ref: redis
  Cluster Version Ref: redis-7.0.6
  Component Specs:
    Component Def Ref: redis
    Monitor: true
    Name: redis
    No Create PDB: false
    Replicas: 2
    Resources:
      Limits:
        Cpu: 100m
        Memory: 512Mi
      Requests:
        Cpu: 100m
        Memory: 512Mi
    Rsm Transform Policy: ToSts
    Service Account Name: kb-redis-upkb678
    Switch Policy:
      Type: Noop
    Volume Claim Templates:
      Name: data
      Spec:
        Access Modes: ReadWriteOnce
        Resources:
          Requests:
            Storage: 4Gi
    Component Def Ref: redis-sentinel
    Monitor: true
    Name: redis-sentinel
    No Create PDB: false
    Replicas: 3
    Resources:
      Limits:
        Cpu: 100m
        Memory: 512Mi
      Requests:
        Cpu: 100m
        Memory: 512Mi
    Rsm Transform Policy: ToSts
    Service Account Name: kb-redis-upkb678
    Volume Claim Templates:
      Name: data
      Spec:
        Access Modes: ReadWriteOnce
        Resources:
          Requests:
            Storage: 1Gi
  Monitor:
  Resources:
    Cpu: 0
    Memory: 0
  Storage:
    Size: 0
  Termination Policy: WipeOut
Status:
  Cluster Def Generation: 4
  Components:
    Redis:
      Members Status:
        Pod Name: redis-upkb678-redis-0
        Role:
          Access Mode: ReadWrite
          Can Vote: true
          Is Leader: true
          Name: primary
        Pod Name: redis-upkb678-redis-1
        Role:
          Access Mode: Readonly
          Can Vote: true
          Is Leader: false
          Name: secondary
      Message:
        Pod/redis-upkb678-redis-0: back-off 5m0s restarting failed container=redis pod=redis-upkb678-redis-0_default(75d063f9-e39c-4854-a14e-4f05c5b52aa5)
        Pod/redis-upkb678-redis-1: back-off 5m0s restarting failed container=redis pod=redis-upkb678-redis-1_default(7746e76c-95c5-4502-b417-27c0ffc4dd43)
      Phase: Failed
      Pods Ready: false
      Pods Ready Time: 2024-01-11T10:33:00Z
      Replication Set Status:
        Primary:
          Pod: redis-upkb678-redis-0
        Secondaries:
          Pod: redis-upkb678-redis-1
    Redis - Sentinel:
      Phase: Running
      Pods Ready: true
      Pods Ready Time: 2024-01-11T10:57:27Z
  Conditions:
    Last Transition Time: 2024-01-11T06:29:03Z
    Message: The operator has started the provisioning of Cluster: redis-upkb678
    Observed Generation: 9
    Reason: PreCheckSucceed
    Status: True
    Type: ProvisioningStarted
    Last Transition Time: 2024-01-11T06:39:58Z
    Message: Successfully applied for resources
    Observed Generation: 9
    Reason: ApplyResourcesSucceed
    Status: True
    Type: ApplyResources
    Last Transition Time: 2024-01-11T10:47:28Z
    Message: pods are not ready in Components: [redis], refer to related component message in Cluster.status.components
    Reason: ReplicasNotReady
    Status: False
    Type: ReplicasReady
    Last Transition Time: 2024-01-11T10:47:28Z
    Message: pods are unavailable in Components: [redis], refer to related component message in Cluster.status.components
    Reason: ComponentsNotReady
    Status: False
    Type: Ready
  Observed Generation: 9
  Phase: Abnormal
Events:
  Type Reason Age From Message
kubectl logs redis-upkb678-redis-0 redis
describe pod
Expected behavior
The Redis cluster keeps running normally after the upgrade and the subsequent ops.