OT-CONTAINER-KIT / redis-operator

A Golang-based Redis operator that creates and oversees Redis standalone/cluster/replication/sentinel mode setups on top of Kubernetes.
https://ot-redis-operator.netlify.app/
Apache License 2.0

Something went wrong during helm upgrade: one follower gets "Error condition on socket for SYNC: Host is unreachable" #165

Closed: mu1345 closed this issue 3 years ago

mu1345 commented 3 years ago

What version of redis operator are you using?

kubectl logs <_redis-operator_pod_name> -n <namespace>

The redis-operator logs look normal:

"Request.RedisManager.Name": "redis-cluster-leader-0", "ip": "10.224.0.89"}
2021-10-22T12:21:38.697Z    INFO    controller_redis    Redis cluster nodes are listed  {"Request.RedisManager.Namespace": "redis-operator", "Request.RedisManager.Name": "redis-cluster", "Output": "ee230f94ecf074bf4a558e294b96a8d64b6223ca 10.224.0.89:6379@16379 myself,slave 3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 0 1634905298000 4 connected\n8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 10.224.0.84:6379@16379 master - 0 1634905297000 3 connected 10923-16383\n2b6d423cfdff772724c7ed6fb77778db2108273f 10.224.0.85:6379@16379 slave 8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 0 1634905298508 3 connected\n343c3fa9936870ee6ac5bfa845504f4cb88d8d2b 10.224.0.86:6379@16379 slave 24348601cf967c43f591a0afa57a483117809822 0 1634905296703 2 connected\n3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 10.224.0.88:6379@16379 master - 0 1634905297000 4 connected 0-5460\n24348601cf967c43f591a0afa57a483117809822 10.224.0.87:6379@16379 master - 0 1634905297706 2 connected 5461-10922\n"}
2021-10-22T12:21:38.697Z    INFO    controller_redis    Total number of redis nodes are {"Request.RedisManager.Namespace": "redis-operator", "Request.RedisManager.Name": "redis-cluster", "Nodes": "6"}
2021-10-22T12:21:38.697Z    INFO    controllers.RedisCluster    Redis leader count is desired   {"Request.Namespace": "redis-operator", "Request.Name": "redis-cluster"}
2021-10-22T12:21:38.699Z    INFO    controller_redis    Successfully got the ip for redis   {"Request.RedisManager.Namespace": "redis-operator","Request.RedisManager.Name": "redis-cluster-leader-0", "ip": "10.224.0.89"}
2021-10-22T12:21:38.699Z    INFO    controller_redis    Redis cluster nodes are listed  {"Request.RedisManager.Namespace": "redis-operator", "Request.RedisManager.Name": "redis-cluster", "Output": "ee230f94ecf074bf4a558e294b96a8d64b6223ca 10.224.0.89:6379@16379 myself,slave 3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 0 1634905298000 4 connected\n8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 10.224.0.84:6379@16379 master - 0 1634905297000 3 connected 10923-16383\n2b6d423cfdff772724c7ed6fb77778db2108273f 10.224.0.85:6379@16379 slave 8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 0 1634905298508 3 connected\n343c3fa9936870ee6ac5bfa845504f4cb88d8d2b 10.224.0.86:6379@16379 slave 24348601cf967c43f591a0afa57a483117809822 0 1634905296703 2 connected\n3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 10.224.0.88:6379@16379 master - 0 1634905297000 4 connected 0-5460\n24348601cf967c43f591a0afa57a483117809822 10.224.0.87:6379@16379 master - 0 1634905297706 2 connected 5461-10922\n"}
2021-10-22T12:21:38.699Z    INFO    controller_redis    Number of failed nodes in cluster   {"Request.RedisManager.Namespace": "redis-operator","Request.RedisManager.Name": "redis-cluster", "Failed Node Count": 0}

redis-operator version: v0.8.0, v0.9.0

Does this issue reproduce with the latest release? yes

What operating system and processor architecture are you using (kubectl version)?

kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.9", GitCommit:"2e808b7cb054ee242b68e62455323aa783991f03", GitTreeState:"clean", BuildDate:"2020-01-18T23:33:14Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.9", GitCommit:"2e808b7cb054ee242b68e62455323aa783991f03", GitTreeState:"clean", BuildDate:"2020-01-18T23:24:23Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

What did you do? Updated the resource request configuration, then ran helm upgrade -i redis-cluster redis-cluster --namespace redis-operator

What did you see instead?

[root@unity-170 redis]# kubectl exec -n redis-operator redis-cluster-follower-1 -- redis-cli -c cluster nodes
Defaulting container name to redis-cluster-follower.
Use 'kubectl describe pod/redis-cluster-follower-1 -n redis-operator' to see all of the containers in this pod.
3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 10.224.0.88:6379@16379 master - 0 1634905328000 4 connected 0-5460
8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 10.224.0.84:6379@16379 master - 0 1634905328000 3 connected 10923-16383
ee230f94ecf074bf4a558e294b96a8d64b6223ca 10.224.0.89:6379@16379 slave 3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 0 1634905328000 4 connected
2b6d423cfdff772724c7ed6fb77778db2108273f 10.224.0.85:6379@16379 slave 8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 0 1634905329091 3 connected
343c3fa9936870ee6ac5bfa845504f4cb88d8d2b 10.224.0.86:6379@16379 myself,slave 24348601cf967c43f591a0afa57a483117809822 0 1634905326000 2 connected
24348601cf967c43f591a0afa57a483117809822 10.224.0.87:6379@16379 master - 0 1634905327598 2 connected 5461-10922

[root@unity-170 redis]# kubectl logs -f -n redis-operator redis-cluster-follower-1 redis-cluster-follower | more
Redis is running without password which is not recommended
Starting redis service in cluster mode.....
10:C 22 Oct 2021 11:53:21.394 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
10:C 22 Oct 2021 11:53:21.394 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=10, just started
10:C 22 Oct 2021 11:53:21.394 # Configuration loaded
10:M 22 Oct 2021 11:53:21.395 monotonic clock: POSIX clock_gettime
10:M 22 Oct 2021 11:53:21.396 Node configuration loaded, I'm 343c3fa9936870ee6ac5bfa845504f4cb88d8d2b
10:M 22 Oct 2021 11:53:21.396 Running mode=cluster, port=6379.
10:M 22 Oct 2021 11:53:21.396 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
10:M 22 Oct 2021 11:53:21.396 # Server initialized
10:M 22 Oct 2021 11:53:21.397 Reading RDB preamble from AOF file...
10:M 22 Oct 2021 11:53:21.397 Loading RDB produced by version 6.2.5
10:M 22 Oct 2021 11:53:21.397 RDB age 27498 seconds
10:M 22 Oct 2021 11:53:21.397 RDB memory usage when created 2.48 Mb
10:M 22 Oct 2021 11:53:21.397 RDB has an AOF tail
10:M 22 Oct 2021 11:53:21.397 Reading the remaining AOF tail...
10:M 22 Oct 2021 11:53:21.397 DB loaded from append only file: 0.000 seconds
10:M 22 Oct 2021 11:53:21.397 Ready to accept connections
10:S 22 Oct 2021 11:53:21.398 Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
10:S 22 Oct 2021 11:53:21.398 Connecting to MASTER 10.224.0.81:6379
10:S 22 Oct 2021 11:53:21.398 MASTER <-> REPLICA sync started
10:S 22 Oct 2021 11:53:21.398 # Cluster state changed: ok
10:S 22 Oct 2021 11:53:24.464 # Error condition on socket for SYNC: Host is unreachable
10:S 22 Oct 2021 11:53:25.409 Connecting to MASTER 10.224.0.81:6379
10:S 22 Oct 2021 11:53:25.409 MASTER <-> REPLICA sync started
10:S 22 Oct 2021 11:53:27.536 # Error condition on socket for SYNC: Host is unreachable
10:S 22 Oct 2021 11:53:28.418 Connecting to MASTER 10.224.0.81:6379
10:S 22 Oct 2021 11:53:28.418 MASTER <-> REPLICA sync started
10:S 22 Oct 2021 11:53:30.608 # Error condition on socket for SYNC: Host is unreachable
10:S 22 Oct 2021 11:53:31.428 Connecting to MASTER 10.224.0.81:6379
10:S 22 Oct 2021 11:53:31.428 MASTER <-> REPLICA sync started

[root@unity-170 redis]# kubectl get pod -n redis-operator -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
redis-cluster-follower-0          2/2     Running   0          34m   10.224.0.88   unity-170
redis-cluster-follower-1          2/2     Running   0          35m   10.224.0.86   unity-170
redis-cluster-follower-2          2/2     Running   0          36m   10.224.0.85   unity-170
redis-cluster-leader-0            2/2     Running   0          34m   10.224.0.89   unity-170
redis-cluster-leader-1            2/2     Running   0          35m   10.224.0.87   unity-170
redis-cluster-leader-2            2/2     Running   0          36m   10.224.0.84   unity-170
redis-operator-64dd6b5578-tdx8v   1/1     Running   0          26m   10.224.0.90   unity-170

iamabhishek-dubey commented 3 years ago

This is more of a warning message that appears initially: after a restart the pod IP changes, and the old IP is replaced with the new one. I believe this will not affect overall cluster health. Have you checked that all Redis nodes are in the connected state by using the cluster nodes command?
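
The check suggested above can be scripted. The following is a minimal sketch, assuming the text printed by redis-cli cluster nodes has been captured into a string; the helper names parse_cluster_nodes and unhealthy_nodes are hypothetical and not part of redis-operator or Redis. It reads the flags field (third column) and the link-state field (eighth column) of each line and reports any node that is not connected or that carries a fail or fail? flag. The sample input is the output posted earlier in this issue.

```python
def parse_cluster_nodes(output):
    """Parse the text printed by `redis-cli cluster nodes` into a list of dicts."""
    nodes = []
    for line in output.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        nodes.append({
            "id": fields[0],
            "addr": fields[1].split("@")[0],   # ip:port without the cluster bus port
            "flags": fields[2].split(","),     # e.g. ["myself", "slave"]
            "master_id": None if fields[3] == "-" else fields[3],
            "link_state": fields[7],           # "connected" or "disconnected"
            "slots": fields[8:],               # empty for replicas
        })
    return nodes


def unhealthy_nodes(nodes):
    """Return nodes whose link is not 'connected' or that carry a fail/fail? flag."""
    bad = []
    for node in nodes:
        failing = any(flag in ("fail", "fail?") for flag in node["flags"])
        if node["link_state"] != "connected" or failing:
            bad.append(node)
    return bad


# Sample input: the `cluster nodes` output posted in this issue.
SAMPLE = """\
3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 10.224.0.88:6379@16379 master - 0 1634905328000 4 connected 0-5460
8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 10.224.0.84:6379@16379 master - 0 1634905328000 3 connected 10923-16383
ee230f94ecf074bf4a558e294b96a8d64b6223ca 10.224.0.89:6379@16379 slave 3fdbd0cf35ef0eac9abba92153b709dbe44b7c65 0 1634905328000 4 connected
2b6d423cfdff772724c7ed6fb77778db2108273f 10.224.0.85:6379@16379 slave 8e612d24cbf7e7cae4d5d002940e5a1b3d8b19ae 0 1634905329091 3 connected
343c3fa9936870ee6ac5bfa845504f4cb88d8d2b 10.224.0.86:6379@16379 myself,slave 24348601cf967c43f591a0afa57a483117809822 0 1634905326000 2 connected
24348601cf967c43f591a0afa57a483117809822 10.224.0.87:6379@16379 master - 0 1634905327598 2 connected 5461-10922
"""

if __name__ == "__main__":
    parsed = parse_cluster_nodes(SAMPLE)
    for node in unhealthy_nodes(parsed):
        print("unhealthy:", node["addr"], node["flags"], node["link_state"])
    print(len(parsed), "nodes checked")
```

Note that the link-state column describes the cluster bus link between nodes. It does not by itself confirm that a replica has finished synchronizing with its master, so the replica's own log may still be worth checking.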