Closed: wkd-woo closed this issue 2 months ago
What version of redis operator are you using?
redis-operator version: v0.16.0
Does this issue reproduce with the latest release?
What operating system and processor architecture are you using (kubectl version)?
$ kubectl version
Server Version: v1.24.17
What did you do?
Scaled out from 6 to 16 pods (3 to 8 shards), then scaled in to 10 pods (5 shards).
$ k exec -it pod/nosql-test-cluster-leader-0 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=339,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-1 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=329,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-2 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=332,expires=0,avg_ttl=0
In the beginning, I had 3 shards.
$ k exec -it pod/nosql-test-cluster-leader-0 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=122,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-1 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=125,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-2 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=120,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-3 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=131,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-4 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=119,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-5 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=122,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-6 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=121,expires=0,avg_ttl=0
$ k exec -it pod/nosql-test-cluster-leader-7 -- redis-cli -c INFO KEYSPACE
# Keyspace
db0:keys=140,expires=0,avg_ttl=0
Then I scaled out from 3 to 8 shards.
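For context, a sketch of how such a scale operation is typically triggered with this operator: patch `clusterSize` on the RedisCluster custom resource (the resource name is taken from the manifests in this issue; the exact spec path is an assumption based on the values shown below).

```shell
# Hypothetical sketch: scale the cluster to 8 shards by patching the CR.
# Resource kind/name and spec field assumed from the issue's manifests.
kubectl patch rediscluster nosql-test-cluster --type merge \
  -p '{"spec":{"clusterSize":8}}'
```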
$ redis-cli -c CLUSTER NODES
a6c608cf14bd9e6e0de0ab6eaf2cbbdaa1953d73 10.240.2.249:6379@16379 slave 835ec3a49b75f342fc264d5ac7fc0a56db43996c 0 1712128844000 7 connected
5708221e843c8861e2fd88d2a1424ef20193e5d2 10.240.6.45:6379@16379 slave 67bb1fc41c4a4ee42dce1cb7f10e38e00c430b2c 0 1712128844269 5 connected
c1bc149baf71050a8ad8f45b002d77507a75d9b4 10.240.5.19:6379@16379 slave d1330499033c5c01ca7811fa6c4c6dc5d9a00297 0 1712128844269 1 connected
b7149d5a20ef09a05bd5bef1b895b5bb54b42913 10.240.10.224:6379@16379 master - 0 1712128844572 2 connected 8875-10922
3c9c58d3011f508725160a9a51bd6e943d09f79e 10.240.10.175:6379@16379 slave 0f407db4a5916db6bd8c6eda044d35eed0eff5f7 0 1712128844000 4 connected
6bb9ddcfb6f0eeb1747f7e74e2ae8a4d310a3574 10.240.7.70:6379@16379 master - 0 1712128844873 6 connected 956-1364 2185-2730 5461 7647-8192 13108-13653
b2c2189d20d8d33692ba1b1f3244d5e227bccc91 10.240.9.78:6379@16379 master - 0 1712128844269 12 connected 0-295 394-549 820-955 1485-1776 3121-3412 5852-6143 8583-8874 14044-14335
835ec3a49b75f342fc264d5ac7fc0a56db43996c 10.240.3.69:6379@16379 master - 0 1712128844000 7 connected 296-393 550-819 1365-1484 2731-3120 5462-5851 8193-8582 13654-14043
917dc82a73ad409f104465e9b08c68c32e7e213f 10.240.4.64:6379@16379 slave b7149d5a20ef09a05bd5bef1b895b5bb54b42913 0 1712128844000 2 connected
d1330499033c5c01ca7811fa6c4c6dc5d9a00297 10.240.4.50:6379@16379 myself,master - 0 1712128843000 1 connected 3413-5460
0f407db4a5916db6bd8c6eda044d35eed0eff5f7 10.240.5.246:6379@16379 master - 0 1712128844000 4 connected 6144-6826 10923-12287
2779ce8123407da6e510211113b44a9d47d33eea 10.240.7.25:6379@16379 slave b2c2189d20d8d33692ba1b1f3244d5e227bccc91 0 1712128844873 12 connected
f631c80abc79659af0df755a6d3eefd2a1013da1 10.240.1.121:6379@16379 slave 6bb9ddcfb6f0eeb1747f7e74e2ae8a4d310a3574 0 1712128844572 6 connected
67bb1fc41c4a4ee42dce1cb7f10e38e00c430b2c 10.240.6.52:6379@16379 master - 0 1712128844270 5 connected 1777-2184 6827-7646 12288-13107
33bc3ec322aa3f7c7a82a9d1109234e58c0627e8 10.240.8.234:6379@16379 master - 0 1712128844572 3 connected 14336-16383
35860770534767ec79bf92064cfb57cc54f5bb0a 10.240.8.165:6379@16379 slave 33bc3ec322aa3f7c7a82a9d1109234e58c0627e8 0 1712128844873 3 connected
There were now 8 shards (8 masters, 8 slaves).
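As a sanity check on the CLUSTER NODES output: Redis Cluster always has a fixed 16384 hash slots, so with 8 masters each one should own roughly 2048 slots (the fragmented ranges per master above still sum to about that).

```shell
# Redis Cluster divides a fixed 16384 hash slots among the masters.
total_slots=16384
masters=8
echo $((total_slots / masters))   # expected slots per master
```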
Then I tried to scale in to 5 shards.
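For reference, a safe scale-in has to move all slots off each master being removed before deleting it; a sketch of the manual redis-cli steps the operator would be expected to automate (endpoint and node IDs are placeholders, not taken from this cluster's state):

```shell
# Sketch of the manual scale-in procedure (placeholders in angle brackets).
# 1. Reshard every slot away from the master being removed.
redis-cli --cluster reshard 10.240.4.50:6379 \
  --cluster-from <removed-master-id> --cluster-to <remaining-master-id> \
  --cluster-slots 2048 --cluster-yes
# 2. Only once it owns zero slots, remove the node from the cluster.
redis-cli --cluster del-node 10.240.4.50:6379 <removed-master-id>
```

Deleting a master that still owns slots (or losing it before its slots migrate) is exactly the kind of operation that leaves a cluster in the broken state shown below.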
$ k exec -it pod/nosql-test-cluster-leader-0 -- redis-cli -c ROLE
1) "master"
2) (integer) 124311
3) 1) 1) "10.240.5.19"
      2) "6379"
      3) "124311"
   2) 1) "10.240.6.45"
      2) "6379"
      3) "124311"
   3) 1) "10.240.10.175"
      2) "6379"
      3) "124311"
   4) 1) "10.240.8.165"
      2) "6379"
      3) "124311"
   5) 1) "10.240.4.64"
      2) "6379"
      3) "124311"
$ k exec -it pod/nosql-test-cluster-leader-0 -- redis-cli -c CLUSTER NODES
a6c608cf14bd9e6e0de0ab6eaf2cbbdaa1953d73 10.240.2.249:6379@16379 slave,fail d1330499033c5c01ca7811fa6c4c6dc5d9a00297 1712129690954 1712129689647 20 disconnected
5708221e843c8861e2fd88d2a1424ef20193e5d2 10.240.6.45:6379@16379 slave d1330499033c5c01ca7811fa6c4c6dc5d9a00297 0 1712130184257 20 connected
c1bc149baf71050a8ad8f45b002d77507a75d9b4 10.240.5.19:6379@16379 slave d1330499033c5c01ca7811fa6c4c6dc5d9a00297 0 1712130184659 20 connected
3c9c58d3011f508725160a9a51bd6e943d09f79e 10.240.10.175:6379@16379 slave d1330499033c5c01ca7811fa6c4c6dc5d9a00297 0 1712130184257 20 connected
917dc82a73ad409f104465e9b08c68c32e7e213f 10.240.4.64:6379@16379 slave d1330499033c5c01ca7811fa6c4c6dc5d9a00297 0 1712130184257 20 connected
d1330499033c5c01ca7811fa6c4c6dc5d9a00297 10.240.4.50:6379@16379 myself,master - 0 1712130183000 20 connected 0-16383
2779ce8123407da6e510211113b44a9d47d33eea 10.240.7.25:6379@16379 slave,fail d1330499033c5c01ca7811fa6c4c6dc5d9a00297 1712129660583 1712129659276 20 connected
f631c80abc79659af0df755a6d3eefd2a1013da1 10.240.1.121:6379@16379 slave,fail d1330499033c5c01ca7811fa6c4c6dc5d9a00297 1712129722936 1712129721629 20 connected
35860770534767ec79bf92064cfb57cc54f5bb0a 10.240.8.165:6379@16379 slave d1330499033c5c01ca7811fa6c4c6dc5d9a00297 0 1712130184257 20 connected
The cluster broke during the scale-in operation, even when testing with the latest version.
$ k get all
NAME                                READY   STATUS    RESTARTS   AGE
pod/nosql-test-cluster-follower-0   2/2     Running   0          52m
pod/nosql-test-cluster-follower-1   2/2     Running   0          52m
pod/nosql-test-cluster-follower-2   2/2     Running   0          52m
pod/nosql-test-cluster-follower-3   2/2     Running   0          41m
pod/nosql-test-cluster-follower-4   2/2     Running   0          41m
pod/nosql-test-cluster-leader-0     2/2     Running   0          53m
pod/nosql-test-cluster-leader-1     2/2     Running   0          53m
pod/nosql-test-cluster-leader-2     2/2     Running   0          53m
pod/nosql-test-cluster-leader-3     2/2     Running   0          42m
pod/nosql-test-cluster-leader-4     2/2     Running   0          42m
...
NAME                                           READY   AGE
statefulset.apps/nosql-test-cluster-follower   5/5     52m
statefulset.apps/nosql-test-cluster-leader     5/5     53m
resources
---
redisCluster:
  name: "nosql-test-cluster"
  clusterSize: 10
  clusterVersion: v7
  persistenceEnabled: false
  image: <REPOSITOR:redis>
  tag: v7.0.12
  imagePullPolicy: IfNotPresent
  imagePullSecrets: {}
  # - name: Secret with Registry credentials
  redisSecret:
    secretName: ""
    secretKey: ""
  resources:
    limits:
      cpu: 101m
      memory: 2Gi
  leader:
    replicas: 5
    serviceType: ClusterIP
    affinity:
    tolerations: []
    nodeSelector:
    securityContext: {}
    pdb:
      enabled: false
      maxUnavailable: 1
      minAvailable: 1
  follower:
    replicas: 5
    serviceType: ClusterIP
    affinity:
    tolerations: []
    nodeSelector:
    securityContext: {}
    pdb:
      enabled: false
      maxUnavailable: 1
      minAvailable: 1
... and the manifests.
What did you expect to see?
The Redis cluster should be scaled in successfully.
What did you see instead?
The Redis cluster is broken: only 1 master remains, holding all 16384 slots, with every other node demoted to a replica of it.
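A quick way to confirm this state on any node is to check the cluster-level counters that CLUSTER INFO reports:

```shell
# Check cluster health and slot coverage from any pod in the cluster.
redis-cli -c CLUSTER INFO | \
  grep -E 'cluster_state|cluster_slots_assigned|cluster_known_nodes'
```

In a healthy 5-shard cluster this should show `cluster_state:ok` with all 16384 slots assigned across 5 masters, rather than everything concentrated on a single node.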
fixed by #885