Open amalic opened 1 month ago
When I apply the following CRD via kubectl -n redis-test apply -f redis-cluster.yaml (from your examples folder, including the affinity example), the cluster does not seem to be set up properly.
redis-cluster.yaml
> kubectl -n redis-test apply -f - <<EOF
---
# see https://github.com/OT-CONTAINER-KIT/redis-operator/blob/master/config/crd/bases/redis.redis.opstreelabs.in_redisclusters.yaml
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisCluster
metadata:
  name: redis-cluster
spec:
  clusterSize: 3
  clusterVersion: v7
  persistenceEnabled: true
  podSecurityContext:
    runAsUser: 1000
    fsGroup: 1000
  kubernetesConfig:
    image: quay.io/opstree/redis:v7.0.12
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 101m
        memory: 128Mi
      limits:
        cpu: 101m
        memory: 128Mi
    # redisSecret:
    #   name: redis-secret
    #   key: password
    # imagePullSecrets:
    #   - name: regcred
  redisExporter:
    enabled: false
    image: quay.io/opstree/redis-exporter:v1.44.0
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 100m
        memory: 128Mi
    # Environment Variables for Redis Exporter
    # env:
    # - name: REDIS_EXPORTER_INCL_SYSTEM_METRICS
    #   value: "true"
    # - name: UI_PROPERTIES_FILE_NAME
    #   valueFrom:
    #     configMapKeyRef:
    #       name: game-demo
    #       key: ui_properties_file_name
    # - name: SECRET_USERNAME
    #   valueFrom:
    #     secretKeyRef:
    #       name: mysecret
    #       key: username
  # redisLeader:
  #   redisConfig:
  #     additionalRedisConfig: redis-external-config
  # redisFollower:
  #   redisConfig:
  #     additionalRedisConfig: redis-external-config
  storage:
    volumeClaimTemplate:
      spec:
        # storageClassName: standard
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 1Gi
    nodeConfVolume: true
    nodeConfVolumeClaimTemplate:
      spec:
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 1Gi
  redisLeader:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values:
                    - redis-cluster-leader
                    #- redis-cluster-follower
            topologyKey: "kubernetes.io/hostname"
  redisFollower:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values:
                    - redis-cluster-follower
                    #- redis-cluster-leader
            topologyKey: "kubernetes.io/hostname"
EOF
After a few minutes all Pods are running
> kubectl -n redis-test get pods
NAME                       READY   STATUS    RESTARTS   AGE
redis-cluster-follower-0   1/1     Running   0          2m16s
redis-cluster-follower-1   1/1     Running   0          2m6s
redis-cluster-follower-2   1/1     Running   0          2m1s
redis-cluster-leader-0     1/1     Running   0          7m45s
redis-cluster-leader-1     1/1     Running   0          5m58s
redis-cluster-leader-2     1/1     Running   0          4m10s
But the cluster does not seem to be healthy
> kubectl -n redis-test exec -it redis-cluster-leader-0 -- redis-cli cluster nodes
5f801647e3022a2c041d67d8b0227592bc34787b 10.83.38.23:6379@16379,,tls-port=0,shard-id=11869bb1a4f17f8d00832ec4f163893b5d56f62c master,fail? - 1716285202756 1716285201762 2 disconnected 5461-10922
3bf6bc1fbd04861450ac3e3de76e01afc26da5cb 10.83.38.70:6379@16379 myself,master - 0 1716285201762 1 connected 0-5460
4dd5317f71f2bbca0c6fbc3796136ecac7c36d50 :0@0,,tls-port=0,shard-id=967322fb8c9f09c34f8a4a4c0988ed97bb54f27f master,fail?,noaddr - 1716285201762 1716285201762 3 disconnected 10923-16383

> kubectl -n redis-test exec -it redis-cluster-leader-0 -n fbp-development -- redis-cli cluster info
cluster_state:fail
cluster_slots_assigned:16384
cluster_slots_ok:5461
cluster_slots_pfail:10923
cluster_slots_fail:0
cluster_known_nodes:3
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:1
cluster_stats_messages_ping_sent:1
cluster_stats_messages_sent:1
cluster_stats_messages_received:0
total_cluster_links_buffer_limit_exceeded:0

> kubectl -n redis-test exec -it redis-cluster-leader-1 -- redis-cli cluster nodes
4dd5317f71f2bbca0c6fbc3796136ecac7c36d50 :0@0 master,noaddr - 1716285308371 1716285308371 3 disconnected 10923-16383
3bf6bc1fbd04861450ac3e3de76e01afc26da5cb 10.83.38.21:6379@16379 master,fail? - 1716285308371 1716285308371 1 connected 0-5460
5f801647e3022a2c041d67d8b0227592bc34787b 10.83.38.240:6379@16379 myself,master - 0 1716285308371 2 connected 5461-10922

> kubectl -n redis-test exec -it redis-cluster-leader-1 -- redis-cli cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:10923
cluster_slots_pfail:5461
cluster_slots_fail:0
cluster_known_nodes:3
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:2
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
total_cluster_links_buffer_limit_exceeded:0

> kubectl -n redis-test exec -it redis-cluster-leader-2 -- redis-cli cluster nodes
3bf6bc1fbd04861450ac3e3de76e01afc26da5cb 10.83.38.144:6379@16379 master,fail? - 1716285409468 1716285409468 1 connected 0-5460
5f801647e3022a2c041d67d8b0227592bc34787b :0@0 master,fail?,noaddr - 1716285409468 1716285409468 2 disconnected 5461-10922
4dd5317f71f2bbca0c6fbc3796136ecac7c36d50 10.83.38.245:6379@16379 myself,master - 0 1716285409468 3 connected 10923-16383

> kubectl -n redis-test exec -it redis-cluster-leader-2 -- redis-cli cluster info
cluster_state:fail
cluster_slots_assigned:16384
cluster_slots_ok:5461
cluster_slots_pfail:10923
cluster_slots_fail:0
cluster_known_nodes:3
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:3
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
total_cluster_links_buffer_limit_exceeded:0
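Looking at the three `cluster nodes` views side by side, the same node ID shows up with different IP addresses. For example, node 3bf6bc1fbd04861450ac3e3de76e01afc26da5cb is listed at 10.83.38.70 by itself, at 10.83.38.21 by leader-1 and at 10.83.38.144 by leader-2. The outputs were captured roughly 100 seconds apart, so this may simply reflect pods restarting in between, or it may point to stale addresses being kept by the nodes; either reading is an assumption on my side, not something the logs prove. Below is a small sketch in Python (not part of the operator) that makes the comparison repeatable. The names `ClusterNode`, `parse_cluster_nodes` and `address_conflicts` are made up for this example; the field positions follow the documented `CLUSTER NODES` line format.

```python
from dataclasses import dataclass


@dataclass
class ClusterNode:
    node_id: str
    ip: str            # empty string when the node has no known address (":0@0")
    flags: list        # e.g. ["master", "fail?", "noaddr"]
    link_state: str    # "connected" or "disconnected"
    slots: list        # slot ranges as printed, e.g. ["0-5460"]


def parse_cluster_nodes(text):
    """Parse the text output of `redis-cli cluster nodes` into ClusterNode records."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        # fields[1] looks like ip:port@cport[,hostname][,key=value...]
        endpoint = fields[1].split(",")[0]
        ip = endpoint.rsplit(":", 1)[0]
        nodes.append(
            ClusterNode(
                node_id=fields[0],
                ip=ip,
                flags=fields[2].split(","),
                link_state=fields[7],
                slots=fields[8:],
            )
        )
    return nodes


def address_conflicts(views):
    """Return {node_id: set_of_ips} for nodes reported under more than one IP.

    `views` maps an arbitrary label (for example the pod name the output was
    taken from) to the raw `cluster nodes` text captured on that pod.
    """
    seen = {}
    for text in views.values():
        for node in parse_cluster_nodes(text):
            if node.ip:  # ignore "noaddr" entries
                seen.setdefault(node.node_id, set()).add(node.ip)
    return {node_id: ips for node_id, ips in seen.items() if len(ips) > 1}
```

To use it, save the output of each `redis-cli cluster nodes` call into a string, pass them in as `{"redis-cluster-leader-0": out0, ...}`, and compare the reported IPs with what `kubectl -n redis-test get pods -o wide` currently shows.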
Logs from namespace where Redis is running (last 15m)
[ { "line": "{\"level\":\"error\",\"ts\":1716289297.2381048,\"logger\":\"controller_redis\",\"msg\":\"Could not execute command\",\"Request.RedisManager.Namespace\":\"redis-test\",\"Request.RedisManager.Name\":\"redis-cluster\",\"Command\":[\"redis-cli\",\"--cluster\",\"add-node\",\"redis-cluster-follower-2.redis-cluster-follower-headless.redis-test.svc:6379\",\"redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379\",\"--cluster-slave\"],\"Output\":\">>> Adding node redis-cluster-follower-2.redis-cluster-follower-headless.redis-test.svc:6379 to cluster redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379\\n>>> Performing Cluster Check (using node redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379)\\nM: 4dd5317f71f2bbca0c6fbc3796136ecac7c36d50 redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379\\n slots:[10923-16383] (5461 slots) master\\n[OK] All nodes agree about slots configuration.\\n>>> Check for open slots...\\n>>> Check slots coverage...\\n[ERR] Not all 16384 slots are covered by nodes.\\n\\n\",\"Error\":\"\",\"error\":\"command terminated with exit code 
1\",\"stacktrace\":\"github.com/OT-CONTAINER-KIT/redis-operator/k8sutils.ExecuteRedisReplicationCommand\\n\\t/workspace/k8sutils/redis.go:187\\ngithub.com/OT-CONTAINER-KIT/redis-operator/controllers.(*RedisClusterReconciler).Reconcile\\n\\t/workspace/controllers/rediscluster_controller.go:196\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716289297238339212", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716289297.1535783,\"logger\":\"controller_redis\",\"msg\":\"Could not execute command\",\"Request.RedisManager.Namespace\":\"redis-test\",\"Request.RedisManager.Name\":\"redis-cluster\",\"Command\":[\"redis-cli\",\"--cluster\",\"add-node\",\"redis-cluster-follower-1.redis-cluster-follower-headless.redis-test.svc:6379\",\"redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379\",\"--cluster-slave\"],\"Output\":\">>> Adding node 
redis-cluster-follower-1.redis-cluster-follower-headless.redis-test.svc:6379 to cluster redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379\\n>>> Performing Cluster Check (using node redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379)\\nM: 5f801647e3022a2c041d67d8b0227592bc34787b redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379\\n slots:[5461-10922] (5462 slots) master\\n[OK] All nodes agree about slots configuration.\\n>>> Check for open slots...\\n>>> Check slots coverage...\\n[ERR] Not all 16384 slots are covered by nodes.\\n\\n\",\"Error\":\"Could not connect to Redis at 10.83.38.21:6379: Operation timed out\\n\",\"error\":\"command terminated with exit code 1\",\"stacktrace\":\"github.com/OT-CONTAINER-KIT/redis-operator/k8sutils.ExecuteRedisReplicationCommand\\n\\t/workspace/k8sutils/redis.go:187\\ngithub.com/OT-CONTAINER-KIT/redis-operator/controllers.(*RedisClusterReconciler).Reconcile\\n\\t/workspace/controllers/rediscluster_controller.go:196\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716289297153830127", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": 
"/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716289166.0818326,\"logger\":\"controller_redis\",\"msg\":\"Could not execute command\",\"Request.RedisManager.Namespace\":\"redis-test\",\"Request.RedisManager.Name\":\"redis-cluster\",\"Command\":[\"redis-cli\",\"--cluster\",\"add-node\",\"redis-cluster-follower-0.redis-cluster-follower-headless.redis-test.svc:6379\",\"redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379\",\"--cluster-slave\"],\"Output\":\">>> Adding node redis-cluster-follower-0.redis-cluster-follower-headless.redis-test.svc:6379 to cluster redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379\\n>>> Performing Cluster Check (using node redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379)\\nM: 3bf6bc1fbd04861450ac3e3de76e01afc26da5cb redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379\\n slots:[0-5460] (5461 slots) master\\n[OK] All nodes agree about slots configuration.\\n>>> Check for open slots...\\n>>> Check slots coverage...\\n[ERR] Not all 16384 slots are covered by nodes.\\n\\n\",\"Error\":\"Could not connect to Redis at 10.83.38.23:6379: Operation timed out\\n\",\"error\":\"command terminated with exit code 
1\",\"stacktrace\":\"github.com/OT-CONTAINER-KIT/redis-operator/k8sutils.ExecuteRedisReplicationCommand\\n\\t/workspace/k8sutils/redis.go:187\\ngithub.com/OT-CONTAINER-KIT/redis-operator/controllers.(*RedisClusterReconciler).Reconcile\\n\\t/workspace/controllers/rediscluster_controller.go:196\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716289166082025014", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716289035.1559641,\"logger\":\"controller.rediscluster\",\"msg\":\"Reconciler error\",\"reconciler group\":\"redis.redis.opstreelabs.in\",\"reconciler kind\":\"RedisCluster\",\"name\":\"redis-cluster\",\"namespace\":\"redis-test\",\"error\":\"Operation cannot be fulfilled on redisclusters.redis.redis.opstreelabs.in \\\"redis-cluster\\\": the object has been modified; please apply your changes to the latest version and try 
again\",\"stacktrace\":\"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716289035156114124", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716289035.1558983,\"logger\":\"controller_redis\",\"msg\":\"Failed to update status\",\"Request.Namespace\":\"redis-test\",\"Request.Name\":\"redis-cluster\",\"error\":\"Operation cannot be fulfilled on redisclusters.redis.redis.opstreelabs.in \\\"redis-cluster\\\": the object has been modified; please apply your changes to the latest version and try 
again\",\"stacktrace\":\"github.com/OT-CONTAINER-KIT/redis-operator/controllers.(*RedisClusterReconciler).Reconcile\\n\\t/workspace/controllers/rediscluster_controller.go:221\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716289035156047792", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716289035.1211607,\"logger\":\"controller_redis\",\"msg\":\"Could not execute command\",\"Request.RedisManager.Namespace\":\"redis-test\",\"Request.RedisManager.Name\":\"redis-cluster\",\"Command\":[\"redis-cli\",\"--cluster\",\"add-node\",\"redis-cluster-follower-2.redis-cluster-follower-headless.redis-test.svc:6379\",\"redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379\",\"--cluster-slave\"],\"Output\":\">>> Adding node redis-cluster-follower-2.redis-cluster-follower-headless.redis-test.svc:6379 to cluster 
redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379\\n>>> Performing Cluster Check (using node redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379)\\nM: 4dd5317f71f2bbca0c6fbc3796136ecac7c36d50 redis-cluster-leader-2.redis-cluster-leader-headless.redis-test.svc:6379\\n slots:[10923-16383] (5461 slots) master\\n[OK] All nodes agree about slots configuration.\\n>>> Check for open slots...\\n>>> Check slots coverage...\\n[ERR] Not all 16384 slots are covered by nodes.\\n\\n\",\"Error\":\"\",\"error\":\"command terminated with exit code 1\",\"stacktrace\":\"github.com/OT-CONTAINER-KIT/redis-operator/k8sutils.ExecuteRedisReplicationCommand\\n\\t/workspace/k8sutils/redis.go:187\\ngithub.com/OT-CONTAINER-KIT/redis-operator/controllers.(*RedisClusterReconciler).Reconcile\\n\\t/workspace/controllers/rediscluster_controller.go:196\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716289035121543919", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": 
"redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716289035.0111623,\"logger\":\"controller_redis\",\"msg\":\"Could not execute command\",\"Request.RedisManager.Namespace\":\"redis-test\",\"Request.RedisManager.Name\":\"redis-cluster\",\"Command\":[\"redis-cli\",\"--cluster\",\"add-node\",\"redis-cluster-follower-1.redis-cluster-follower-headless.redis-test.svc:6379\",\"redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379\",\"--cluster-slave\"],\"Output\":\">>> Adding node redis-cluster-follower-1.redis-cluster-follower-headless.redis-test.svc:6379 to cluster redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379\\n>>> Performing Cluster Check (using node redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379)\\nM: 5f801647e3022a2c041d67d8b0227592bc34787b redis-cluster-leader-1.redis-cluster-leader-headless.redis-test.svc:6379\\n slots:[5461-10922] (5462 slots) master\\n[OK] All nodes agree about slots configuration.\\n>>> Check for open slots...\\n>>> Check slots coverage...\\n[ERR] Not all 16384 slots are covered by nodes.\\n\\n\",\"Error\":\"Could not connect to Redis at 10.83.38.21:6379: Operation timed out\\n\",\"error\":\"command terminated with exit code 
1\",\"stacktrace\":\"github.com/OT-CONTAINER-KIT/redis-operator/k8sutils.ExecuteRedisReplicationCommand\\n\\t/workspace/k8sutils/redis.go:187\\ngithub.com/OT-CONTAINER-KIT/redis-operator/controllers.(*RedisClusterReconciler).Reconcile\\n\\t/workspace/controllers/rediscluster_controller.go:196\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716289035011394873", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716288903.9375806,\"logger\":\"controller_redis\",\"msg\":\"Could not execute command\",\"Request.RedisManager.Namespace\":\"redis-test\",\"Request.RedisManager.Name\":\"redis-cluster\",\"Command\":[\"redis-cli\",\"--cluster\",\"add-node\",\"redis-cluster-follower-0.redis-cluster-follower-headless.redis-test.svc:6379\",\"redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379\",\"--cluster-slave\"],\"Output\":\">>> Adding node 
redis-cluster-follower-0.redis-cluster-follower-headless.redis-test.svc:6379 to cluster redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379\\n>>> Performing Cluster Check (using node redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379)\\nM: 3bf6bc1fbd04861450ac3e3de76e01afc26da5cb redis-cluster-leader-0.redis-cluster-leader-headless.redis-test.svc:6379\\n slots:[0-5460] (5461 slots) master\\n[OK] All nodes agree about slots configuration.\\n>>> Check for open slots...\\n>>> Check slots coverage...\\n[ERR] Not all 16384 slots are covered by nodes.\\n\\n\",\"Error\":\"Could not connect to Redis at 10.83.38.23:6379: Operation timed out\\n\",\"error\":\"command terminated with exit code 1\",\"stacktrace\":\"github.com/OT-CONTAINER-KIT/redis-operator/k8sutils.ExecuteRedisReplicationCommand\\n\\t/workspace/k8sutils/redis.go:187\\ngithub.com/OT-CONTAINER-KIT/redis-operator/controllers.(*RedisClusterReconciler).Reconcile\\n\\t/workspace/controllers/rediscluster_controller.go:196\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716288903937925905", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": 
"/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716288578.0861685,\"logger\":\"controller.rediscluster\",\"msg\":\"Reconciler error\",\"reconciler group\":\"redis.redis.opstreelabs.in\",\"reconciler kind\":\"RedisCluster\",\"name\":\"redis-cluster\",\"namespace\":\"redis-test\",\"error\":\"statefulsets.apps \\\"redis-cluster-follower\\\" not found\",\"stacktrace\":\"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716288578086231831", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } }, { "line": "{\"level\":\"error\",\"ts\":1716288496.096395,\"logger\":\"controller.rediscluster\",\"msg\":\"Reconciler error\",\"reconciler group\":\"redis.redis.opstreelabs.in\",\"reconciler kind\":\"RedisCluster\",\"name\":\"redis-cluster\",\"namespace\":\"redis-test\",\"error\":\"statefulsets.apps \\\"redis-cluster-follower\\\" not 
found\",\"stacktrace\":\"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227\"}", "timestamp": "1716288496096520031", "fields": { "app": "redis-operator", "container": "redis-operator", "filename": "/var/log/pods/redis-system_redis-operator-5f5c58c5cd-sn4b8_64a5c185-07dc-4c49-85b3-2d9d35c91bf4/redis-operator/0.log", "job": "redis-system/redis-operator", "namespace": "redis-system", "node_name": "ip-10-83-39-52.eu-central-1.compute.internal", "pod": "redis-operator-5f5c58c5cd-sn4b8", "stream": "stderr" } } ]
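The operator log export above is long and repetitive, so here is a rough Python sketch for condensing it into distinct error messages with counts. It assumes the export has the shape shown above: a JSON array of objects whose `line` field is itself a JSON-encoded log record with `level`, `msg` and either `Error` or `error`. The function name `summarize_operator_errors` is invented for this example. `strict=False` is only there so that line breaks introduced inside strings when copy-pasting do not abort parsing.

```python
import json
from collections import Counter


def summarize_operator_errors(raw):
    """Count (msg, detail) pairs for error-level records in a log export.

    `raw` is the JSON array as text. Entries whose `line` is not a JSON
    object (for example plain Redis server log lines) are skipped.
    """
    counts = Counter()
    for entry in json.loads(raw, strict=False):
        try:
            record = json.loads(entry["line"], strict=False)
        except (KeyError, TypeError, json.JSONDecodeError):
            continue
        if not isinstance(record, dict) or record.get("level") != "error":
            continue
        # Prefer the command's stderr ("Error") and fall back to the Go error.
        detail = str(record.get("Error") or record.get("error") or "").strip()
        counts[(record.get("msg", ""), detail)] += 1
    return counts
```

Calling `summarize_operator_errors(open("operator-logs.json").read()).most_common()` (the file name is a placeholder) lists the most frequent failures first, which makes it easier to see that the `add-node` attempts keep failing with a handful of recurring messages.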
Logs from created Redis Cluster (last minute)
[ { "line": "11:S 21 May 2024 11:08:20.744 * MASTER <-> REPLICA sync started", "timestamp": "1716289700745066656", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-1_2c4ffe27-be02-4ad0-92f8-fb34d4960d9f/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-233.eu-central-1.compute.internal", "pod": "redis-cluster-follower-1", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:20.744 * Connecting to MASTER 10.83.38.138:6379", "timestamp": "1716289700745041433", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-1_2c4ffe27-be02-4ad0-92f8-fb34d4960d9f/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-233.eu-central-1.compute.internal", "pod": "redis-cluster-follower-1", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:19.902 # Error condition on socket for SYNC: Host is unreachable", "timestamp": "1716289699902887938", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-1_2c4ffe27-be02-4ad0-92f8-fb34d4960d9f/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-233.eu-central-1.compute.internal", "pod": "redis-cluster-follower-1", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:18.146 * MASTER <-> REPLICA sync started", "timestamp": "1716289698146509035", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", 
"node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:18.146 * Connecting to MASTER 10.83.38.149:6379", "timestamp": "1716289698146321832", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:17.619 # Error condition on socket for SYNC: Host is unreachable", "timestamp": "1716289697620012956", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:15.139 * MASTER <-> REPLICA sync started", "timestamp": "1716289695139219562", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:15.138 * Connecting to MASTER 10.83.38.149:6379", "timestamp": "1716289695139027735", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": 
"/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:14.725 * MASTER <-> REPLICA sync started", "timestamp": "1716289694726068763", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-1_2c4ffe27-be02-4ad0-92f8-fb34d4960d9f/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-233.eu-central-1.compute.internal", "pod": "redis-cluster-follower-1", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:14.725 * Reconnecting to MASTER 10.83.38.138:6379 after failure", "timestamp": "1716289694726051903", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-1_2c4ffe27-be02-4ad0-92f8-fb34d4960d9f/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-233.eu-central-1.compute.internal", "pod": "redis-cluster-follower-1", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:14.725 # Timeout connecting to the MASTER...", "timestamp": "1716289694725977882", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-1_2c4ffe27-be02-4ad0-92f8-fb34d4960d9f/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-233.eu-central-1.compute.internal", "pod": "redis-cluster-follower-1", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:14.195 # Error condition on socket for SYNC: Host is unreachable", 
"timestamp": "1716289694195928918", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:11.129 * MASTER <-> REPLICA sync started", "timestamp": "1716289691129271867", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:11.128 * Connecting to MASTER 10.83.38.149:6379", "timestamp": "1716289691129167648", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:10.195 # Error condition on socket for SYNC: Host is unreachable", "timestamp": "1716289690195907882", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": 
"redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:06.099 * MASTER <-> REPLICA sync started", "timestamp": "1716289686099814198", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:06.099 * Reconnecting to MASTER 10.83.38.149:6379 after failure", "timestamp": "1716289686099794197", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:08:06.099 # Timeout connecting to the MASTER...", "timestamp": "1716289686099690676", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-0_edb799d6-0963-46be-8504-b8259913b144/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-18.eu-central-1.compute.internal", "pod": "redis-cluster-follower-0", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:07:34.968 * MASTER <-> REPLICA sync started", "timestamp": "1716289654969022361", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-2_c825406e-1dfd-4591-af46-19fede936bd2/redis-cluster-follower/0.log", "job": 
"redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-242.eu-central-1.compute.internal", "pod": "redis-cluster-follower-2", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:07:34.968 * Reconnecting to MASTER 10.83.38.73:6379 after failure", "timestamp": "1716289654969018732", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-2_c825406e-1dfd-4591-af46-19fede936bd2/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-242.eu-central-1.compute.internal", "pod": "redis-cluster-follower-2", "stream": "stdout" } }, { "line": "11:S 21 May 2024 11:07:34.968 # Timeout connecting to the MASTER...", "timestamp": "1716289654968989962", "fields": { "app": "redis-cluster-follower", "container": "redis-cluster-follower", "filename": "/var/log/pods/redis-test_redis-cluster-follower-2_c825406e-1dfd-4591-af46-19fede936bd2/redis-cluster-follower/0.log", "job": "redis-test/redis-cluster-follower", "namespace": "redis-test", "node_name": "ip-10-83-38-242.eu-central-1.compute.internal", "pod": "redis-cluster-follower-2", "stream": "stdout" } } ]
When I apply the following CRD via
kubectl -n redis-test apply -f redis-cluster.yaml
(from your examples folder, including the affinity example), the cluster does not seem to be set up properly.
redis-cluster.yaml
After a few minutes all Pods are running
But the cluster does not seem to be healthy
Logs from namespace where Redis is running (last 15m)
Logs from created Redis Cluster (last minute)
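A log export like this one is hard to scan by eye, so here is a minimal sketch that condenses it per pod. It assumes the export is a list of objects with a `line` string and a `fields.pod` entry, as in the dump in this issue. The names `summarize`, `LINE_RE` and `MASTER_RE` are hypothetical helpers for illustration, not part of the operator or of Redis.

```python
import re
from collections import Counter, defaultdict

# Redis log line: "<pid>:<role> <day> <month> <year> <time> <level> <message>"
LINE_RE = re.compile(r"^(\d+):(\w) (\d+ \w+ \d+ [\d:.]+) ([.\-*#]) (.*)$")
# Address of the master a replica is trying to reach, e.g. "MASTER 10.83.38.149:6379"
MASTER_RE = re.compile(r"MASTER (\d+\.\d+\.\d+\.\d+:\d+)")


def summarize(entries):
    """Group log entries by pod: which masters it targets and which warnings it logs."""
    summary = defaultdict(lambda: {"masters": set(), "warnings": Counter()})
    for entry in entries:
        match = LINE_RE.match(entry["line"])
        if match is None:
            continue  # skip lines that are not in the Redis log format
        level, message = match.group(4), match.group(5)
        pod = entry["fields"]["pod"]
        master = MASTER_RE.search(message)
        if master:
            summary[pod]["masters"].add(master.group(1))
        if level == "#":  # Redis marks warning-level lines with '#'
            summary[pod]["warnings"][message] += 1
    return dict(summary)


if __name__ == "__main__":
    # Three entries copied from the export in this issue (fields trimmed to "pod").
    sample = [
        {"line": "11:S 21 May 2024 11:08:18.146 * Connecting to MASTER 10.83.38.149:6379",
         "fields": {"pod": "redis-cluster-follower-0"}},
        {"line": "11:S 21 May 2024 11:08:17.619 # Error condition on socket for SYNC: Host is unreachable",
         "fields": {"pod": "redis-cluster-follower-0"}},
        {"line": "11:S 21 May 2024 11:08:14.725 # Timeout connecting to the MASTER...",
         "fields": {"pod": "redis-cluster-follower-1"}},
    ]
    for pod, info in sorted(summarize(sample).items()):
        print(pod, sorted(info["masters"]), dict(info["warnings"]))
```

Run over the full export, this lists for each follower the master address it keeps retrying and how often each warning ("Host is unreachable", "Timeout connecting to the MASTER...") occurs. That helps show whether all followers fail in the same way.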