OT-CONTAINER-KIT / redis-operator

A Golang-based Redis operator that creates and manages Redis standalone, cluster, replication, and sentinel mode setups on top of Kubernetes.
https://ot-redis-operator.netlify.app/
Apache License 2.0

Reshard/validate sharding of clusters on node reboot/every so often. #773

Open deefdragon opened 5 months ago

deefdragon commented 5 months ago

Is your feature request related to a problem? Please describe. I've had a few instances of having to completely rebuild my cluster after rebooting my servers (I run k8s on local hardware). I believe I have finally determined that this is due to sharding issues in the cluster after a reboot (CLUSTERDOWN Hash slot not served). It appears that clusters can be resharded when scaled down, but their sharding is not validated when they come back up after a reboot.

Describe the solution you'd like It would be nice if the operator could check and validate the sharding periodically and/or when a node reboots.
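To illustrate the kind of check I mean, here is a minimal sketch written against the go-redis v9 client. This is not operator code; the function name and address are made up for the example, and auth is omitted. It counts the slot ranges returned by CLUSTER SLOTS and flags the cluster when fewer than all 16384 slots are assigned, which is exactly the condition behind the CLUSTERDOWN Hash slot not served error:

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

const totalSlots = 16384 // Redis Cluster always has exactly 16384 hash slots

// slotsCovered reports whether every hash slot is currently assigned to a node.
// Simplification for the sketch: assumes the reported ranges do not overlap.
func slotsCovered(ctx context.Context, rdb *redis.ClusterClient) (bool, error) {
	slots, err := rdb.ClusterSlots(ctx).Result()
	if err != nil {
		return false, err
	}
	covered := 0
	for _, s := range slots {
		covered += s.End - s.Start + 1
	}
	return covered == totalSlots, nil
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClusterClient(&redis.ClusterOptions{
		// Illustrative address; a real check would resolve this from the CR.
		// Password/auth omitted for brevity.
		Addrs: []string{"redis-prod-cluster-leader:6379"},
	})
	defer rdb.Close()

	ok, err := slotsCovered(ctx, rdb)
	if err != nil {
		fmt.Println("cluster check failed:", err)
		return
	}
	fmt.Println("all slots served:", ok) // false would explain the CLUSTERDOWN
}

If a check like this failed after a reboot, the operator could trigger a reshard/fix instead of leaving the cluster in a broken state.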

Describe alternatives you've considered I don't currently have any long-term data stored in the cluster, so deleting the storage lets everything rebuild properly, but that's obviously less than ideal, as it requires extra downtime and manual intervention.

I would try to prevent the cluster from going down in the first place, but it's sometimes unavoidable during maintenance.

What version of redis-operator are you using?

redis-operator version: 0.15.1

Additional context Manifest of the Redis cluster (it's Terraform, but it maps essentially key-for-key to the YAML of the manifest):

resource "kubernetes_manifest" "redis_prod_cluster" {
  manifest = {
    apiVersion = "redis.redis.opstreelabs.in/v1beta2"
    kind       = "RedisCluster"
    metadata = {
      name      = "redis-prod-cluster"
      namespace = kubernetes_namespace.redis_namespace.metadata[0].name
    }
    spec = {
      kubernetesConfig = {
        image           = "quay.io/opstree/redis:v7.0.12"
        imagePullPolicy = "IfNotPresent"

        redisSecret = {
          name = kubernetes_secret.redis_prod_password.metadata[0].name
          key  = "password"
        }
        service = {
          serviceType = "NodePort"
        }

      }
      resources = {
        limits = {
          memory = "200Mi"
          cpu    = "100m"
        }
      }

      persistenceEnabled = false
      podSecurityContext = {
        fsGroup   = 0
        runAsUser = 0
      }
      storage = {
        volumeClaimTemplate = {
          spec = {
            accessModes = [
              "ReadWriteOnce",
            ]
            resources = {
              requests = {
                storage = "10Gi"
              }
            }
          }
        }
      }
      clusterSize = 3
      redisLeader = {
        replicas = 3
        securityContext = {
          # fsGroup    = 0
          runAsGroup = 0
          runAsUser  = 0
        }
      }
      redisFollower = {
        replicas = 6
        securityContext = {
          # fsGroup    = 0
          runAsGroup = 0
          runAsUser  = 0
        }
      }
      redisExporter = {
        enabled = true
        image   = "quay.io/opstree/redis-exporter:v1.44.0"
      }
    }
  }
}
drivebyer commented 4 months ago

Thank you for your feedback. You may want to try using the redis-cli --cluster fix command.
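For reference, running redis-cli --cluster check <host>:<port> against any reachable node first will report slot coverage problems (it prints an error when not all 16384 slots are covered), and redis-cli --cluster fix <host>:<port> will then attempt to reassign the missing slots. Both are standard redis-cli cluster subcommands; with a password-protected cluster like the one above you would also need to pass -a <password>.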