spotahome / redis-operator

Redis Operator creates/configures/manages high availability redis with sentinel automatic failover atop Kubernetes.
Apache License 2.0
1.5k stars 356 forks

Different RedisFailover's sentinels join together #550

Closed by tparsa 1 year ago

tparsa commented 1 year ago

Environment

How are the pieces configured?

dbackeus commented 1 month ago

I think a good solution for this would be to migrate from IP addresses to hostnames for both Sentinel and Redis instances, i.e. use replica-announce-ip <hostname> for Redis instances and sentinel announce-ip <hostname> for Sentinel instances. This would imply running the Sentinels in a StatefulSet as well, so that each one gets a stable hostname.

Hostname support is documented at: https://redis.io/docs/latest/operate/oss_and_stack/management/sentinel/#ip-addresses-and-dns-names
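
To make the idea concrete, here is a minimal sketch (not this operator's code) of the config lines a hostname-based setup might render per pod. It assumes Redis 6.2 or later, StatefulSet pods behind a headless Service, and the default `cluster.local` cluster domain. The names `rfr-demo`, `rfs-demo`, `redis-headless`, `sentinel-headless`, `mymaster` and the helper functions are all hypothetical. As far as I understand the docs linked above, Sentinel also needs `resolve-hostnames` and `announce-hostnames` enabled for hostnames to be used.

```python
def pod_fqdn(statefulset, ordinal, service, namespace, cluster_domain="cluster.local"):
    """Stable DNS name of a StatefulSet pod behind a headless Service."""
    return f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"


def redis_conf_lines(fqdn, port=6379):
    """redis.conf lines that make a replica announce its hostname, not its pod IP."""
    return [
        f"replica-announce-ip {fqdn}",
        f"replica-announce-port {port}",
    ]


def sentinel_conf_lines(fqdn, master_name, master_fqdn, quorum, redis_port=6379):
    """sentinel.conf lines for a hostname-based Sentinel (assumes Redis >= 6.2)."""
    return [
        # Hostname handling is enabled before any directive that uses a hostname.
        "sentinel resolve-hostnames yes",
        "sentinel announce-hostnames yes",
        f"sentinel announce-ip {fqdn}",
        f"sentinel monitor {master_name} {master_fqdn} {redis_port} {quorum}",
    ]


if __name__ == "__main__":
    # Hypothetical names for one RedisFailover in namespace "prod".
    master = pod_fqdn("rfr-demo", 0, "redis-headless", "prod")
    sentinel = pod_fqdn("rfs-demo", 0, "sentinel-headless", "prod")
    print("\n".join(redis_conf_lines(master)))
    print("\n".join(sentinel_conf_lines(sentinel, "mymaster", master, quorum=2)))
```

Because every name embeds the StatefulSet, Service and namespace, an IP address that gets reused by a pod from another cluster no longer matches anything a Sentinel is monitoring.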

The issues that arise in environments where IP addresses get reassigned are partly documented at https://redis.io/docs/latest/operate/oss_and_stack/management/sentinel/#sentinel-docker-nat-and-possible-issues , but that section focuses on Docker environments rather than Kubernetes.

We are currently working on a DIY, hostname-based Helm template approach to replace this operator, after suffering critical data loss in our production environment because Sentinels got mixed up and told random Redis instances from different clusters to start following each other.

githubixx commented 1 month ago

I can recommend the Bitnami Redis Helm chart: https://artifacthub.io/packages/helm/bitnami/redis . It also supports Redis Sentinel. It works very well for us, with 20+ Redis Sentinel deployments in a Kubernetes cluster. The chart handles various issues you can face with Redis Sentinel very well. They have quite a lot of Helm charts for other services too: https://github.com/bitnami/charts/tree/main/bitnami

sapisuper commented 1 month ago

@githubixx Correct, it's better to use Bitnami Redis. I ran into an issue with OT-REDIS-OPERATOR regarding failover: the failover doesn't run smoothly.