spotahome / redis-operator

Redis Operator creates/configures/manages high availability redis with sentinel automatic failover atop Kubernetes.
Apache License 2.0
1.48k stars 355 forks source link

Add custom sync interval to operator #661

Closed adrianmarcu18 closed 7 months ago

adrianmarcu18 commented 9 months ago

Expected behaviour

Service "master" label propagation should happen immediately after a Redis failover. This doesn't happen because, even if we set the Sentinel configuration, the operator will only sync every 30 seconds, meaning you can wait up until at least 1 minute for the master pod to get labeled as master.

Is there any way we can change the default operator sync interval?

Actual behaviour

Master label propagation happens after the operator sync, which has a fixed value of 30 seconds with no ability to change it.

Steps to reproduce the behaviour

Create a redisfailover, set a low Sentinel failover time and then delete the master pod. Compare the time it takes for replica pod to become master with the time the operator adds the label selector to the service.

Environment

How are the pieces configured?

Logs

Please, add the debugging logs. In order to be able to gather them, add -debug flag when running the operator.

ebuildy commented 9 months ago

+1 for the problem but I am not OK for this solution.

We should use sentinel notification to tell operator to change labels as soon as sentinel detects a master changement.

sentinel.conf:

sentinel notification-script $REDIS_MASTER_NAME /var/redis/notify.sh
adrianmarcu18 commented 9 months ago

That is a much better solution indeed. Hopefully it something that will not take too much to implement. As a workaround I was thinking to use Haproxy to identify the master faster than the label change, but doesn't work well with the current services and I am not able to create an extra headless service (the operator will delete it immediately).

ebuildy commented 9 months ago

You can add any k8s resources, maybe change the name to avoid conflicts with the operator.

Yes haproxy is good, there is health check probe for redis.

Le mar. 12 sept. 2023, 01:43, adrianmarcu18 @.***> a écrit :

That is a much better solution indeed. Hopefully it something that will not take too much to implement. As a workaround I was thinking to use Haproxy to identify the master faster than the label change, but doesn't work well with the current services and I am not able to create an extra headless service (the operator will delete it immediately).

— Reply to this email directly, view it on GitHub https://github.com/spotahome/redis-operator/issues/661#issuecomment-1715007987, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJJZ2I6HDHRFK5DWSKVFC3XZ7ZBZANCNFSM6AAAAAA4SYFZQI . You are receiving this because you commented.Message ID: @.***>

adrianmarcu18 commented 9 months ago

Not really true. For the headless service in order for the subdomain to propagate, the headless service has to have the same name as the serviceName in the statefulset. If you create such a service, the operator will delete it. I don’t get why that is needed, maybe for previous implementations.

ebuildy commented 9 months ago

Ho this is the 1st time I heard that, could you give me a doc URL please about headless service?

adrianmarcu18 commented 9 months ago

Here is the part of interest:

A StatefulSet can use a Headless Service to control the domain of its Pods. The domain managed by this Service takes the form: $(service name).$(namespace).svc.cluster.local, where "cluster.local" is the cluster domain. As each Pod is created, it gets a matching DNS subdomain, taking the form: $(podname).$(governing service domain), where the governing service is defined by the serviceName field on the StatefulSet.

https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

As you can see, the subdomain published uses the serviceName defined in the statefulset. If you create a headless service with the same name, you will be able to resolve the pod ip using the subdomain. When the name of the headless service is different, it cannot resolve the pod ip.

ebuildy commented 9 months ago

ho this is different, here you are talking about network identity of the Pods

adrianmarcu18 commented 9 months ago

You can try it out. In order to setup haproxy correctly you want to target each pod individually. So you will want to have rfr-redis-0.rfr-redis.svc.cluster.local and rfr-redis-1.rfr-redis.svc.cluster.local as servers in haproxy config.

To be able to use such fqdn for pods, you need a headless service created with the name rfr-redis in the same namespace. This needs to also be the serviceName in the statefulset.

if you create a headless service, let’s say: rfr-redis-headless, and the serviceName in statefulset rfr-redis, if you do a ping to rfr-redis-0.rfr-redis-headless.svc.cluster.local it will not be resolved.

Check also this issue:

https://github.com/kubernetes/kubernetes/issues/74950

Has more explanation

adrianmarcu18 commented 9 months ago

@ebuildy Do you think I should open a separate issue for the headless service behaviour? Or are there any plans to have the notification implemented in the near future?

ebuildy commented 9 months ago

yes, much better if you can do the PR as well :-)

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 45 days with no activity.

github-actions[bot] commented 7 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.