Closed adrianmarcu18 closed 7 months ago
+1 for the problem, but I am not OK with this solution.
We should use a Sentinel notification script to tell the operator to change the labels as soon as Sentinel detects a master change.
sentinel.conf:
sentinel notification-script $REDIS_MASTER_NAME /var/redis/notify.sh
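A hedged sketch of what such a notify.sh could do: react to Sentinel's +switch-master event (whose description is "master-name old-ip old-port new-ip new-port") and move a role label onto the pod that now holds the new master IP. The label key `redisfailovers-role` and the kubectl usage are assumptions here, not the operator's actual API; also note Redis additionally offers `sentinel client-reconfig-script`, which is invoked specifically on failover and may be the more reliable hook.

```shell
#!/bin/sh
# Hypothetical notify.sh for "sentinel notification-script": re-label pods as
# soon as Sentinel reports a master switch, instead of waiting for the
# operator's periodic sync. Label key and kubectl flags are assumptions;
# adapt them to whatever your Service selector actually uses.
# KUBECTL is overridable so the logic can be exercised without a cluster.
KUBECTL="${KUBECTL:-kubectl}"

on_sentinel_event() {
    event_type="$1"
    event_desc="$2"
    case "$event_type" in
    +switch-master)
        # Description format: <master-name> <old-ip> <old-port> <new-ip> <new-port>
        set -- $event_desc
        new_ip="$4"
        # Find the pod that now holds the master role by its pod IP,
        # then move the (assumed) role label onto it.
        pod=$("$KUBECTL" get pods -o jsonpath='{.items[0].metadata.name}' \
              --field-selector "status.podIP=$new_ip")
        "$KUBECTL" label pod "$pod" redisfailovers-role=master --overwrite
        ;;
    esac
}

on_sentinel_event "$@"
```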
That is a much better solution indeed. Hopefully it is something that will not take too long to implement. As a workaround I was thinking of using HAProxy to identify the master faster than the label change, but it doesn't work well with the current services, and I am not able to create an extra headless service (the operator deletes it immediately).
You can add any k8s resources, maybe change the name to avoid conflicts with the operator.
Yes, HAProxy is good; there is a health check probe for Redis.
Not really true. For the headless service's subdomain records to propagate, the headless service has to have the same name as the serviceName in the StatefulSet. If you create such a service, the operator will delete it. I don't get why that deletion is needed; maybe it's left over from a previous implementation.
Oh, this is the first time I have heard that. Could you give me a doc URL about headless services, please?
Here is the part of interest:
A StatefulSet can use a Headless Service to control the domain of its Pods. The domain managed by this Service takes the form: $(service name).$(namespace).svc.cluster.local, where "cluster.local" is the cluster domain. As each Pod is created, it gets a matching DNS subdomain, taking the form: $(podname).$(governing service domain), where the governing service is defined by the serviceName field on the StatefulSet.
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
As you can see, the subdomain published uses the serviceName defined in the statefulset. If you create a headless service with the same name, you will be able to resolve the pod ip using the subdomain. When the name of the headless service is different, it cannot resolve the pod ip.
Oh, this is different; here you are talking about the network identity of the Pods.
You can try it out. In order to setup haproxy correctly you want to target each pod individually. So you will want to have rfr-redis-0.rfr-redis.svc.cluster.local and rfr-redis-1.rfr-redis.svc.cluster.local as servers in haproxy config.
To be able to use such fqdn for pods, you need a headless service created with the name rfr-redis in the same namespace. This needs to also be the serviceName in the statefulset.
If you create a headless service named, let's say, rfr-redis-headless, while the serviceName in the StatefulSet is rfr-redis, then a ping to rfr-redis-0.rfr-redis-headless.svc.cluster.local will not resolve.
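For reference, a headless Service of the shape being described would look roughly like this (the selector labels are assumptions based on the thread, not the operator's actual labels; and per the discussion above, the operator currently deletes such a service if you create it yourself):

```yaml
# Hypothetical headless Service matching the StatefulSet's serviceName.
# Its name must equal spec.serviceName of the StatefulSet for the per-pod
# DNS records (rfr-redis-0.rfr-redis.<ns>.svc.cluster.local) to be published.
apiVersion: v1
kind: Service
metadata:
  name: rfr-redis            # must match the StatefulSet's serviceName
spec:
  clusterIP: None            # headless: publish per-pod A records
  selector:
    app.kubernetes.io/component: redis   # assumed label; match your pods
  ports:
    - name: redis
      port: 6379
```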
Check also this issue, which has more explanation:
https://github.com/kubernetes/kubernetes/issues/74950
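Putting the workaround together, a hedged sketch of the HAProxy configuration discussed above might look like the following. The pod FQDNs assume a headless service named rfr-redis in the same namespace (as described earlier), and the kube-dns address, timeouts, and server count are assumptions to adapt:

```text
# haproxy.cfg sketch: route clients to whichever Redis pod reports itself
# as master, using an "info replication" health probe.
defaults
    mode tcp
    timeout connect 3s
    timeout client  30s
    timeout server  30s

# Re-resolve pod DNS at runtime so pod restarts are picked up;
# the nameserver address below is an assumption.
resolvers kubedns
    nameserver dns1 kube-dns.kube-system.svc.cluster.local:53
    hold valid 10s

frontend redis
    bind *:6379
    default_backend redis_master

backend redis_master
    option tcp-check
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    server redis-0 rfr-redis-0.rfr-redis.svc.cluster.local:6379 check inter 1s resolvers kubedns init-addr none
    server redis-1 rfr-redis-1.rfr-redis.svc.cluster.local:6379 check inter 1s resolvers kubedns init-addr none
```

Only the pod currently answering role:master passes the check, so clients connecting through HAProxy reach the new master as soon as the probe flips, independently of the operator's label sync.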
@ebuildy Do you think I should open a separate issue for the headless service behaviour? Or are there any plans to have the notification implemented in the near future?
yes, much better if you can do the PR as well :-)
This issue is stale because it has been open for 45 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Expected behaviour
Propagation of the service's "master" label should happen immediately after a Redis failover. It doesn't, because even if we set the Sentinel configuration accordingly, the operator only syncs every 30 seconds, meaning you can wait a minute or more for the master pod to get labeled as master.
Is there any way we can change the default operator sync interval?
Actual behaviour
Master label propagation only happens on the operator sync, which has a fixed interval of 30 seconds with no ability to change it.
Steps to reproduce the behaviour
Create a redisfailover, set a low Sentinel failover timeout, and then delete the master pod. Compare the time it takes for a replica pod to become master with the time it takes the operator to apply the master label so the service selects the new master.
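A concrete reproduction sketch (pod and label names are assumptions; adjust them to your redisfailover's actual resources):

```shell
# Delete the pod currently holding the master role (assumed rfr-redis-0 here)...
kubectl delete pod rfr-redis-0
# ...then watch how long the (assumed) role label takes to move to the new master,
# compared with how quickly Sentinel promotes a replica.
kubectl get pods -L redisfailovers-role -w
```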
Environment
How are the pieces configured?
Logs
Please add the debugging logs. In order to gather them, add the
-debug
flag when running the operator.