Open Kilowhisky opened 1 year ago
This workaround works, but it's more than a bit hacky...
My failover period is 20 seconds, so a 30-second delay should be enough time to elect a new leader.
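    // 0 = idle, 1 = a reconfigure is already scheduled (needs System.Threading).
    var reconfiguring = 0;
    _multiplexer.ConnectionFailed += (s, e) =>
    {
        _log.LogError($"Connection failed: {e.ConnectionType}, {e.FailureType}, {e.EndPoint}, {e.Exception}");
        _ = Task.Run(async () =>
        {
            // Claim the flag atomically and outside the try block, so an early
            // return cannot reset it while another reconfigure is in flight.
            if (Interlocked.CompareExchange(ref reconfiguring, 1, 0) != 0) return;
            try
            {
                // Wait out the Sentinel failover before asking for a reconfigure.
                await Task.Delay(TimeSpan.FromSeconds(30));
                await _multiplexer.ReconfigureAsync("connection_lost");
                await _multiplexer.PublishReconfigureAsync();
                await Task.Delay(TimeSpan.FromSeconds(5));
                _log.LogInformation($"Reconfigured status: {_multiplexer.GetStatus()}");
            }
            finally { Interlocked.Exchange(ref reconfiguring, 0); }
        });
    };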
The intended usage for Sentinel today is to connect to the sentinel endpoints with a service name, and the ConnectionMultiplexer you get back will be connected to the current primary. When the primary switches, it'll follow (there are 2 multiplexers under the covers, with 1 listening to Sentinel).
It looks like you may have tried this, given the commented-out line in the code there that looks to do this. What was or wasn't happening with that setup and nothing else?
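For reference, the service-name setup described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's code: the sentinel host names and port are assumptions, `mymaster` is taken from the failover command quoted in this issue, and it assumes a StackExchange.Redis 2.x release in which `Connect` resolves the primary through Sentinel when `ServiceName` is set.

```csharp
using System;
using StackExchange.Redis;

// Point the client at the Sentinel endpoints, not at the Redis nodes.
// Host names and port below are placeholders for illustration only.
var options = new ConfigurationOptions
{
    EndPoints = { "sentinel-1:26379", "sentinel-2:26379", "sentinel-3:26379" },
    ServiceName = "mymaster",      // the name the sentinels monitor
    AbortOnConnectFail = false,    // keep retrying if nothing is reachable yet
};

// With ServiceName set, the returned multiplexer should be connected to the
// current primary and is expected to follow it after a failover.
using var primary = ConnectionMultiplexer.Connect(options);

var db = primary.GetDatabase();
db.StringSet("ha-check", DateTime.UtcNow.ToString("O"));
Console.WriteLine($"Wrote via: {primary.GetStatus()}");
```

If this setup alone does not follow the failover, the output of `GetStatus()` before and after shutting down the primary is probably the most useful thing to attach to the issue.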
I'm testing Redis, StackExchange.Redis, and Redis-Sentinel in order to set up an appropriate HA environment.
I've noticed that when I stand up a Sentinel cluster as described in these docs: https://hub.docker.com/r/bitnami/redis-sentinel/ and I shut down the master after a multiplexer connection has been successfully established by a client using `StackExchange.Redis`, the multiplexer fails to realize that the failover has occurred and continues to queue messages for the now non-existent leader.

Funny thing is, if I manually trigger a failover using the sentinel-cli with `SENTINEL FAILOVER mymaster`, the multiplexer realizes the change and appropriately updates the leader.

I looked into the code and noticed that the `+switch-master` subscription hook is never fired when the primary Redis is shut down, even if the associated sentinel is still up and running. In the situation where I manually trigger the failover, the hook properly fires.
https://github.com/StackExchange/StackExchange.Redis/blob/f6171a19a0be078c6528b4631d42dfa4adcc8564/src/StackExchange.Redis/ConnectionMultiplexer.Sentinel.cs#L33-L62
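One way to check the hook behaviour independently of the library's internal subscription is to subscribe to the channel directly on a sentinel connection and compare what arrives in the two scenarios. The following is a rough diagnostic sketch only: the `localhost:26379` endpoint is an assumption, and it is not the code linked above.

```csharp
using System;
using StackExchange.Redis;

// Assumed sentinel address; replace with one of your sentinel endpoints.
var sentinelOptions = new ConfigurationOptions
{
    EndPoints = { "localhost:26379" },
    AbortOnConnectFail = false,
};

// SentinelConnect opens a connection to the sentinel itself (not the primary).
using var sentinel = ConnectionMultiplexer.SentinelConnect(sentinelOptions);

var channel = new RedisChannel("+switch-master", RedisChannel.PatternMode.Literal);
sentinel.GetSubscriber().Subscribe(channel, (ch, message) =>
{
    // Sentinel publishes: <master-name> <old-ip> <old-port> <new-ip> <new-port>
    Console.WriteLine($"{DateTime.UtcNow:O} {ch}: {message}");
});

Console.WriteLine("Listening for +switch-master; press Enter to exit.");
Console.ReadLine();
```

Running this while shutting down the primary, and again while issuing `SENTINEL FAILOVER mymaster`, should show whether the message reaches a StackExchange.Redis subscriber in both cases or only in the manual one.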
I've confirmed that the sentinel is still issuing the `+switch-master` message in both scenarios.

I've tried multiple ways to set up the multiplexer; here is my last iteration.