Hi James,
Sorry about the delay in replying.
I agree with your reasoning around clustering and replication.
As you've noted, there is the possibility in a replicated set (either a cluster with replicated shards, or just a simple master/slave replica) for a master node to become unavailable and have a slave promoted to the master role before a write has been replicated to that slave (essentially losing the write). The "Redis Cluster consistency guarantees" section of https://redis.io/topics/cluster-tutorial talks about this scenario.
In my mind, you would get the most resiliency for RedLock by running multiple (a minimum of 3) independent nodes/clusters, each configured as a single RedLockEndPoint. In this scenario you can stand to lose a node/cluster (or lose the key in a node/cluster) without losing the lock you are holding, as long as the key is still present in more than half of the configured nodes/clusters.
As for how many nodes/clusters, that probably depends on your workload, tolerance for failure, and budget. If you have 3 nodes/clusters you can stand to lose 1 before you don't have enough for a quorum, with 5 nodes you can lose 2, and so on.
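To make the quorum arithmetic concrete, here is a minimal illustration (this is just the majority calculation, not actual RedLock.net code):

```csharp
// Illustrative only: a RedLock-style quorum over N independent
// endpoints is a strict majority of N.
static class QuorumMath
{
    public static int Quorum(int endpointCount) => (endpointCount / 2) + 1;

    // Quorum(3) == 2 -> tolerates the loss of 1 node/cluster
    // Quorum(5) == 3 -> tolerates the loss of 2 nodes/clusters
}
```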
If the redis instances are only used for RedLock (i.e. you don't have any other keys in there that you can't stand to lose) I think you'd get better lock resiliency for your money using, say, 3 independent non-clustered instances than 1 clustered instance (so long as they can handle your workload).
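For reference, a minimal sketch of that kind of setup with RedLock.net, following the factory/lock pattern from the project README (the host names and expiry here are placeholders):

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;
using RedLockNet.SERedis;
using RedLockNet.SERedis.Configuration;

class RedLockExample
{
    static async Task Main()
    {
        // Three independent, non-clustered redis instances,
        // each represented by its own RedLockEndPoint.
        var endPoints = new List<RedLockEndPoint>
        {
            new DnsEndPoint("redis1.example.com", 6379),
            new DnsEndPoint("redis2.example.com", 6379),
            new DnsEndPoint("redis3.example.com", 6379)
        };

        using (var redlockFactory = RedLockFactory.Create(endPoints))
        {
            var expiry = TimeSpan.FromSeconds(30);

            // The lock only counts as acquired if it was taken on a
            // quorum (here, 2 of 3) of the configured endpoints.
            using (var redLock = await redlockFactory.CreateLockAsync("my-resource", expiry))
            {
                if (redLock.IsAcquired)
                {
                    // safe to do work while the lock is held
                }
            }
        }
    }
}
```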
Sam
Sam,
I was hoping to pick your brain a little. I have been experimenting with your RedLock implementation in a distributed microservice architecture hosted in Azure and using Azure Redis. Up until now I have been using a single RedLockEndPoint, pointing at an Azure Redis P1 w/2 shards.
The load balancer for Azure Redis ensures that the master is always write accessible as you pointed out in your response to this SO question: https://stackoverflow.com/questions/34681168/what-is-the-recommended-way-of-creating-a-distributed-lock-with-redis-on-azure.
What is not clear from your writings on the topic, or those of others who have invested time in implementing the Redlock pattern, is this: in a case like Azure Redis, regardless of cluster shard count (i.e. Azure Redis P1 w/2 shards), do you still need to run a minimum of three clusters independent of each other (i.e. 3 distinct Azure Redis clusters w/2 shards each, each represented by a RedLockEndPoint)?
My instinct is that yes, this is the case, simply because if a write to the master succeeds (load balanced or not) but the master goes offline prior to replication, that lock would be "lost", since the slave the LB re-targets will be unaware of it. That assumes, of course, that the LB is not playing a role in distributing writes to the shards and that the slave shards are being seeded by the master; anything else would be less of a master/slave setup and more of a masterless config with distributed seeding, and I do not believe that is what is happening in Azure Redis.
All of my testing has shown great promise and I want to bring this into production testing, but before doing so I want to ensure that the supporting environment is sufficient (the number of Azure Redis instances, each with a sufficient number of shards). Also, if there is a difference between the minimum number of each and a recommended/best-practice number, it would be interesting to understand that as well.
Kind Regards,
James Legan