twitter / twemproxy

A fast, light-weight proxy for memcached and redis
Apache License 2.0

best practice for slot distribution in redis backends #405

Open haukebruno opened 9 years ago

haukebruno commented 9 years ago

Hi there,

I want to use some nutcrackers in front of a small set of Redis (cluster) backends. Currently this will be 5 backends, in total 5 masters with 4 slaves each.

Can someone explain to me which is the better option: to use 5 independent clusters, each containing 1 master with all 16384 hash slots mapped to it, or to use one big cluster containing all masters with the hash slots distributed across them?

As I want to use sharding on the nutcracker side, I wonder what the best practice is in that case.

cheers, hauke

therealbill commented 9 years ago

On Aug 21, 2015, at 21:44, Hauke Bruno notifications@github.com wrote:

Hi there,

I want to use some nutcrackers in front of a small set of Redis (cluster) backends. Currently this will be 5 backends, in total 5 masters with 4 slaves each.

First things first: now that Redis Cluster is available, try not to call a master-slave setup a cluster. It merely confuses things.

So you are planning to run 5 masters, each master having 4 slaves, so 5 masters and 20 slaves in all; is that correct?

Can someone explain to me which is the better option: to use 5 independent clusters, each containing 1 master with all 16384 hash slots mapped to it, or to use one big cluster containing all masters with the hash slots distributed across them?

Now you seem to be talking about Redis Cluster, which you won't want to have behind Twemproxy at all. Perhaps you can explain it more clearly, being sure to only use the term cluster to mean Redis Cluster. Feel free to use Pod for a master+slaves configuration.

As I want to use sharding on the nutcracker side, I wonder what the best practice is in that case.

It depends on what your needs are, why you think you will need so few masters to so many slaves, and what workload and goals you are after.

Cheers, Bill

haukebruno commented 9 years ago

Hi Bill,

sorry for the confusion.

So you are planning to run 5 masters, each master having 4 slaves, so 5 masters and 20 slaves in all; is that correct?

Yes. I have 5 virtual machines and want to run 5 redis instances on each (1 as a master and 4 as slaves for the other masters).

Now you seem to be talking about Redis Cluster, which you won't want to have behind Twemproxy at all. Perhaps you can explain it more clearly, being sure to only use the term cluster to mean Redis Cluster. Feel free to use Pod for a master+slaves configuration.

It depends on what your needs are, why you think you will need so few masters to so many slaves, and what workload and goals you are after.

What I need (or want) is a highly available redis backend for one frontend application. The application itself can't talk to native Redis Cluster right now, so my idea is to let the application talk to twemproxy and let twemproxy talk to the 5 redis masters (and let the 5 redis masters replicate their data to 4 slaves each for availability). Sentinel and Smitty (or an equivalent agent) should handle failover.
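A minimal nutcracker pool for that idea might be sketched like this (the pool name, addresses, and hashing choices below are illustrative, not from the thread):

```yaml
# nutcracker.yml sketch - one pool sharding across the 5 masters.
# All names and IPs here are made up for illustration.
redis_pool:
  listen: 0.0.0.0:6390
  redis: true
  hash: fnv1a_64
  distribution: ketama
  auto_eject_hosts: false   # don't silently drop a failed shard from the ring
  servers:
   - 10.0.0.1:6379:1 master1
   - 10.0.0.2:6379:1 master2
   - 10.0.0.3:6379:1 master3
   - 10.0.0.4:6379:1 master4
   - 10.0.0.5:6379:1 master5
```

Note that twemproxy only ever talks to the addresses listed here; pointing clients at it does nothing by itself about failover, which is why an agent has to rewrite this file and restart/reload nutcracker when a master changes.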

The workload is a bit unknown to me, so it could be possible to add more redis masters later on. The high number of slaves is just a kind of paranoia; I do not want things to stop working.

thanks, hauke

therealbill commented 9 years ago

On Aug 22, 2015, at 01:54, Hauke Bruno notifications@github.com wrote:

Hi Bill,

sorry for the confusion.

No worries, it happens. Especially with the word "cluster". ;)

So you are planning to run 5 masters, each master having 4 slaves, so 5 masters and 20 slaves in all; is that correct?

Yes. I have 5 virtual machines and want to run 5 redis instances on each (1 as a master and 4 as slaves for the other masters).

Begging some leeway, I'd like to drill this down further. So we have 5 VMs. Each VM will run one master, and 1 slave for each other master in the system. Thus each master only has one slave. Is that accurate?

Now you seem to be talking about Redis Cluster, which you won't want to have behind Twemproxy at all. Perhaps you can explain it more clearly, being sure to only use the term cluster to mean Redis Cluster. Feel free to use Pod for a master+slaves configuration. It depends on what your needs are, why you think you will need so few masters to so many slaves, and what workload and goals you are after.

What I need (or want) is a highly available redis backend for one frontend application. The application itself can't talk to native Redis Cluster right now, so my idea is to let the application talk to twemproxy and let twemproxy talk to the 5 redis masters (and let the 5 redis masters replicate their data to 4 slaves each for availability). Sentinel and Smitty (or an equivalent agent) should handle failover.

The workload is a bit unknown to me, so it could be possible to add more redis masters later on. The high number of slaves is just a kind of paranoia; I do not want things to stop working.

So here is what I would recommend given what I understand of the above:

Keep clustering in mind when laying out your data structures. By that I mean try to use commands that work in cluster modes. For example, avoid MGET and multi-key operations such as sorted set operations.
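To see why multi-key commands break under sharding, here is a small sketch of how Redis Cluster maps keys to slots (CRC16-XMODEM mod 16384). Twemproxy's hashing functions differ, but the lesson is the same: unrelated keys land on different shards, so one MGET would have to span several backends.

```python
# Sketch: Redis Cluster's key-to-slot mapping, to show that ordinary keys
# scatter across shards while {hash-tagged} keys stay together.

def crc16(data: bytes) -> int:
    """CRC-16/XMODEM, the checksum Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of 16384 slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:          # non-empty tag: hash only the tag
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

print(key_slot("foo"))   # 12182 - different slots, so MGET foo bar
print(key_slot("bar"))   # 5061    would span two shards
# Keys sharing a {tag} map to the same slot, so multi-key ops on them work:
print(key_slot("{user1}.a") == key_slot("{user1}.b"))   # True
```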

Next, I would apply the "you ain't gonna need it" (YAGNI) principle and go with a straightforward single-master, one-or-two-slave setup using sentinel to manage failover. Redis can handle very high rates of traffic, into the millions of operations per second. What you know of your use case doesn't sound like the ones Redis Cluster is designed for.

For HA I almost always recommend Pods over Clustering. Nearly all major libraries support sentinel - even if you have to execute the commands to find the master yourself. Failovers in Redis are quite rare in my experience (I run thousands of these), so unless your app will be differentiating reads and writes between slaves and the master, more than two slaves doesn't really get you much.

Indeed you could even put a TCP load balancer in front of the pod to entirely isolate Sentinel usage from the clients, but that may be unneeded complexity. If running Sentinel you may want to check out https://github.com/sentinel-tools/ for some convenience tools.

Cluster brings a lot of complexity and operational overhead - and you can't put a proxy in front of it and have it work out of the box. Plus it is designed for having multiple write masters, or for exceeding the limits of physical memory by effectively pooling memory from multiple systems.

Twemproxy primarily handles doing some of what cluster does (you'd still need to manage Redis failovers and their knock-on effects), and can provide a decent way to scale concurrent connections. If your client connection count is not large this benefit may not be applicable.

Indeed, if my understanding of your current plan is accurate, it is actually more fragile than you'd think. I'll explain more once you confirm my understanding.

Redis has some nice HA tools, but fortunately, due to its speed and robustness, they are rarely needed. I'd keep it simple until you've got some decent data around workload and specific use. Otherwise you could see longer delays and more sleepless nights than if you slowly add complexity when it solves a specific problem.

Cheers, Bill

haukebruno commented 9 years ago

So we have 5 VMs. Each VM will run one master, and 1 slave for each other master in the system. Thus each master only has one slave. Is that accurate?

From my current point of view, the layout should be the following:

VM1: M1, S2, S3, S4, S5
VM2: M2, S1, S3, S4, S5
VM3: M3, S1, S2, S4, S5
[...]

I fully agree with all your points (especially the KISS principle), but my current situation is the following: I have a customer who developed an application using redis, but the client libraries used in that app can't talk to Redis Cluster. My job is now to evaluate and implement some kind of architecture that provides one single, highly available (and scalable for the future) endpoint to the client. I also do not have the opportunity to modify the application servers (e.g. putting sentinel clients on them).

So I definitely need some kind of virtual IP, provided by whatever means, to enable communication from the client to the (more or less dynamic) redis instances. Of course I could put some TCP balancers in front, but then I would need to implement the Redis failover logic in them; therefore the current idea is to use twemproxy to serve that endpoint.

I know that this will add a lot more complexity to the whole system compared to just using Redis Cluster, and I look forward to changing the client side for native Cluster support, but for now I need a good solution to serve 1 (or 5) Redis Clusters to a non-redis-cluster client.

cheers, hauke

therealbill commented 9 years ago

On Aug 22, 2015, at 21:19, Hauke Bruno notifications@github.com wrote:

So we have 5 VMs. Each VM will run one master, and 1 slave for each other master in the system. Thus each master only has one slave. Is that accurate?

From my current point of view, the layout should be the following:

VM1: M1, S2, S3, S4, S5
VM2: M2, S1, S3, S4, S5
VM3: M3, S1, S2, S4, S5
[...]

Ok, that's what I thought. Thanks for the clarification. Particularly given these are virtual machines, I'd recommend against that layout. If you actually have five pods' worth of write transactions, your network will choke between the master traffic and the replication traffic.

Consider this. Say you have 10Mb/second of write traffic to each master. That would mean you have 50Mb inbound to each VM and a minimum of 40Mb outbound - not counting read traffic from the master.
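That back-of-envelope arithmetic can be written out as follows, using the assumed 10Mb/s of writes per master from the example above:

```python
# Back-of-envelope bandwidth per VM for the proposed layout:
# each VM hosts 1 master plus one slave of each of the other 4 masters.
write_mb_s = 10       # assumed write traffic into each master, Mb/s
local_masters = 1
local_slaves = 4

# Inbound: client writes to the local master, plus four replication
# streams feeding the local slaves from masters on the other VMs.
inbound = local_masters * write_mb_s + local_slaves * write_mb_s

# Outbound: the local master replicating its writes to its four slaves,
# which live on the other VMs (read traffic not counted).
outbound = 4 * write_mb_s

print(f"{inbound} Mb/s in, {outbound} Mb/s out")   # 50 Mb/s in, 40 Mb/s out
```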

Now that may sound like very little. But if you have enough write traffic to actually need multiple masters, you are very unlikely to have the bandwidth on your NICs to smoothly handle the rest of the traffic. Redis is faster than your network.

To give you an idea, a single Redis master can send enough data to swamp a 1Gb NIC without breaking a sweat, or even breathing hard.

I fully agree with all your points (especially the KISS principle), but my current situation is the following: I have a customer who developed an application using redis, but the client libraries used in that app can't talk to Redis Cluster. My job is now to evaluate and implement some kind of architecture that provides one single, highly available (and scalable for the future) endpoint to the client. I also do not have the opportunity to modify the application servers (e.g. putting sentinel clients on them). So I definitely need some kind of virtual IP, provided by whatever means, to enable communication from the client to the (more or less dynamic) redis instances. Of course I could put some TCP balancers in front, but then I would need to implement the Redis failover logic in them; therefore the current idea is to use twemproxy to serve that endpoint.

You don't have to implement failover logic in the load balancer. You can have sentinel do that by running it on the balancers and having it reconfigure them when a failover occurs. It is a pretty standard Redis architecture, and you can find many resources on how to do it.
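The sentinel side of that can be sketched roughly like this (the master name, addresses, and script path are illustrative, not from the thread):

```conf
# sentinel.conf sketch - monitor one master; 2 sentinels must agree it is down
sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
# after a failover promotes a new master, sentinel runs this script, which
# would rewrite the load balancer's backend address and reload it
sentinel client-reconfig-script mymaster /usr/local/bin/update-balancer.sh
```

The point is that the balancer itself stays dumb; sentinel detects the failure, promotes a slave, and the reconfig script repoints the balancer.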

That said, if you have no access to the client, you've got another problem lying in wait if you go with either Twemproxy or Redis Cluster. Neither of them supports the full Redis command set. As a result, the moment your client decides to use a standard Redis command which doesn't work on whichever you go with, your service - more accurately, their application - will break.

I know that this will add a lot more complexity to the whole system instead of just using Redis Cluster and I look forward to change the client side for native Cluster support, but for now I need a good solution to serve 1 (or 5) Redis Cluster to a non-redis-cluster client.

You say that as if you are actually planning to run Redis Cluster behind Twemproxy. That is a non-starter, as Twemproxy doesn't support it either. If that is your intent, I'll submit you don't understand Redis Cluster or Twemproxy well enough to be designing a system combining the two.

Ultimately the decision is, of course, yours. However, as someone with a lot of experience with Redis across thousands of systems of a large variety of types and sizes, I would neither implement nor recommend what you are planning.

If you want the official Redis Cluster to be transparently proxied to clients which don't understand its protocol, you will need to write said proxy yourself, as it does not currently exist.

If you simply need transparent HA you will need to set up a TCP load balancer which is updated based on failovers.
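On the balancer side, such a setup could look like the HAProxy fragment below (names and the address are illustrative); the `server` line is the part a failover script would rewrite:

```conf
# haproxy.cfg sketch - a plain TCP front end giving clients one stable endpoint
frontend redis_in
    bind *:6379
    mode tcp
    default_backend redis_master

backend redis_master
    mode tcp
    # rewritten (and haproxy reloaded) by sentinel's reconfig script on failover
    server current-master 10.0.0.1:6379 check
```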

Cheers and good luck, Bill

haukebruno commented 9 years ago

Thanks for the explanations. I think it would be best to just start with one Redis Cluster behind a HAProxy/Sentinel box for providing HA.

Using my planned architecture seems a lot more complicated than I thought, so I will follow your recommendations and keep things simple.

Thanks a lot for the advice,

cheers, hauke

therealbill commented 9 years ago

On Aug 23, 2015, at 20:13, Hauke Bruno notifications@github.com wrote:

Thanks for the explanations. I think it would be best to just start with one Redis Cluster behind a HAProxy/Sentinel box for providing HA.

You're welcome, and you're headed in the right direction. But you can't place Redis Cluster behind a proxy and expect it to work unless you teach said proxy to act as a cluster client and handle all redirections and topology changes. Any existing proxy will simply proxy the responses from the Cluster. Thus, when you ask it for a key and it proxies to a node which doesn't have it, HAProxy will proxy the "MOVED" reply to the client, which, as you've already indicated, won't understand it.
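To illustrate, here is a small Python sketch (with a made-up node address) of the redirect a cluster node returns, which a cluster-aware proxy would have to parse and act on, and which a dumb TCP proxy simply forwards to the client:

```python
# A Redis Cluster node answers with a RESP error like this when the key's
# slot lives on another node; the slot and address below are made up.
reply = "-MOVED 3999 10.0.0.2:6381\r\n"

def parse_moved(resp_error):
    """Extract (slot, host, port) from a MOVED error, or None otherwise."""
    parts = resp_error.lstrip("-").strip().split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, port = parts[2].rsplit(":", 1)
    return int(parts[1]), host, int(port)

# A cluster-aware client or proxy would reconnect to this node and retry;
# HAProxy or twemproxy would instead pass the raw error through.
print(parse_moved(reply))   # (3999, '10.0.0.2', 6381)
```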

What you're looking for is a simple Redis master with a couple of slaves, with sentinel handling failover and updating HAProxy on the rare occasion it happens. Don't look for Redis Cluster in this setup.

Using my planned architecture seems a lot more complicated than I thought, so I will follow your recommendations and keep things simple.

Thanks a lot for the advice,

You're welcome. Anytime I can help someone avoid a deep hole of needless pain with Redis, it's worth the effort. Do keep the Google Group in mind for questions along the way.

Cheers, Bill