RedisLabs / redis-cluster-proxy

A proxy for Redis clusters.
GNU Affero General Public License v3.0

Ability to specify several entrypoints to the cluster #8

Open JanBerktold opened 4 years ago

JanBerktold commented 4 years ago

Hi there, @artix75!

First of all, this project looks like it could eliminate many of the pain points of using a Redis cluster for us, which is super exciting! Thanks a lot for working on this.

I've taken it for a test drive with our Redis cluster deployment and noticed a few things that would make it easier to use. For context, we run our Redis clusters in a container scheduler (think Kubernetes, HashiCorp Nomad, etc.), which implies that individual nodes will move around a lot.

  1. Being able to specify more than a single entrypoint to the cluster would allow a two-tier deployment with individually scalable Redis & proxy groups. Additionally, the proxy would not fail to start if the one node it was assigned to had just moved or crashed. As part of this, it would be nice to be able to read the servers from the config file as opposed to a mandatory command-line argument, as that's a little easier to set up in our context (and, I imagine, for everybody else running in Docker). See the sketch after this list.
  2. Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP and rereading the config file?). IMO, the ideal behaviour would be:
    • If any nodes were added but I still have healthy instances to talk to -> no action required
    • If any nodes were added and all of the nodes I know about are dead/gone -> connect to these new nodes. That'd allow us to not restart the proxy whenever one of the allocations moves.
  3. The last one is a potential bug for which I'm working on a solid repro case: as part of cluster bootstrapping, we do a CLUSTER RESET HARD. If the proxy connected to the node prior to the reset, it will not pick up the nodes we start learning about afterwards: the cluster can be healthy, but requests sent to the proxy will fail. Take this with a grain of salt; there are a few assumptions here and I'm still investigating.
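
To make points 1 and 2 concrete, here's a rough Python sketch of the behaviour I'm describing. It's purely illustrative (the config path and helper names are made up), not a proposal for the actual C implementation:

```python
import signal
import socket

CONF_PATH = "/etc/redis-cluster-proxy/proxy.conf"  # hypothetical path

entrypoints = []  # list of (host, port) tuples

def load_entrypoints(path=CONF_PATH):
    """Re-read 'cluster host:port' lines from the config file (point 1)."""
    global entrypoints
    parsed = []
    with open(path) as conf:
        for line in conf:
            if line.startswith("cluster "):
                host, _, port = line.split()[1].rpartition(":")
                parsed.append((host, int(port)))
    entrypoints = parsed

def connect_to_any(timeout=2.0):
    """Return a socket to the first reachable entrypoint, or raise."""
    last_err = None
    for host, port in entrypoints:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as err:
            last_err = err  # that node moved/crashed -> try the next one
    raise ConnectionError(f"no entrypoint reachable: {last_err}")

# At startup: read the initial entrypoints, then reload on SIGHUP so new
# nodes are picked up without restarting the proxy (point 2).
load_entrypoints()
signal.signal(signal.SIGHUP, lambda signum, frame: load_entrypoints())
```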

I'd be happy to send pull requests for any of the points above if they align with your vision for this project. Additionally, do you have a published roadmap/next planned work items so we could start contributing a bit? :)

toredash commented 4 years ago

Wanted to chime in on this, as we are looking into using redis-cluster-proxy in the future ourselves.

I've modified a Lua client to support a dynamic list of Redis servers. When the client starts, it needs to reach at least one healthy Redis node. Once a connection is established, the client builds an internal list of the servers returned by CLUSTER SLOTS. If the client for any reason is unable to communicate with the cluster, it will try to contact every server in the internal list to get a working connection.

If nodes are added or removed, the internal list is updated, since ASK/MOVED replies force the client to refresh its slot information.

The modifications I made to the Lua client are here: https://github.com/steve0511/resty-redis-cluster/compare/master...toredash:fix/dynamic-serv-list
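
For readers who don't want to dig through the Lua diff, the same strategy can be sketched with redis-py's built-in cluster client (redis-py >= 4.1), which likewise builds its node list from the cluster's slot map and retries other known nodes on failure. The addresses below are made up:

```python
from redis.cluster import RedisCluster, ClusterNode

# Any reachable seed is enough; the client then discovers the rest of
# the nodes from the cluster's slot map.
startup_nodes = [
    ClusterNode("10.0.0.1", 7000),
    ClusterNode("10.0.0.2", 7000),
    ClusterNode("10.0.0.3", 7000),
]

rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)
rc.set("greeting", "hello")  # routed to whichever node owns the slot
print(rc.get("greeting"))    # MOVED/ASK replies refresh the node list
```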

> I've taken it for a test drive with our Redis cluster deployment and noticed a few things that would make it easier to use. For context, we run our Redis clusters in a container scheduler (think Kubernetes, HashiCorp Nomad, etc.), which implies that individual nodes will move around a lot.
>
> 1. Being able to specify more than a single entrypoint to the cluster would allow a two-tier deployment with individually scalable Redis & proxy groups. Additionally, the proxy would not fail to start if the one node it was assigned to had just moved or crashed. As part of this, it would be nice to be able to read the servers from the config file as opposed to a mandatory command-line argument, as that's a little easier to set up in our context (and, I imagine, for everybody else running in Docker).

Could this be solved by using a Service in e.g. Kubernetes? That way you would always hit a node that is reporting healthy. This seems like something that could be solved with DNS.
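
For illustration, a single lookup against a headless Service name returns the IPs of every ready pod, so one hostname yields a full list of entrypoints. A small Python sketch (the service name is hypothetical):

```python
import socket

def resolve_entrypoints(name, port=7000):
    """Return every (ip, port) pair the name currently resolves to."""
    infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    return sorted({(info[4][0], port) for info in infos})

# e.g. resolve_entrypoints("redis-cluster.default.svc.cluster.local")
```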

> 2. Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP and rereading the config file?). IMO, the ideal behaviour would be:
> • If any nodes were added but I still have healthy instances to talk to -> no action required
> • If any nodes were added and all of the nodes I know about are dead/gone -> connect to these new nodes. That'd allow us to not restart the proxy whenever one of the allocations moves.

See my notes above about one approach in Lua.

JanBerktold commented 4 years ago

Using DNS is an interesting idea, and it would for sure solve the "please give me any node which is alive right now" problem; as such, it would be a step up and would allow us to run a two-tiered deployment.

However, it lacks the flexibility that explicitly listing all (or a subset) of the IPs would provide, e.g.

Thank you for posting your solution for the lua client, that behaviour looks great and would solve all of my concerns. :)

artix75 commented 4 years ago

@JanBerktold Hi, thank you for your reports and suggestions.

As for the multiple entrypoints, it's a cool feature that's worth implementing in the near future, IMHO. I'll try to answer the other suggestions:

> 1. ...it would be nice to be able to read the servers from the config file as opposed to a mandatory command-line argument, as that's a little easier to set up in our context (and, I imagine, for everybody else running in Docker)

You can already specify the entry point in the config file. Just launch the proxy with the config file:

./redis-cluster-proxy -c /path/to/proxy.conf

Inside the config file, you can set the entry point like this:

cluster 127.0.0.1:7000

In the latest commit (unstable branch), you can also find an example config file (proxy.conf).
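
For example, a minimal config might look something like this (the `cluster` directive is the relevant one here; double-check the other option names against the shipped proxy.conf):

```
# Minimal proxy.conf sketch
cluster 127.0.0.1:7000
port 7777
daemonize no
```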

> 2. Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP and rereading the config file?). IMO, the ideal behaviour would be:
> • If any nodes were added but I still have healthy instances to talk to -> no action required
> • If any nodes were added and all of the nodes I know about are dead/gone -> connect to these new nodes. That'd allow us to not restart the proxy whenever one of the allocations moves.
> 3. The last one is a potential bug for which I'm working on a solid repro case: as part of cluster bootstrapping, we do a CLUSTER RESET HARD. If the proxy connected to the node prior to the reset, it will not pick up the nodes we start learning about afterwards: the cluster can be healthy, but requests sent to the proxy will fail.

Currently, the proxy automatically reconfigures its internal cluster representation after ASK or MOVED replies (in the next few days I'll implement handlers for other cluster-related errors); however, it always uses the entry point specified via the command line or the config file. Implementing multiple endpoints and treating every master node as a potential entry point (also after a cluster reconfiguration, i.e. with added nodes) could perhaps address the issue.
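
As a tiny illustration (not the proxy's actual C code) of why redirections are enough to discover nodes: a MOVED or ASK error itself names the node that now owns the slot, so the receiver can add that node to its cluster view:

```python
def parse_redirect(error_message):
    """Parse 'MOVED <slot> <host>:<port>' (or ASK) into its parts."""
    kind, slot, addr = error_message.split()
    assert kind in ("MOVED", "ASK")
    host, _, port = addr.rpartition(":")
    return kind, int(slot), host, int(port)

# parse_redirect("MOVED 3999 127.0.0.1:7002")
# -> ('MOVED', 3999, '127.0.0.1', 7002)
```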

> I'd be happy to send pull requests for any of the points above if they align with your vision for this project. Additionally, do you have a published roadmap/next planned work items so we could start contributing a bit? :)

You're free to send PRs if you want; a roadmap will be available after RC1 (which is planned by the end of this month). Thank you again :)