sysown / proxysql

High-performance MySQL proxy with a GPL license.
http://www.proxysql.com
GNU General Public License v3.0
5.95k stars 971 forks source link

Make it possible to send all writes to one node #587

Closed mnarusze closed 8 years ago

mnarusze commented 8 years ago

I am trying to setup a Galera cluster constisting of 3 servers where all writes are sent to one node and reads are distributed evenly across the nodes. I found a discussion in https://groups.google.com/forum/#!topic/proxysql/1jG7EJWkbrU where you suggested using weights so that very few of the writes go to remaining nodes.

So far this solution sounds fine but I can easily imagine getting into troubles with it - going for more than 3 servers, for instance. Since this is a workaround rather than good practice, I would suggest making it possible to send all writes only to specified nodes and, in the event of a failure, switch to another ones.

One way of accomplishing that would be to add a parameter to hostgroups which would limit the amount of nodes used. For example, I would like to specify 3 writes-only nodes where only 1 node is used until it is marked as shunned. If it is shunned, then another node from the group takes its place until the new one is shunned and so on...

What do you think?

renecannao commented 8 years ago

Hi. The suggested approach seems reasonable good when there is only 1 ProxySQL instance. If there are many ProxySQL instances accessing the same Galera Cluster, they need to synchronize and agree which one of the Galera nodes should be the writer. Synchronization can be achieved using a service discovery agent like etcd or consul . Right now there is documentation on how to do it, but I plan to write something soon.

Thoughts?

mnarusze commented 8 years ago

Hi Rene,

Thanks for the answer.

I agree with etcd/consul being the best solution for such scenarios in general but, in my opinion, it would take a lot of time to introduce it to a production environment where introducing proxy is already tough enough :) I was hoping for an easier solution like the one I described above but I understand that it's difficult becuase of the > 1 ProxySQL per host scenario. Is this scenario that popular? As far as I understand it would make sense if I had lots of query-matching, but how many would it take to hit performance severely?

FYI while I was running sysbench against 1.2.0h I started unattaching Galera nodes and I also came across this bug. Let me know if you need any logs.

renecannao commented 8 years ago

Hi. Well, the issue is not with more than 1 ProxySQL per host , but when there is 1 ProxySQL per application host or db server. It depends from where you deploy it. If you deploy ProxySQL in the application server (this is a very common scenario), you will need one instance for each server : that could be dozen or even hundreds (I know of setup with 70 proxysql instances). Other setups instead deploy ProxySQL in a dedicated host (or more than one, for HA), or one ProxySQL in the master(or slave) and the proxy will perform read/write split forwarding requests from there to the slave(or master).

I know Percona is working on integrating ProxySQL with Percona PXC (Galera Cluster), so very soon there should be some easy way to send all write requests to just one node!

With regards to this bug: unfortunately, it is a regression bug introduced in recent releases! The good news is that after several attempts I managed to reproduce it easily, and a bug fix for it should be ready very soon. Thank you for your availability in providing logs, but shouldn't be needed. If you agree, I will ping you when a new release is ready. Thank you.

ktwalrus commented 8 years ago

In addition to "weight", I'd like to be able to specify a "backup_group" for each server in a hostgroup. If the backup_group=0, then that server is considered for load balanced requests to the hostgroup (according to its weight). If no servers in the backup_group=0 are up, then the servers in the lowest (numerically) backup_group with online servers would be considered for the new request (load balancing among the servers in the same backup_group).

In the case of a Galera Cluster, the hostgroup for writes could have all servers in the cluster, with the first master having a backup_group=0 and the other servers in separate backup_group according to the order the admin wants these servers to be eligible to receive writes. So for a 3 node cluster, the first server would have backup_group=0, the second server would have backup_group=1, and the third server would have backup_group=2.

In the case of multiple ProxySQL instances, the hostgroup servers should have the same backup_group settings. This wouldn't guarantee that only one server receives writes at all times as there could be a temporary networking glitch (or some other interruption) where a server would be seen as DOWN from one ProxySQL instances and not from another.

I'm not using Galera Cluster, but I do need failover within a hostgroup where the first server receives all requests and the backup server only receives requests if the first server is DOWN. A backup_group setting would help in my case too. It still wouldn't completely solve my use case as I want one way failover, but if ProxySQL runs an external script when a switch is made to another backup_group, that script could adjust the servers list or the load balancing weights to make sure the transition is a one way failover. In my case, the failover script would probably spin up a new backup slave and put it into a new backup_group for the hostgroup so there would be a server to failover if the new master goes down too.

Anyway, just my current idea. I haven't switched over to ProxySQL yet since it doesn't handle one way failover within the proxy (need to implement some sort of polling cron job script to do the failover externally to ProxySQL). I'm still using HAProxy where HAProxy allows servers in a backend to be marked as BACKUP and gives the ability to set the number of successful health checks for each server in the backend before a server is considered UP. This is enough to implement one way failover within HAProxy.

renecannao commented 8 years ago

@mnarusz: I wrote a very simple script that can solve the single writer problem.

The script is a possible implementation on how to solve the problem. It works on the following principle:

The script doesn't take in consideration the fact that there could be a local datacenter and a remote datacenter, but the script is a proof of concept to show that the logic can be implemented externally, and have any level of complexity.

The scheduler and the fact that ProxySQL is re-configurable at runtime allow to implement whatever algorithm required, either balancing writes using weights, or using one writer only.

Further information about the Scheduler can be found here.

renecannao commented 8 years ago

@ktwalrus: The external script capability is now here! You can implement any complex logic/algorithm, however you would like to define it. One way failover is now an option.

Thank you!

mnarusze commented 8 years ago

@renecannao - I just noticed that you replied with a solution - thanks a lot for providing it and being extremely active here. I haven't tested it yet but I will try it as soon as I have a bit of time for improving our DB cluster.

It's me who should thank you, and I do so :)