tarantool / tarantool-java

A Java client for Tarantool
http://tarantool.org
BSD 3-Clause "New" or "Revised" License

Support load balancing between several tarantool instances #246

Open Totktonada opened 4 years ago

Totktonada commented 4 years ago

Now we have the cluster client that supports failover, but it does not allow balancing load between several tarantool instances. Users often ask for this, however.

However, we provide some guarantees: for example, that sync requests will be executed in order, or that JDBC batch updates (as well as async requests) will be executed in order under certain circumstances: mainly when the memtx engine is used and the batch contains no DDL requests.

We should not break those guarantees for the existing single-instance and cluster clients: at the very least we should hide this behavior behind an option and clearly state which assumptions will fail when requests are balanced over instances.

Totktonada commented 4 years ago

A raw idea, based on what @nicktorwald shared about PostgreSQL:

For example, an application can create two connection pools: one data source for writes, another for reads. The write pool limits connections to the master node only:

jdbc:postgresql://node1,node2,node3/accounting?targetServerType=master.

And the read pool balances connections between slave nodes, but also allows connections to the master if no slaves are available:

jdbc:postgresql://node1,node2,node3/accounting?targetServerType=preferSlave&loadBalanceHosts=true

If a slave fails, all slaves in the list will be tried first. If no slaves are available, the master will be tried. If all of the servers are marked as "can't connect" in the cache, then an attempt will be made to connect to all hosts in the URL in order.
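The host-selection behavior described above (`preferSlave` plus `loadBalanceHosts=true`) can be sketched in plain Java. This is a hypothetical illustration, not the client's actual API: the `Host` class, its `readOnly` flag, and the `PreferSlaveChooser` name are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical host descriptor: a name plus whether the node is a
// read-only replica ("slave") or a writable master.
class Host {
    final String name;
    final boolean readOnly;
    Host(String name, boolean readOnly) { this.name = name; this.readOnly = readOnly; }
}

// Mimics preferSlave + loadBalanceHosts=true: slaves are tried first,
// rotated round-robin between calls; masters are appended as a fallback.
class PreferSlaveChooser {
    private final AtomicInteger next = new AtomicInteger();

    List<Host> order(List<Host> hosts) {
        List<Host> slaves = new ArrayList<>();
        List<Host> masters = new ArrayList<>();
        for (Host h : hosts) {
            (h.readOnly ? slaves : masters).add(h);
        }
        List<Host> result = new ArrayList<>();
        if (!slaves.isEmpty()) {
            // Rotate the starting slave so consecutive calls spread load.
            int start = Math.floorMod(next.getAndIncrement(), slaves.size());
            for (int i = 0; i < slaves.size(); i++) {
                result.add(slaves.get((start + i) % slaves.size()));
            }
        }
        result.addAll(masters); // master is tried only after all slaves
        return result;
    }
}
```

A caller would attempt connections in the returned order and stop at the first host that answers, which reproduces the "all slaves first, then the master" fallback behavior quoted above.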

The cluster discovery contract can be extended to provide information about which instances are ready to accept write requests and which are not (using the second return value). An option (in the jdbc: URI or in the configuration) can control whether a particular connection instance supports write requests. If we get ER_READONLY, we should trigger cluster discovery and retry the request if it turns out that the current instance has become read-only while another one has become writable. Maybe after read-only session support appears there will be another way to detect this case.
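The retry-on-ER_READONLY idea above can be sketched as follows. Everything here is an illustrative assumption, not the client's real interfaces: `Instance`, `ReadOnlyException`, and the discovery `Supplier` stand in for the cluster client's connection and discovery machinery.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical instance handle: discovery is assumed to report whether
// it accepts writes, and executing a write on a read-only instance
// raises ReadOnlyException (standing in for ER_READONLY).
interface Instance {
    boolean isWritable();
    String execute(String request) throws ReadOnlyException;
}

class ReadOnlyException extends Exception {}

// Routes write requests to a writable instance; on ER_READONLY it
// re-runs cluster discovery and retries once on a now-writable node.
class WriteRouter {
    private final Supplier<List<Instance>> discovery;
    private Instance current;

    WriteRouter(Supplier<List<Instance>> discovery) {
        this.discovery = discovery;
        this.current = pickWritable();
    }

    private Instance pickWritable() {
        return discovery.get().stream()
                .filter(Instance::isWritable)
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("no writable instance"));
    }

    String executeWrite(String request) throws ReadOnlyException {
        try {
            return current.execute(request);
        } catch (ReadOnlyException e) {
            // The instance became read-only: trigger discovery again and
            // retry on whichever instance is writable now.
            current = pickWritable();
            return current.execute(request);
        }
    }
}
```

A single retry keeps the sketch simple; a real implementation would also need to bound retries and decide what to do when discovery still reports no writable instance.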

Open questions:

These are just random thoughts, written down here so they are not forgotten.