Open etschannen opened 5 years ago
I take it that clients will need to opt-in to reading potentially stale data, and that the current scalability limit is very large?
You are correct, clients would need to pass a transaction option to read stale data. Currently, the cluster controller can handle 1000+ processes without any issues. This limit would be even higher however there is another issue which is that every client keeps a connection open to the cluster controller so that they can get updates when the master proxies change.
The cluster controller is a singleton role which can be overwhelmed if too many processes are connected to the database. This limits the total number of processes that can join a database, and therefore limits how large a database can scale.
In addition to scaling concerns, supporting multiple cluster controllers will enable a datacenter that is partitioned from the rest of the system to still provide stale reads from that location.
Finally, having multiple cluster controllers will reduce or eliminate the cost of starting up a new cluster controller when either a cluster controller has died, or we are switching the primary datacenter of a cluster. This should reduce the master recovery times in these scenarios.
Proposed design: