bloomberg / comdb2

Bloomberg's distributed RDBMS

How does comdb2 deal with partitions? #647

Open pdeva opened 6 years ago

pdeva commented 6 years ago

The paper doesn't say much, only:

A node disconnected from a master
for a prolonged time (ie longer than lease) is unable to serve
any type of request. Further, a network partition leaving no
group as a majority results in a cluster with no master and
all nodes unable to serve any requests.

Let's say you have a cluster with 9 nodes.

A partition happens that separates the cluster into groups of:

  1. 5 nodes
  2. 3 nodes
  3. 1 node

After the partition has lasted a while, each of these 3 groups will consider itself the largest and accept transactions.

Once the partition goes away and all 9 nodes can see each other again, what happens to the transactions that were already committed in each of the 3 separate groups?

akshatsikarwar commented 6 years ago

The partition with 5 machines will keep serving requests because it contains a majority of the nodes. The other two partitions will not service any requests.
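A minimal sketch of that majority rule for the 9-node example, assuming a strict-majority test (more than half of the configured cluster size). The function name `has_majority` is hypothetical and is not taken from the comdb2 source.

```python
def has_majority(group_size: int, cluster_size: int) -> bool:
    """Return True if the group holds a strict majority of the cluster."""
    return group_size > cluster_size // 2


cluster_size = 9
groups = [5, 3, 1]  # the three groups from the question

# Only groups that hold a strict majority may keep serving requests.
serving = [g for g in groups if has_majority(g, cluster_size)]
print(serving)
```

With 9 nodes the threshold is 5, so only the 5-node group qualifies; the 3-node and 1-node groups fall below it.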

mhannum72 commented 6 years ago

We've done a lot of work in the past year making sure that we behave correctly for these cases. The short explanation is that if a commit isn't replicated to at least a majority of the cluster nodes, the master will return an rcode that asks the API to retry the request against another node. There's a small window where a master on the non-majority side of a partition might commit locally (and add to the log stream). During that window the master will always send a "NOT_DURABLE" rcode to the client API. After the partition heals, the former master will unwind any log records that are no longer part of the log stream.

We haven't made this the default behavior yet (but we are considering it). See tests/jepsen_register.test/lrl.options for the lrl options that we used when testing.
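The behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not comdb2's implementation: the `Node` class, `write_with_retry`, and `unwind` are hypothetical names, every reachable peer is assumed to acknowledge the write, and the only identifier taken from the comment is the "NOT_DURABLE" rcode name.

```python
from dataclasses import dataclass, field

OK = "OK"
NOT_DURABLE = "NOT_DURABLE"  # rcode name from the comment; the value is illustrative


@dataclass
class Node:
    name: str
    reachable_peers: int  # peers this node can still replicate to (assumed to all ack)
    cluster_size: int
    log: list = field(default_factory=list)

    def commit(self, record):
        """Commit locally, then report whether a majority holds the record."""
        self.log.append(record)
        acks = 1 + self.reachable_peers  # this node plus acknowledging peers
        return OK if acks > self.cluster_size // 2 else NOT_DURABLE


def write_with_retry(nodes, record):
    """Try each node in turn until one reports a durable commit."""
    for node in nodes:
        if node.commit(record) == OK:
            return node.name
    return None


def unwind(local_log, cluster_log):
    """Keep only the prefix of local_log that matches the cluster's log stream."""
    n = 0
    for mine, theirs in zip(local_log, cluster_log):
        if mine != theirs:
            break
        n += 1
    return local_log[:n]


stale_master = Node("a", reachable_peers=2, cluster_size=9)
new_master = Node("b", reachable_peers=4, cluster_size=9)
winner = write_with_retry([stale_master, new_master], "x")
```

Here the first node commits "x" locally but answers NOT_DURABLE, so the client retries and succeeds against the second node. After the partition heals, `unwind` models the former master discarding whatever local records diverge from the surviving log stream.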

pdeva commented 6 years ago

What happens when the number of nodes is even, e.g. there are 10 nodes? How does comdb2 decide the majority?

I know one should have an odd number of nodes, but it is possible that the 11th node never came online and the cluster consists of 10 nodes because of that.

akshatsikarwar commented 6 years ago

With an even number of nodes, the partition that contains the existing master is allowed to service requests.
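One reading of this rule is that the existing master acts as a tie-breaker when the cluster splits exactly in half. The sketch below assumes that interpretation; `may_serve` is a hypothetical helper, not a comdb2 function.

```python
def may_serve(group_size: int, cluster_size: int, has_master: bool) -> bool:
    """Decide whether a partition may serve requests.

    Assumption: a strict majority always serves, and an exact half
    serves only if it contains the existing master.
    """
    if 2 * group_size > cluster_size:
        return True
    if 2 * group_size == cluster_size:
        return has_master
    return False


# A 10-node cluster split 5/5: only the half holding the master serves.
print(may_serve(5, 10, has_master=True), may_serve(5, 10, has_master=False))
```

Under this reading at most one side of an even split can serve, since only one side can contain the existing master.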