Closed robert-s-lee closed 3 years ago
Ranges with their leaseholder at B are fine. Ranges with their leaseholder at A or C will remain readable (for clients who can reach them). They will remain writeable for a time, but may become read-only to avoid allowing the disconnected replica to fall too far behind.
If the liveness range has its leaseholder in A, nodes in C will be unable to update their heartbeats, so they will appear to be down and everything will move towards A and B (or vice versa if the liveness range is in C). If the liveness range has its leaseholder in B, the broken state could persist for a long time. Failures of the liveness range (and to a lesser extent the meta ranges) can cascade into larger-scale cluster problems.
When connectivity is restored, everything should come back up. We've done some testing of this (jepsen and otherwise); we've had bugs in the past where we would not come back from this kind of network failure without manual intervention.
I don't think the 4-locality version really changes anything currently - nodes with their leader in A or C will still have problems, and there is nothing in particular that will pull replicas out of those regions into a connected subset of nodes.
Currently, this is best addressed as a network routing problem. When the A-C break is detected, packets addressed from A to C should be rerouted on the A-B-C path.
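To make the reasoning above concrete, here is a minimal Python sketch of the three-locality case. It is a toy model, not CockroachDB code: it assumes a node appears live only if it can reach the liveness range's leaseholder directly, and it treats rerouting as a plain shortest-path search over the surviving links. The names `appears_live` and `relay_path` are made up for illustration.

```python
from collections import deque

# Hypothetical three-locality topology from this thread; links are undirected.
NODES = ["A", "B", "C"]
ALL_LINKS = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("A", "C")]}


def neighbors(node, broken):
    """Nodes that `node` can still reach over a direct, unbroken link."""
    up = ALL_LINKS - {frozenset(b) for b in broken}
    return {n for link in up if node in link for n in link if n != node}


def appears_live(liveness_leaseholder, broken):
    """Simplified assumption: a node looks live only if it can reach the
    liveness leaseholder directly to update its heartbeat."""
    reachable = neighbors(liveness_leaseholder, broken) | {liveness_leaseholder}
    return {n: n in reachable for n in NODES}


def relay_path(src, dst, broken):
    """Breadth-first search for the shortest surviving path, e.g. A-B-C."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(neighbors(path[-1], broken) - seen):
            seen.add(nxt)
            queue.append(path + [nxt])
    return None  # no surviving path


if __name__ == "__main__":
    broken = [("A", "C")]
    for holder in NODES:
        print("liveness leaseholder in", holder, "->", appears_live(holder, broken))
    print("reroute A to C via:", relay_path("A", "C", broken))
```

With the A-C link broken, the model marks C as down when the liveness leaseholder is in A, marks A as down when it is in C, and marks nobody as down when it is in B, which matches the "could persist for a long time" case.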
@m-schneider This issue was spawned from a private issue. Let's add this scenario to our geo-distributed testing and discover what the cluster actually does.
Will do!
@m-schneider Below is a draft test plan. The attached scripts are Docker demo scripts to perform these tests on a laptop.
W=West C=Central E=East
Symbol | Description |
---|---|
`*E*` | either the node or the database is down |
`--x--` | bi-directional network block |
`-10ms-` | 10ms bi-directional latency between the nodes |
`-10ms>-` | left side can initiate with delay and the other side can respond, but the other side cannot initiate |
`-<10ms-` | right side can initiate with delay and the other side can respond, but the other side cannot initiate |
Additional esoteric edge condition tests are possible as supported by Linux tc
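As a rough illustration of how the symbols could map onto Linux tooling, the Python sketch below only builds command strings; it does not run them. It is not taken from the attached scripts. The function name `rules_for`, the interface `eth0`, and the peer address are assumptions. The commands use standard `tc netem` and `iptables` syntax. Note that a root `netem` qdisc delays all egress traffic on the interface, not just traffic to one peer.

```python
def rules_for(symbol, peer_ip, dev="eth0"):
    """Return shell commands to run on ONE node to emulate `symbol`
    towards `peer_ip`. Hypothetical helper; dev and peer_ip are placeholders."""
    delay = f"tc qdisc add dev {dev} root netem delay 10ms"
    if symbol == "--x--":
        # Drop traffic to and from the peer: bi-directional block.
        return [
            f"iptables -A INPUT -s {peer_ip} -j DROP",
            f"iptables -A OUTPUT -d {peer_ip} -j DROP",
        ]
    if symbol == "-10ms-":
        # Run on both nodes so that each direction is delayed.
        return [delay]
    if symbol == "-10ms>-":
        # Run on the left node: it may initiate, but new TCP connections
        # (SYN packets) coming from the peer are dropped.
        return [delay, f"iptables -A INPUT -s {peer_ip} -p tcp --syn -j DROP"]
    raise ValueError(f"unsupported symbol: {symbol}")


if __name__ == "__main__":
    for sym in ("--x--", "-10ms-", "-10ms>-"):
        print(sym)
        for cmd in rules_for(sym, "10.0.0.3"):
            print("   ", cmd)
```

The mirrored `-<10ms-` case would be the `-10ms>-` rules applied on the right-hand node instead.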
W-------E
 \     /
  \   /
   \ /
    C

W------*E*
 \     /
  \   /
   \ /
    C
- Any one node can be down, or one node can be network-isolated while the node and its process are still running.
An example with the network to node E down is shown below.
W---x---E
 \     /
  \   x
   \ /
    C
## unusual failure scenarios
- Network link is down between two nodes.
Example of nodes W and E link down is shown below.
W---x---E
 \     /
  \   /
   \ /
    C
- Network link only works in one direction between two nodes.
Example of node W being able to communicate with E, but E cannot initiate communication with W
W--->---E
 \     /
  \   /
   \ /
    C
Example of node E being able to communicate with W, but W cannot initiate communication with E
W---<---E
 \     /
  \   /
   \ /
    C
## failure scenarios where system will not be available
- Any two nodes are down
Example of nodes A and C down is shown below.
W-------E
 \     /
  \   /
   \ /
    C
- network partition
W---x---E
 \     /
  x   x
   \ /
    C
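The split between survivable and unavailable scenarios in this plan can be checked with a small majority-quorum model. The Python sketch below assumes one replica per node and says the system is available if some live node can directly reach a majority of the three replicas, itself included. It ignores leases, liveness heartbeats, and one-directional links, so it states the expectation to test against rather than what the cluster actually does. The function names are illustrative.

```python
NODES = ("W", "C", "E")


def can_reach_quorum(node, down, blocked):
    """True if `node` is up and directly reaches a majority of replicas."""
    if node in down:
        return False
    reachable = {node} | {
        n for n in NODES
        if n != node and n not in down and frozenset((node, n)) not in blocked
    }
    return len(reachable) > len(NODES) // 2


def available(down=(), blocked=()):
    """True if at least one node can still assemble a majority."""
    blocked = {frozenset(b) for b in blocked}
    return any(can_reach_quorum(n, set(down), blocked) for n in NODES)


SCENARIOS = {
    "healthy": {},
    "node E down": {"down": ["E"]},
    "E network-isolated": {"blocked": [("W", "E"), ("C", "E")]},
    "W-E link down": {"blocked": [("W", "E")]},
    "two nodes down": {"down": ["W", "E"]},
    "full partition": {"blocked": [("W", "E"), ("W", "C"), ("C", "E")]},
}

if __name__ == "__main__":
    for name, kwargs in SCENARIOS.items():
        print(f"{name:20s} available={available(**kwargs)}")
```

Under these assumptions only the last two scenarios come out unavailable.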
[Archive.zip](https://github.com/cockroachdb/cockroach/files/1612877/Archive.zip)
Let's model these in the context of the network partitioning roachtests: #23141
cc @tbg for re-triage - is there anything actionable here?
Yes, test these in CI and fix any unexpected gotchas.
QUESTION
Consider the following 3 localities with the default 3-way replication.
A and C suffer a network outage between them, but B is still connected to both A and C.
[ ] What happens to the leaseholders at A, B, and C?
[ ] When the A---C connectivity is restored, what happens?
[ ] What about the 4 localities as below, where only A---C connectivity is lost?