Test network partitioning

edsko commented 5 years ago

We should test what happens when the network gets temporarily partitioned. This is a particularly important test because it is the one that will cause rollbacks.

edsko commented 5 years ago

In prop_simple_bft_convergence we have

takeChainPrefix :: Chain (Block DemoBFT) -> Chain (Block DemoBFT)
takeChainPrefix = id -- in BFT, chains should indeed all be equal.

This however I think is not true once we start having network failures (@brunjlar , do you agree?)

nfrisby commented 5 years ago

@mrBliss Here's my initial approach. Seem reasonable? Might need more complexity when eventually combined with input-output-hk/ouroboros-network#231.

broadcastNetwork defines a mesh network for each mini protocol: every node is connected to every other node via two unidirectional Channels per mini protocol. We have data Channel m a = Channel {send :: a -> m (),recv :: m (Maybe a)}. So we could simulate a temporary partitioning into two node sets A and B by having the send function of every Channel connecting A and B simply discard its input when btime is inside some interval(s). A new QuickCheck generator would randomize the node sets (without input-output-hk/ouroboros-network#231, a simple threshold on node ID should be general enough) and the disconnection interval(s).

Edit: Seems like maybe the node needs to be notified (e.g. by exception) that its send failed. With the approach describe above, it looks nodes are just locked up; I'm guessing blocked on the response. Learning about Channel and its use in the nodes/protocols now.

Edit 2: Actually, it seems much more reasonable to block the send until the end of the network interruption instead of silently discard the send.

nfrisby commented 5 years ago

I have a local patch that splits the network into two non-empty partitions during one interval of the run. Once the interruption ceases, the two partitions exchange rollbacks (in the common case where they both extend their chains incompatibly during the interruption).

For BFT and PBFT, if I include logic in the interruption plan generator to bound the duration of the network interruption, then the subsequent rollbacks respect k and the tests pass. If the interruption duration is not bounded in that way, then the resulting rollbacks may exceed k, causing the tests to fail. So it's a well-behaved and useful property test.

For the Praos test, however, the logic I used for BFT and PBFT is insufficient. For example, it schedules an interruption that BFT and PBFT would handle without failure, but that same interruption schedule happens to bridge the gap between two "crowded runs" in the Praos test. The crowded+partitioned+crowded run's duration is long enough too create unacceptably large rollbacks once it ends.

In other words, the network partitioning introduces an "interrupt duration too long" concept similar to the "LeaderSchedule too crowded`" concept. Moreover, the two concepts can interact to create failures when neither would individually cause a failure.

I'm unsure how to proceed and would appreciate advice. Here are some related observations.

The LeaderSchedule test uses suchThat to avoid generating "too crowded" leader schedules.
The Praos test uses the nodes' uninterpretable PRNG seed instead of a LeaderSchedule generator, so it can't avoid "too crowded" LeaderSchedules a priori. It instead dismisses them after the fact, in a guard in the Property.
My generator for BFT and PBFT is comparable to the LeaderSchedule test's approach, but instead uses carefully chosen choose boundaries instead of suchThat.

And here are some options.

Extend the Praos property to be aware of the interruption and dismiss those cases too.
Extend the interruption plan generator to take a LeaderSchedule argument and avoid interruptions that problematically interact with crowded runs.
Don't write tests that mix the two concepts, at at first.

Edit: remove the sentence about LeaderSchedule test not failing: it is failing as expected.

mrBliss commented 5 years ago

What about the following approach:

We stop carefully tuning our generators to avoid rollbacks of more than k blocks and to avoid hard to predict conflicts between the network interruption and "crowded runs", and accept the fact that in some test cases we won't be able to reach consensus. We can still write our generators such that in most cases we are able to reach consensus (verify this with labelling!), otherwise we're not testing the interesting scenarios.
Define a function that for a whole test setup (including the LeaderSchedule and any information that may only be available after running the test, etc.) determines whether we should have been able to reach consensus or not.
After running each test, we check whether we have reached consensus, if not, then our function must tell us that it was indeed acceptable to not reach consensus for the given test setup.

I have not tried it out, but I hope this makes some things easier. What do you think?

nfrisby commented 5 years ago

Yes, I've reached the same conclusion: the next thing to try is writing a dismissable predicate that determines (conservatively) when a combination of the LeaderSchedule, InterruptionPlan, etc (eventually including input-output-hk/ouroboros-consensus#802, input-output-hk/ouroboros-network#231, input-output-hk/ouroboros-consensus#800 ...) test configurations risk rollbacks > k. We can use it in generators with suchThat (like LeaderSchedule test) or in properties when the a priori is not an option (like Praos test).

For that reason, I'm now working to understand the details of the LeaderSchedule (and by proxy the Praos) test property.

nfrisby commented 4 years ago

While recently re-reading the Ouroboros papers (see eg Issue input-output-hk/ouroboros-consensus#695), I noticed that the BFT paper does not limit how many blocks a node can rollback. Our BFT implementation, though, does limit that to the security parameter k.

This observation relates to this Issue because it invalidates the paper's claim that BFT is resilient to network partitions.

Network splits. In the case of a network split, the network is temporarily partitioned into s connected components for some s≥2, each one containing n1,...,ns servers for a sequence of slots D. Assuming no other failures, it is easy to see that transaction processing will continue normally in each connected component [edit: the paper does not mention that chain density will decrease). Furthermore, by slot max D+n, all servers will be activated and the system will converge to a unique blockchain. Indeed, let i be the maximal connected component that includes the server that controls the earliest time slot in the n slots that follow D, say slj. It easy to see that after slj all servers will converge to the blockchain emitted by this server. It follows that Ouroboros-BFT is resilient to network splits. Note that transactions processed within any other connected component other than the maximal component with the winning server maybe lost and hence they have to be resubmitted.

For D "large enough" (possibly also including the effects of other confounding factors like network latency etc), the connected components' chains will have unique suffixes that include more than k blocks, and so nodes running our implementation of BFT will not be able to all switch to the same chain after the network split ends.

dcoutts commented 4 years ago

Yes, we inherit this aspect of Ouroboros classic, and when we're in Praos then we'll also have a k rollback limit. So this aspect of BFT is something we do take advantage of, and of course it would only be temporary anyway. So yes, we can still only tolerate partitions up to k blocks long.

IntersectMBO / ouroboros-consensus

Test network partitioning #801