Systematically permutate message ordering in simulations

masih commented 5 months ago

Copied from slack originally written by @anorth:

The key search space that we want to cover [in tests] is the ordering/interleaving of messages as observed by individual nodes. The timestamps at which they arrive make not much difference, particularly if you consider alarms as just another kind of message to be reordered. The order between different nodes makes not much direct difference (except those nodes may then send new messages which can affect others). So thorough coverage is coverage over all of the possible orderings of messages arriving at each node. A thorough test framework might deterministically try every possible order at least for a small number of participants, and quickly converging protocols. Maybe every possible isn't practical, but we want something where most test runs exercise some different ordering at some nodes. My first thoughts about how to do that are to replace the message queue with something that doesn't use a latency model, but an explicit permutation generator. This isn't a very realistic simulation if we're thinking about gathering metrics, but will be far more effective at exploring the search space than a realistic simulation which would "normally" be "normal".

Implementation considerations:

Consider making message queueing in simulation pluggable, such that a user can choose what queueing to use for a simulation, i.e. permutating queue or the current latency based one.
message order permutation space grows explosively; we probably want to concentrate on full systematic permutation per gpbft instance than reduced permutations across multiple instances. Because, the current lanecy based message queue already does the latter.
full permutation of message per instance would then mean a simulation should be able to checkpoint its state at each instance and re-execute the instance with every permutation to assert that some given condition is met regardless.

masih commented 5 months ago

Also consider:

I think finding a metric for state space coverage is probably beyond scope here, but something to think about. Possibly about as good would be a metric coverage of the permutations of the sequence of messages observed, which we might be able to measure in the message queue later.

Borrowed from https://github.com/filecoin-project/go-f3/pull/219#discussion_r1604065802

rjan90 commented 2 months ago

@masih Is this covered by the passive testing we are doing, or do we need a simulation to test this edge case?

masih commented 2 months ago

Is this covered by the passive testing we are doing

No.

do we need a simulation to test this edge case

This is not an edge case; it's a systematic way of simulating every case. This requires a bunch of refactoring / feature implementation in F3 simulation mechanism. Relative to other work this one seems lower priority.

filecoin-project / go-f3

Systematically permutate message ordering in simulations #232