filecoin-project / go-f3

Golang implementation of Fast Finality in Filecoin (F3)
Apache License 2.0
10 stars 6 forks source link

Systematically permutate message ordering in simulations #232

Open masih opened 5 months ago

masih commented 5 months ago

Copied from slack originally written by @anorth:

The key search space that we want to cover [in tests] is the ordering/interleaving of messages as observed by individual nodes. The timestamps at which they arrive make not much difference, particularly if you consider alarms as just another kind of message to be reordered. The order between different nodes makes not much direct difference (except those nodes may then send new messages which can affect others). So thorough coverage is coverage over all of the possible orderings of messages arriving at each node. A thorough test framework might deterministically try every possible order at least for a small number of participants, and quickly converging protocols. Maybe every possible isn't practical, but we want something where most test runs exercise some different ordering at some nodes. My first thoughts about how to do that are to replace the message queue with something that doesn't use a latency model, but an explicit permutation generator. This isn't a very realistic simulation if we're thinking about gathering metrics, but will be far more effective at exploring the search space than a realistic simulation which would "normally" be "normal".

Implementation considerations:

masih commented 5 months ago

Also consider:

I think finding a metric for state space coverage is probably beyond scope here, but something to think about. Possibly about as good would be a metric coverage of the permutations of the sequence of messages observed, which we might be able to measure in the message queue later.

Borrowed from https://github.com/filecoin-project/go-f3/pull/219#discussion_r1604065802

rjan90 commented 2 months ago

@masih Is this covered by the passive testing we are doing, or do we need a simulation to test this edge case?

masih commented 2 months ago

Is this covered by the passive testing we are doing

No.

do we need a simulation to test this edge case

This is not an edge case; it's a systematic way of simulating every case. This requires a bunch of refactoring / feature implementation in F3 simulation mechanism. Relative to other work this one seems lower priority.