jepsen-io / knossos

Verifies the linearizability of experimentally accessible histories.
398 stars 31 forks source link

partial order reduction as an optimization? #4

Closed colin-scott closed 9 years ago

colin-scott commented 9 years ago

Have you considered applying partial order reduction as a technique for reducing the runtime of knossos?

Basic idea: if two messages are concurrent (formally: if one message does not happen-before the other), they must be independent, therefore you do not need to consider both interleavings of those messages.

In a distributed system, you wouldn't need to modify the systems under test to infer concurrent messages; you could infer concurrent messages by attaching vector clocks to events in the network. cf. Modist and shivector.

Sidenote: For a nice overview of state space reduction techniques, check out section 2.2 of this paper.

colin-scott commented 9 years ago

Then again, it's possible that these techniques aren't relevant for checking linearizability? Need to think about this a little more deeply

aphyr commented 9 years ago

Knossos already applies partial order reduction, and yeah, both papers were references for Knossos.

colin-scott commented 9 years ago

Just so that I can more easily find what techniques you use in the future, where are those papers referenced?

aphyr commented 9 years ago

They're referenced in the knossos blog post (http://aphyr.com/posts/314-computational-techniques-in-knossos) but they probably won't be of much help in figuring out the algorithm. A lot of the papers required information knossos isn't allowed to have, and some of the more naive approaches I built didn't deal well with the kind of histories Jepsen creates.

So: to my knowledge, nobody's tried doing it this way before, haha. SOMEONE probably has, but I don't know who! I should probably turn the blog post into an actual paper at some point, haha.

When I say it does partial order reduction, it's kind of implicit; the algorithm just doesn't visit worlds where concurrent evaluation of nonconcurrent ops happen.