hernanponcedeleon / Dat3M

A verification tool for many memory models
MIT License
77 stars 30 forks source link

Rework ExecutionModel #732

Open hernanponcedeleon opened 2 months ago

hernanponcedeleon commented 2 months ago

This is as a collection of requirements for the rework of the ExecutionModel class and its usages (specially generation of violation witnesses in *.graphviz format).

Currently, the ExecutionModel is a wrapper around an SMT model representing a (not necessarily consistent) execution with an API that allows to get information in a workable manner. We need a more general class that represents executions no matter where this execution is coming from (i.e., its "source"). Some possible sources are SMT model, svcomp violation witness, violation coming from other tools like genmc.

Internally, the class should contain at least 1) the executed events 2) the reads-from relation (rf) 3) the coherence relation (co)

It should also be able to give information about other base relations like po, ext, ctrl, ... One could explicitly store those relations (similar to rf and co) or extract them using information from the Event class, e.g. we can check if two events belong to the same thread with Event.getThread() and compare ids with Event.getGlobalId(). This gives us all info we need to decide the program order po. For dependency relations (data, addr, ctrl), it might be easier to save them explicitly (e.g., this is something the SMT solver will give us, but maybe not other sources)

It should also be able to compute derive relations. Right now, if one uses --method=eager, derived relations can also be queried to the SMT solver model (same as for base relations). However, if one uses --method=lazy, the SMT solver only knows about base relations. The WMMSolver will construct a ExecutionGraph from the ExecutionModel which populates the graph representation of derived relations.

The new ExecutionModel should definitely be able to populate itself from an SMT model, but the design should be flexible enough to eventually rewrite the code in /home/ponce/git/Dat3M/dartagnan/src/main/java/com/dat3m/dartagnan/witness/graphml to use the ExecutionModel. The default --method is eager, so the solution should definitely be able to populate the derived relations from the SMT model information of base ones.

Right now, ExecutionModel only keep track of memory events (loads and stores). The new class should also keep track of other events (at least Local since those help to debug low level data flow). Control-flow events like CondJump and Label can probably be skipped.

There are at least three usages of the ExecutionModel 1) WMMSolver: each iteration of the verification will create an execution (in most of the cases inconsistent wrt the memory model) and send this to the theory solver to check for consistency. This use case only requires capturing executed events and base relations. 2) Graphviz violation visualizer: once a consistent execution violating a property was found, we need to write this to a file for the user. The current code implementing this is in /home/ponce/git/Dat3M/dartagnan/src/main/java/com/dat3m/dartagnan/witness/grapviz. We need to allow at least two extra options: show derived relations (given by the user as an option --witness.doshow=a,b,c), and show local events (in many cases this will blow up the size of the graph so the option should be off by default) 3) SVCOMP violation witness: this use case assumes the existence of a hb relation in the cat model (file /cat/svcomp.cat satisfies this). We then create a linearization of the order and convert it to *.graphml format. The current code doing this (but not using ExecutionModel) is in /home/ponce/git/Dat3M/dartagnan/src/main/java/com/dat3m/dartagnan/witness/graphml

hernanponcedeleon commented 2 months ago

@CapZTr as we discussed,

@ThomasHaas @natgavrilenko @JonasOberhauser if you have further comments or requirements, feel free to comment

ThomasHaas commented 2 months ago

I think the following is true:

Ideally, such an ExecutionModel can be dumped in possibly different formats and maybe even reparsed. When parsed, you likely will lose information about the original event that is modelled by the EventModels. So if that is a desired feature, ExecutionModels should be able to stand alone without references to a Program or Wmm. On the other hand, for refinement-based solving, the ExecutionModel should reference the original Program/Wmm in order to generate refinement clauses. This can either be achieved by having optional "source information" attached to the model, or adding a new "SourceMapping" that provides this information. This class would then only be needed for Refinement-based solving but otherwise is orthogonal to the ExecutionModel.