# Chapar

A framework for modular verification of causal consistency for replicated key-value store implementations and their client programs in Coq. Includes proofs of the causal consistency of two key-value store implementations and a simple automatic model checker for the correctness of client programs.
The easiest way to install the latest released version of Chapar is via OPAM:

```shell
opam repo add coq-released https://coq.inria.fr/opam/released
opam install coq-chapar
```
To instead build and install manually, do:

```shell
git clone https://github.com/coq-community/chapar.git
cd chapar
make   # or make -j <number-of-cores-on-your-machine>
make install
```
Three key-value stores, verified to be causally consistent in the Coq proof assistant and extracted to executable code. See the installation instructions above for the requirements to build the stores.
The Coq definitions and proofs are located in the `theories` directory. The code locations of the definitions and lemmas presented in the paper are listed below.
- `KVStore.v`, Section `ValSec`
- `KVStore.v`, Module Type `AlgDef`
- `KVStore.v`, Module `ConcExec`
- `KVStore.v`, Module `AbsExec`
- `KVStore.v`, Module `InstConcExec`
- `KVStore.v`, Module Type `CauseObl`
- `KVStore.v`, Definition `cause_step` and Inductive `cause`
- `KVStore.v`, Module `SeqExec`
- `KVStore.v`, Theorem `CausallyConsistent`. Note that `(CauseObl: CauseObl AlgDef)` is a parameter of the module `ExecToAbstExec`.
- `KVStore.v`, Lemma `FaultFreedom`. Note that `(CauseObl: CauseObl AlgDef)` is a parameter of the module `ExecToAbstExec`.
- `KVSAlg1.v`, Module Type `KVSAlg1`
- `KVSAlg1.v`, Module `KVSAlg1CauseObl (SyntaxArg: SyntaxPar) <: CauseObl KVSAlg1 SyntaxArg`
- `KVSAlg1.v`, Lemma `CausallyConsistent`
- `KVSAlg1.v`, Lemma `cause_clock`
- `KVSAlg1.v`, Lemma `cause_rec`
- `KVSAlg2.v`, Module Type `KVSAlg2`
- `KVSAlg2.v`, Module `KVSAlg2CauseObl (SyntaxArg: SyntaxPar) <: CauseObl KVSAlg2 SyntaxArg`
- `KVSAlg2.v`, Lemma `CausallyConsistent`
- `KVSAlg2.v`, Lemma `cause_dep`
- `KVSAlg2.v`, Lemma `cause_received_received`
- `KVSAlg2.v`, Lemma `cause_rec`
- `KVSAlg3.v`, Module Type `KVSAlg3`
- `KVSAlg3.v`, Module `KVSAlg3CauseObl (SyntaxArg: SyntaxPar) <: CauseObl KVSAlg3 SyntaxArg`
- `KVSAlg3.v`, Lemma `CausallyConsistent`
- `KVSAlg3.v`, Lemma `cause_clock`
- `KVSAlg3.v`, Lemma `dep_leq_rec`
- `KVSAlg3.v`, Lemma `cause_rec`
- `Clients.v`, Definition `prog_photo_upload`
- `Clients.v`, Definition `prog_lost_ring`
- `ListClient.v`
- `Clients.v`, Lemma `CauseConsistent_Prog1`
- `ReflectiveAbstractSemantics.v`, Definition `CausallyContent`
- `scripts` (directory): The execution scripts described in the section Running Experiments below
- `theories` (directory): The Coq verification framework:
  - `Framework/KVStore.v`: The basic definitions, the semantics, and accompanying lemmas
  - `Framework/ReflectiveAbstractSemantics.v`: The client verification definitions and lemmas
  - `Algorithms/KVSAlg1.v`: The definition and proof of algorithm 1 in the paper
  - `Algorithms/KVSAlg2.v`: The definition and proof of algorithm 2 in the paper
  - `Algorithms/KVSAlg3.v`: The definition and proof of algorithm 3 in the appendix
  - `Algorithms/ExtractAlgorithm.v`: Extraction directives for extracting the algorithms to OCaml
  - `Examples/Clients.v`: Verified client programs
  - `Examples/ListClient.v`: Verified client program
  - `Lib` (directory): General-purpose Coq libraries
- `src` (directory): The OCaml runtime to execute the algorithms:
  - `algorithm.ml`: Key-value store algorithm shared interface
  - `algorithm1.ml`, `algorithm2.ml`, `algorithm3.ml`: Wrappers for the extracted algorithms
  - `benchgen.ml`: Benchmark generation and storing program
  - `benchprog.ml`: Benchmark retrieval program
  - `commonbench.ml`: Common definitions for benchmarks
  - `common.ml`: Common definitions
  - `configuration.ml`: Execution configuration definitions
  - `readConfig.ml`: Configuration retrieval program
  - `runtime.ml`: Execution runtime
  - `launchStore1.ml`, `launchStore2.ml`, `launchStore3.ml`: Launchers for the extracted algorithms
  - `util.ml`: General-purpose OCaml functions
  - `experiment.ml`: Small OCaml programming tests

## Running Experiments

We run with four nodes, called the worker nodes, and a node called the master node that keeps track of the start and end of the runs. The scripts support running with the current terminal either blocked or detached. In the former case, the terminal should stay active for the entire execution time. To avoid this, we use another node called the launcher node: repeating the runs and collecting their results is delegated to it, so the terminal can be closed and the execution results retrieved later from the launcher node. The four workers, the master, and the launcher can all be different nodes. However, to simplify running, the scripts support assigning one host to all of these roles. This is the default setting.
The settings of the nodes can be edited in the file `Settings.txt`. The following should be noted if other machines are used as the running nodes.
The host should have password-less ssh access to the launcher node, and the launcher node should have password-less ssh access to the other nodes. This can be done by copying the public key of the accessing machine to the accessed machine with a command like:

```shell
cat ~/.ssh/id_dsa.pub | ssh -l remoteuser remote.example.com 'cat >> ~/.ssh/authorized_keys'
```
The port numbers 9100, 9101, 9102, and 9103 should be open on worker nodes 1, 2, 3, and 4, respectively. Port number 9099 should be open on the master node.
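These ports only need to accept TCP connections. As a minimal illustration (with hypothetical names; the repository's actual networking lives in `src/runtime.ml`), binding a listener on one of these ports in OCaml looks like this:

```ocaml
(* Minimal sketch, not the repository's networking code (see src/runtime.ml).
   Binds a TCP listening socket on the given port, as each worker
   (ports 9100-9103) and the master (port 9099) must be able to do.
   Requires the unix library. *)
let listen_on (port : int) : Unix.file_descr =
  let sock = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.setsockopt sock Unix.SO_REUSEADDR true;
  Unix.bind sock (Unix.ADDR_INET (Unix.inet_addr_any, port));
  Unix.listen sock 8;
  sock

let () = ignore (listen_on 9100)  (* e.g., worker node 1 *)
```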
To start the run:

```shell
./batchrundetach
```

To check the status of the run:

```shell
./printlauncherout
```

To get the results once the run is finished:

```shell
./fetchresults
```

The results are stored in the file `RemoteAllResults.txt`. See `fetchresults` below for the format of the results.
All of the following files are in the root directory:
- `Settings.txt`:
  - `KeyRange`: The range of keys in the generated benchmarks is from 0 to this number. For our experiments, it is set to 50.
  - `RepeatCount`: The number of times that each experiment is repeated. For our experiments, it is set to 5.
  - `LauncherNode`: The user name and the IP of the launcher node
  - `MasterNode`: The user name and the IP of the master node
  - `WorkerNodes`: The user names and the IPs of the worker nodes
- `batchrun`: This is the place where the experiments are listed. Each call to the script `run` is an experiment. The arguments are the number of worker nodes, the number of operations per node, and the percentage of put operations. This script can be called without using the launcher node; the current terminal is blocked. See `batchrundetach` below for detached execution of the experiments.
- `batchrundetach`: To execute using the launcher node. The current terminal is detached.
- `printlauncherout`: To see the output of the launcher, even while the experiments are being run.
- `printnodesout`: To see the output of the worker nodes.
- `fetchresults`: To get the results. The fetched files are:
  - `RemoteAllResults.txt`: The timing of the replicas
  - `RemoteAllOutputs.txt`: The outputs of the replicas
  - `RemoteLauncherOutput.txt`: The output of the launcher node

  The following example output is for algorithm 2 with 4 worker nodes and 40,000 operations per node with 10 percent puts. It shows two runs. Under each run, the time spent by each of the nodes is shown. We take the maximum of these four numbers as the total processing time (see the sketch after this list).
  ```
  Algorithm: 2
  Server count: 4
  Operation count: 40000
  Put percent: 10
  Run: 1
  1.000000
  3.000000
  1.000000
  1.000000
  Run: 2
  1.000000
  1.000000
  4.000000
  1.000000
  ```
- `clearnodes`: To remove the output and result files and the running processes on all the nodes; this is used to start over.
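For concreteness, here is how the total processing time and throughput of one run could be computed from the numbers above. This is a minimal OCaml sketch with illustrative names (`ops_per_node`, `node_count`); it is not part of the `src` runtime:

```ocaml
(* Minimal sketch, not part of src/: the total processing time of a run is
   the maximum of the per-node times, and throughput is the total number of
   operations divided by that time. *)
let total_time (per_node_times : float list) : float =
  List.fold_left max 0.0 per_node_times

let throughput ~(ops_per_node : int) ~(node_count : int) (times : float list) : float =
  float_of_int (ops_per_node * node_count) /. total_time times

let () =
  (* "Run 2" from the example output above *)
  let times = [1.0; 1.0; 4.0; 1.0] in
  Printf.printf "time: %.2f s, throughput: %.0f ops/s\n"
    (total_time times)
    (throughput ~ops_per_node:40000 ~node_count:4 times)
```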
The goal of our experimental result section was to show that our verification effort can lead to executable code and to compare the performance of the two algorithms. As described in the paper, the experiments were done on a cluster of four worker nodes. Each worker node had an Intel(R) Xeon(R) 2.27GHz CPU with 2GB of RAM and ran Ubuntu Linux 14.04.2 with kernel version 3.13.0-48-generic #80-Ubuntu. The nodes were connected to a gigabit switch. The keys were uniformly selected from 0 to 50 for the benchmarks. Each experiment was repeated 5 times. (The reported numbers are the arithmetic mean of the five runs.) Each node processed 60,000 requests. The put ratio ranged from 10% to 90%.
Here are the contents of the two configuration files, `Settings.txt` and `batchrun` (note: `user` and `ip` should be filled in with specific values).

`Settings.txt`:
```
KeyRange=50
RepeatCount=5
LauncherNode=<user>@<ip>
MasterNode=<user>@<ip>
WorkerNodes=
<user>@<ip>
<user>@<ip>
<user>@<ip>
<user>@<ip>
```
`batchrun`:

```shell
./run 4 60000 10
./run 4 60000 20
./run 4 60000 30
./run 4 60000 40
./run 4 60000 50
./run 4 60000 60
./run 4 60000 70
./run 4 60000 80
./run 4 60000 90
```
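As an aside, the `Settings.txt` format above is simple to parse. The following OCaml sketch shows one way to read it, treating bare lines as additional values of the preceding key (as in the multi-line `WorkerNodes` entry); this is only an illustration, not the repository's actual reader (`src/readConfig.ml`):

```ocaml
(* Illustrative sketch; the repository's actual reader is src/readConfig.ml.
   A line "Key=Value" starts a new entry; a non-empty bare line is treated
   as an additional value for the most recent key. *)
let parse_settings (lines : string list) : (string * string list) list =
  let add_line acc line =
    match String.index_opt line '=' with
    | Some i ->
        let key = String.trim (String.sub line 0 i) in
        let v = String.trim (String.sub line (i + 1) (String.length line - i - 1)) in
        (key, if v = "" then [] else [v]) :: acc
    | None ->
        (match acc with
         | (key, vs) :: rest when String.trim line <> "" ->
             (key, vs @ [String.trim line]) :: rest
         | _ -> acc)
  in
  List.rev (List.fold_left add_line [] lines)

let () =
  let lines = ["KeyRange=50"; "WorkerNodes="; "user@10.0.0.1"; "user@10.0.0.2"] in
  List.iter
    (fun (k, vs) -> Printf.printf "%s -> %s\n" k (String.concat ", " vs))
    (parse_settings lines)
```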
As expected, the throughput of both stores increases as the ratio of get operations increases. The second algorithm shows higher throughput than the first, for two reasons. First, in the first algorithm, the clock function of a node keeps an over-approximation of the node's dependencies, and this over-approximation imposes extra dependencies on updates. The second algorithm does not require any extra dependencies. Hence, compared to the second algorithm, updates in the first can have longer waiting times and the update queues tend to be longer, so traversing the update queue is more time-consuming. Second, the update payload sent and received by the first algorithm contains the clock function. OCaml cannot marshal functions; however, as the clock function has the finite domain of the participating nodes, it can be serialized to and deserialized from a list. Nonetheless, serialization and deserialization on every sent and received message adds performance cost. The payload of the second algorithm, in contrast, consists only of data types that can be directly marshalled, so it incurs no extra marshalling cost.
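To illustrate the marshalling point, the following OCaml sketch shows how a function with a finite domain, such as the clock, can be converted to an association list before sending and rebuilt after receiving. The names here are illustrative, not the repository's API (the actual wrappers are `src/algorithm1.ml` and `src/algorithm2.ml`):

```ocaml
(* Illustrative sketch, not the repository's API: a clock maps node ids to
   timestamps. Since its domain (the participating nodes) is finite, it can
   be serialized as an association list and rebuilt on the receiving side. *)
let nodes = [0; 1; 2; 3]

let clock_to_list (clock : int -> int) : (int * int) list =
  List.map (fun n -> (n, clock n)) nodes

let list_to_clock (assoc : (int * int) list) : int -> int =
  fun n -> List.assoc n assoc

let () =
  let clock n = 10 * n + 1 in                                   (* example clock *)
  let payload = Marshal.to_string (clock_to_list clock) [] in   (* before send  *)
  let clock' = list_to_clock (Marshal.from_string payload 0) in (* after receive *)
  assert (List.for_all (fun n -> clock' n = clock n) nodes)
```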