SBV benchmark suite, phase 1 with puzzle benchmarks

What

Pull request for phase 1 sbv benchmarks as discussed in #484. This adds basic benchmarking utility to sbv using criterion. This doesn't add any new dependencies and provides an api for creating new benchmarks, see BenchSuite.Puzzles for examples.

How

Run with cabal bench:

$ cabal bench
benchmarking Puzzles/Coins/standalone
time                 179.8 ms   (168.0 ms .. 193.0 ms)
                     0.977 R²   (0.964 R² .. 0.997 R²)
mean                 171.8 ms   (169.6 ms .. 175.9 ms)
std dev              10.75 ms   (6.284 ms .. 17.10 ms)
variance introduced by outliers: 42% (moderately inflated)
...

or, if desired you can also output the results to a csv file with:

$ cabal build
$ ./dist/build/SBVBench/SBVBench --csv "test.csv"
$ cat test.csv 
Name,Mean,MeanLB,MeanUB,Stddev,StddevLB,StddevUB
Puzzles/Coins/standalone,0.1717946026770784,0.16961267840212116,0.17589632020186868,1.0745479995855869e-2,6.284256037335621e-3,1.7098143311250525e-2

Design

I've chosen to keep the overhead benchmarks but it is trivial to just run the sbv benchmark without the standalone one, see Overhead.SBVOverhead.mkOverheadBenchmark for details. The framework depends on RankNTypes and ExistentialQuantification to hide the type information for the problems to be solved and their return type. This is useful for a couple of reasons: first it keeps the benchmarks as close to the problem definition as possible while still allowing a user to operate on the benchmark later. For example, I define all of the puzzle benchmarks, many of which require Data.SBV.allSatWith for their problem definition, but then I change these to Data.SBV.satWith in SBVBench.hs because Data.SBV.allSatWith leads to a lot of variance and outliers in criterion. Secondly, it allows SBVBench.hs, the bench suite entry point, to use heterogeneous lists which keeps the interface with criterion clean.

Please review and get back to me. All comments and feedback welcome!

Jeff

LeventErkok / sbv

SBV benchmark suite, phase 1 with puzzle benchmarks #489

What

How

Design