
BDD Benchmarking Suite

This is a collection of benchmarks for Binary Decision Diagrams (BDDs) [Bryant1986] and some of their variants. These are useful to survey the strengths and weaknesses of different implementations and to guide their developers. To this end, each benchmark is implemented with C++ templates such that it is agnostic of the BDD package used.

This project has been developed at the Logic and Semantics group at Aarhus University.

Decision Diagrams

Our benchmarks target the following types of decision diagrams: Binary Decision Diagrams (BDDs), BDDs with complement edges (BCDDs), Multi-Terminal BDDs (MTBDDs), and Zero-suppressed Decision Diagrams (ZDDs).

BDD Packages

The benchmarks are implemented using C++ templates to make them agnostic of the BDD package used. To this end, each BDD package is wrapped in an adapter (sketched below) that is fully inlined at compile time. Currently, we support the following BDD packages.

                Adiar   BuDDy   CAL     CUDD    LibBDD  OxiDD   Sylvan
Language        C++     C       C       C       Rust    Rust    C
BDD             ✓       ✓       ✓       (✓)     ✓       ✓       (✓)
BCDD                                    ✓               ✓       ✓
MTBDD                                   ✓               ✓       ✓
ZDD             ✓                       ✓               ✓
Thread Safe                                     ✓       ✓       ✓
Multi-threaded                                          ✓       ✓
Reordering              ✓       (✓)     ✓                       (✓)
Shared Nodes            ✓       ✓       ✓               ✓       ✓
Ext. Memory     ✓

Entries in parentheses mark only partial or indirect support.
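For illustration, the adapter idea can be sketched as follows. This is only a sketch with made-up names (Adapter, top, var, satcount); it is not the repository's actual adapter interface:

#include <cstdint>

// A benchmark is written once as a template; the concrete BDD package is
// chosen at compile time, so all adapter calls can be fully inlined.
template <typename Adapter>
std::uint64_t build_and_count(const int vars)
{
  Adapter a(vars);                 // construct the package's manager
  auto f = a.top();                // the constant 'true'
  for (int i = 0; i + 1 < vars; i += 2)
    f &= a.var(i) | a.var(i + 1);  // build some formula through the adapter
  return a.satcount(f);            // count the number of satisfying assignments
}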

We hope to extend the number of supported packages. See [issue #12](https://github.com/SSoelvsten/bdd-benchmark/issues/12) for a list of BDD packages we would like to add to this set of benchmarks. Any help to do so is very much appreciated.

Dependencies

All packages are built with CMake, which makes compilation and linking very simple. One merely has to initialize all submodules (recursively) using the following command:

git submodule update --init --recursive

To build, you need CMake and a C++ compiler of your choice. On Ubuntu, both can be obtained with the following command:

apt install cmake g++

The Picotrav benchmark requires GNU Bison and Flex. On Ubuntu, these can be installed with:

apt install bison flex

The McNet benchmark requires the Boost Library. On Ubuntu, it can be installed as follows:

apt install libboost-all-dev

Adiar

Adiar also has a transitive dependency on the Boost Library. On Ubuntu, it can be installed as follows:

apt install libboost-all-dev

LibBDD and OxiDD

These libraries are implemented in Rust and interact with C/C++ via an FFI. Hence, one needs to use cargo, which in turn requires an internet connection to download the dependency crates during the first build ("crate" is the Rust term for a package).

There are different ways to install Rust. On all platforms, it is possible to use the official installer script for rustup, the toolchain manager; some OSes also provide a rustup or rustup-init package. With rustup installed, you can get the latest stable Rust toolchain (including cargo) via rustup default stable. Alternatively, some OSes provide a fairly recent Rust package directly. On Ubuntu, you can simply install cargo via:

apt install cargo

Furthermore, the libraries depend on cbindgen to generate a C header from the Rust code. cbindgen can be installed either through cargo or your OS package manager. Note, however, that a relatively recent version of cbindgen is required (we tested 0.26.0, which is not provided by Ubuntu versions prior to 24.04).

cargo install --force cbindgen

[!NOTE] If installing cbindgen through cargo, remember to update your PATH. Otherwise, CMake will abort with a "Could not find toolchain ''" error. You can obtain the path from the message printed during cargo install, which states "warning: be sure to add ... to your PATH".

Usage

When all dependencies have been resolved (see above), you can build and run the benchmarks as follows.

Build

To configure the CMake project, run:

cmake -B build

The default settings can be changed by passing parameters to CMake with -D<option>=<value>; this is also how features of the benchmarks and the BDD packages can be turned on and off.

After configuring, you can build the benchmarks:

cmake --build build

Run

After building, the build/src folder contains one executable for every supported combination of BDD library, benchmark, and DD kind. Not all libraries support every kind of DD (see above), and not all benchmarks are available for both BDDs/BCDDs and ZDDs.

The executables follow a <Library>_<Benchmark>_<DD> naming scheme (each of the three components in lowercase). All executables share a common command-line interface to configure the BDD library in question.

For example, you can run the Queens benchmark on Sylvan with 1024 MiB of memory and 4 threads as follows:

./build/src/sylvan_queens_bcdd -M 1024 -P 4

Furthermore, each benchmark has its own options; see -h or the Benchmarks section below for details.

Benchmarks

Apply

Based on [Pastva2023], this benchmark loads two (or more) decision diagrams stored in a binary format (as they are serialized by the LibBDD library) and then combines them with one (or more) Apply operations.

The benchmark can be configured with several options (see -h); for example:

./build/src/${LIB}_apply_${KIND} -f benchmarks/apply/x0.bdd -f benchmarks/apply/x1.bdd -o and

CNF Solver

This benchmark loads a DIMACS CNF file, constructs its clauses, and conjoins them all. The benchmark uses the variable order specified in the CNF file and interprets the clause list as an approximately balanced tree, e.g. (c0 c1) (c2 (c3 c4)). Hence, it makes sense to preprocess raw CNF files with external tools to infer good variable and clause orders.

./build/src/${LIB}_cnf_${KIND} -f benchmarks/cnf/sample.cnf
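The balanced bracketing can be sketched as a divide-and-conquer fold. This is only a sketch (assuming some diagram type DD with an operator&), not the repository's actual code:

#include <cstddef>
#include <vector>

// Conjoin the clause DDs as an approximately balanced tree instead of
// left-to-right; this tends to keep the intermediate diagrams smaller.
template <typename DD>
DD conjoin_balanced(const std::vector<DD>& cs, std::size_t lo, std::size_t hi)
{
  if (hi - lo == 1) return cs[lo];
  const std::size_t mid = lo + (hi - lo) / 2;
  return conjoin_balanced(cs, lo, mid) & conjoin_balanced(cs, mid, hi);
}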

Game Of Life

Solves the following problem:

Given N1 and N2, how many Gardens of Eden of size N1xN2 exist in Conway's Game of Life?

The benchmark can be configured with several options (see -h).

All symmetries use a variable order where the pre/post variables are zipped and follow a row-major ordering.

./build/src/${LIB}_game-of-life_${KIND} -n 5 -n 4 -o rotate-180
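For illustration, such a zipped row-major order corresponds to an indexing like the one below. This is only a sketch; the repository's exact numbering is an assumption:

// The pre-variable of cell (r,c) gets an even index and its post-variable the
// odd index right after it; the cells themselves are enumerated row-major.
int var_id(const int r, const int c, const bool post, const int cols)
{
  return 2 * (r * cols + c) + (post ? 1 : 0);
}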

Hamiltonian Cycle

Solves the following problem:

Given N1 and N2, how many Hamiltonian cycles exist on a grid graph of size N1xN2?

The benchmark can be configured with several options (see -h).

The time encoding is designed with ZDDs in mind, whereas the binary encoding is designed for BDDs. That is, using the time encoding with BDDs does not give great, i.e., small and fast, results.

./build/src/${LIB}_hamiltonian_${KIND} -n 6 -n 5

McNet

[!IMPORTANT] ZDDs are not supported for this benchmark (yet)!

This simple Model Checker for Nets (pronounced "Mac Net", "magnet", or maybe "Monet"?) provides algorithms for the exploration of transition systems in various formats.

./build/src/${LIB}_mcnet_${KIND} -f benchmarks/mcnet/split.pnml

Picotrav

This benchmark is a small recreation of the Nanotrav example provided with the CUDD library [Somenzi2015]. Given a hierarchical circuit in (a subset of) the Berkeley Logic Interchange Format (BLIF), a BDD is created for every net; BDDs for intermediate nets are dereferenced after their last use. If two .blif files are given, then BDDs for both are constructed, and every output net is compared for equality.

The benchmark can be configured with several options (see -h); for example:

./build/src/${LIB}_picotrav_${KIND} -f benchmarks/picotrav/not_a.blif -f benchmarks/picotrav/not_b.blif -o level_df
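A sketch of this per-net construction is shown below. All names (dd_t, eval_logic, is_last_use) are made up for illustration; the real implementation differs:

#include <string>
#include <unordered_map>
#include <vector>

// Build a DD for each net in topological order and drop the DD of an
// intermediate net as soon as no later net uses it again.
template <typename Adapter, typename Net>
void build_nets(Adapter& a, const std::vector<Net>& topo_order)
{
  std::unordered_map<std::string, typename Adapter::dd_t> cache;
  for (const Net& n : topo_order) {
    cache.emplace(n.name, eval_logic(a, n, cache));  // combine the input DDs
    for (const std::string& in : n.inputs)
      if (is_last_use(topo_order, n, in))            // hypothetical helper
        cache.erase(in);                             // dereference the dead DD
  }
}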

QBF Solver

[!IMPORTANT] ZDDs are not supported for this benchmark (yet)!

This benchmark is a small recreation of Jaco van de Pol's Qubi project. Given a Quantified Boolean Formula (QBF) in the qcir format, the decision diagram representing each gate is computed bottom-up. The outermost quantifier in the prenex is not resolved with BDD operations. Instead, if the decision diagram has not already collapsed to a terminal, a witness or a counter-example is obtained from the diagram.

./build/src/${LIB}_qbf_${KIND} -f benchmarks/qbf/example_a.qcir -o df
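The bottom-up computation can be sketched as follows, assuming the gates are topologically sorted and ignoring the quantifier prefix. All names here are made up for illustration:

#include <vector>

// Compute the DD of every gate from the DDs of its (already computed) inputs.
template <typename Adapter, typename Gate>
void eval_bottom_up(Adapter& a, std::vector<Gate>& gates)
{
  for (Gate& g : gates) {
    auto dd = g.is_and ? a.top() : a.bot();           // neutral element
    for (const Gate* in : g.inputs)
      dd = g.is_and ? (dd & in->dd) : (dd | in->dd);  // combine input DDs
    g.dd = g.negated ? ~dd : dd;                      // store for later gates
  }
}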

Queens

Solves the following problem:

Given N, in how many ways can N queens be placed on an N x N chess board without threatening each other?

Our implementation of this benchmark is based on the description in [Kunkle2010]. Row by row, we construct a BDD that represents whether the row is in a legal state: is at least one queen placed on the row, and is it in no conflict with any other? On the accumulated BDD, we then count the number of satisfying assignments.

The benchmark can be configured with several options (see -h); for example:

./build/src/${LIB}_queens_${KIND} -n 8
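The row-by-row strategy can be sketched as follows. This is only a sketch against a hypothetical adapter API; legal_queen_at is an assumed helper encoding "a queen on (r,c) that attacks no other":

// Conjoin, per row, the disjunction over all legal queen placements in it.
template <typename Adapter>
typename Adapter::dd_t n_queens(Adapter& a, const int N)
{
  auto board = a.top();                   // 'true': no constraints yet
  for (int r = 0; r < N; ++r) {
    auto row = a.bot();                   // 'false': no queen placed yet
    for (int c = 0; c < N; ++c)
      row |= legal_queen_at(a, N, r, c);  // a queen on (r,c), attacking none
    board &= row;                         // row r must be in a legal state
  }
  return board;                           // finally, count a.satcount(board)
}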

RelProd

[!IMPORTANT] ZDDs are not supported for this benchmark (yet)!

Building on top of the Apply benchmark, this benchmark loads a relation and a set of states stored in a binary format (as they are serialized by the LibBDD library) and then combines them with a Relational Product operation in either direction, i.e., computing the one-step successors or predecessors of the given states.

The benchmark can be configured with several options (see -h); for example:

./build/src/${LIB}_relprod_${KIND} -r benchmarks/relprod/self-loop/relation.bdd -s benchmarks/relprod/self-loop/states_all.bdd -o next

Tic-Tac-Toe

Solves the following problem:

Given N, in how many ways can Player 1 place N crosses in a 3D 4x4x4 cube and have a tie, when Player 2 places noughts in all remaining positions?

This benchmark stems from [Kunkle2010]. Here, we keep an accumulated BDD onto which we add, one after the other, the 76 constraints that each line has at least one cross and at least one nought. We add these constraints in a different order than [Kunkle2010], which results in a largest intermediate result that is up to 100 times smaller.

The interesting thing about this benchmark is that, even though the BDDs grow near-exponentially with N, the initial BDD's size grows only polynomially, and the computation always uses 64 variables and 76 Apply operations.

./build/src/${LIB}_tic-tac-toe_${KIND} -n 20
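A sketch of this accumulation is shown below, against a hypothetical adapter API; exactly_n is an assumed helper encoding "exactly N of the 64 cell variables are crosses":

#include <array>
#include <vector>

// Start from "exactly N crosses" and conjoin, one at a time, the 76
// constraints "this line has at least one cross and at least one nought".
template <typename Adapter>
typename Adapter::dd_t ttt_ties(Adapter& a, const int N,
                                const std::vector<std::array<int, 4>>& lines)
{
  auto acc = exactly_n(a, 64, N);                // the initial BDD
  for (const std::array<int, 4>& l : lines) {    // the 76 lines of the cube
    auto cross  = a.var(l[0]) | a.var(l[1]) | a.var(l[2]) | a.var(l[3]);
    auto nought = ~(a.var(l[0]) & a.var(l[1]) & a.var(l[2]) & a.var(l[3]));
    acc &= cross & nought;                       // neither player wins here
  }
  return acc;                                    // ties: a.satcount(acc)
}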

Performance Regression Testing

This collection of benchmarks is not only a good way to compare implementations of BDD packages. Assuming a pull request contains no breaking changes, it also provides a way to test for performance regressions.

To this end, regression.py provides an interactive Python program that fully automates downloading inputs, running the benchmarks, and analysing the timings of two branches. In a non-interactive context, e.g., continuous integration, all arguments can be passed through stdin. For example, the following goes through all the prompts for using the 8-Queens problem to test Adiar with 128 MiB on origin/main (against itself), with no verbose build or runtime output, and with at least 3 and at most 10 samples.

python regression.py <<< "queens
8
adiar
128
origin
main
origin
main
no
no
3
10
"

Samples are collected until both the baseline branch and the branch under test have a standard deviation below 5%, using at least the minimum and at most the maximum number of samples. For smaller instances, samples are collected for at least 30 minutes, whereas for larger instances, the last sampling may start after 90 minutes.

The Python script exits with a non-zero code if there is a statistically relevant slowdown. A short and concise report is placed in regression_{package}.out, which can be posted as a comment on the pull request.

License

The software files in this repository are provided under the MIT License.

Citation

Sadly, we have not yet written a paper on this repository alone (but one is planned). In the meantime, if you use this repository in your work, please cite the initial paper on Adiar:

@InProceedings{soelvsten2022:TACAS,
  title         = {{Adiar}: Binary Decision Diagrams in External Memory},
  author        = {S{\o}lvsten, Steffan Christ
               and van de Pol, Jaco
               and Jakobsen, Anna Blume
               and Thomasen, Mathias Weller Berg},
  year          = {2022},
  booktitle     = {Tools and Algorithms for the Construction and Analysis of Systems},
  editor        = {Fisman, Dana
               and Rosu, Grigore},
  pages         = {295--313},
  numPages      = {19},
  publisher     = {Springer},
  series        = {Lecture Notes in Computer Science},
  volume        = {13244},
  isbn          = {978-3-030-99527-0},
  doi           = {10.1007/978-3-030-99527-0\_16},
}

Publications

The following publications (given in order of publication date) have used this benchmarking suite for their evaluation:

References

[Bryant1986] Randal E. Bryant. "Graph-Based Algorithms for Boolean Function Manipulation". IEEE Transactions on Computers, 35(8):677-691. 1986.

[Kunkle2010] Daniel Kunkle, Vlad Slavici, and Gene Cooperman. "Parallel Disk-Based Computation for Large, Monolithic Binary Decision Diagrams". International Workshop on Parallel Symbolic Computation (PASCO). 2010.

[Pastva2023] Samuel Pastva and Thomas Henzinger. "Binary Decision Diagrams on Modern Hardware". Formal Methods in Computer-Aided Design (FMCAD). 2023.

[Somenzi2015] Fabio Somenzi. "CUDD: CU Decision Diagram package, 3.0". University of Colorado at Boulder. 2015.