silversixpence-crypto / dapol

DAPOL+ Proof of Liabilities using Bulletproofs and Sparse Merkle trees

Step 1: simple benchmarks using the current API #100

Closed: Stentonian closed this issue 9 months ago

Stentonian commented 10 months ago

Overview

There are 3 variables to play with:

- max number of threads
- tree height
- number of users

There are 3 metrics we care about for each of tree building, proof generation, and proof verification:

- compute time
- peak memory usage
- size of the serialized output (where applicable)

We want to know how the metrics change with each of the variables. Picture a 3-space graph for each metric, where the axes are the 3 variables. Building these 3-space graphs would be awesome, but is not necessary.
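
As a sketch (none of this is dapol API): the set of benchmark configurations is just the Cartesian product of the 3 variables, with the thread and user-count ranges taken from the section below. Each tuple is one point on the 3-space graphs.

    // Hypothetical helper, not part of the repo: enumerate the full
    // benchmark grid. `thread_counts` and `user_counts` are the ranges
    // described in the "Ranges for the variables" section below.
    fn bench_configs(thread_counts: &[usize], user_counts: &[u64]) -> Vec<(usize, u8, u64)> {
        let heights: [u8; 5] = [4, 8, 16, 32, 64];
        let mut configs = Vec::new();
        for &threads in thread_counts {
            for &height in &heights {
                for &num_users in user_counts {
                    // One (threads, height, num_users) triple is one point
                    // in the 3-space; every metric is measured at each point.
                    configs.push((threads, height, num_users));
                }
            }
        }
        configs
    }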

We need to run the above test suite on different machines.

Ranges for the variables

Max number of threads

This number is limited by the available parallelism of the machine; we don't want to spawn more threads than the machine can actually run in parallel. The highest this number should go is the value computed here:

        // This is used to determine the number of threads to spawn in the
        // multi-threaded builder.
        crate::utils::DEFAULT_PARALLELISM_APPROX.with(|opt| {
            *opt.borrow_mut() = std::thread::available_parallelism()
                .map_err(|err| {
                    error!("Problem accessing machine parallelism: {}", err);
                    err
                })
                .map_or(None, |par| Some(par.get() as u8))
        });

https://github.com/silversixpence-crypto/dapol/blob/main/src/utils.rs#L158

The range should be something like (using Python slice notation, start:stop:step) [1 : DEFAULT_PARALLELISM_APPROX : DEFAULT_PARALLELISM_APPROX / 10]. The divisor can be changed to suit the machine, e.g. if you only have 8 cores then just step up in increments of 1, but if you have 256 cores maybe step up in increments of 50.
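
A rough sketch of generating that range in Rust (a hypothetical helper, not part of the repo, using std::thread::available_parallelism directly as a stand-in for DEFAULT_PARALLELISM_APPROX):

    fn thread_counts() -> Vec<usize> {
        let max_par = std::thread::available_parallelism()
            .map(|p| p.get())
            .unwrap_or(1);
        // Step in increments of roughly a tenth of the available
        // parallelism, but never less than 1 (so an 8-core machine steps
        // by 1, a 256-core machine by ~25).
        let step = (max_par / 10).max(1);
        let mut counts: Vec<usize> = (1..=max_par).step_by(step).collect();
        // Always include the maximum itself as the final data point.
        if counts.last() != Some(&max_par) {
            counts.push(max_par);
        }
        counts
    }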

Tree height

[4, 8, 16, 32, 64]

Number of users

This number is the biggest contributor to compute time and memory usage, so the same range cannot be used for all machines: if the tree build algorithm requires 64GB of memory and the machine doesn't have that much, the program will crash. The benchmarking tests should take this into account and not attempt tests with a number of users that will cause an out-of-memory error.

This is the full range of number of users (using Python notation):

[
  250_000_000,
  125_000_000,
  100_000_000,
  *[n * 10_000_000 for n in range(1, 10, 2)],
  *[n * 1_000_000 for n in range(1, 10)],
  *[n * 100_000 for n in range(1, 10)],
  *[n * 10_000 for n in range(1, 10)],
]
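
The same range expressed as a Rust helper (hypothetical, not in the repo), flattened into a single list:

    fn num_users_range() -> Vec<u64> {
        let mut range: Vec<u64> = vec![250_000_000, 125_000_000, 100_000_000];
        range.extend((1..10u64).step_by(2).map(|n| n * 10_000_000)); // 10M..90M, odd multiples
        range.extend((1..10u64).map(|n| n * 1_000_000)); // 1M..9M
        range.extend((1..10u64).map(|n| n * 100_000)); // 100k..900k
        range.extend((1..10u64).map(|n| n * 10_000)); // 10k..90k
        range
    }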

Note on compute time and memory limits

As with the memory limits for the number of users, the same out-of-memory check should be done for the other variables. And a similar check should be done for compute time: we don't want to run a test that takes more than 4 hours.
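
A minimal sketch of such a guard (all names and the linear memory model here are assumptions, not dapol APIs; the per-user constant should be extrapolated from measurements of small runs):

    // Upper bound on any single benchmark run, per the note above.
    const MAX_BENCH_DURATION_SECS: u64 = 4 * 60 * 60;

    // ASSUMPTION: crude per-user memory estimate; measure small runs
    // and extrapolate to get a realistic figure.
    const EST_BYTES_PER_USER: u64 = 1_000;

    fn should_run(num_users: u64, available_mem_bytes: u64, est_duration_secs: u64) -> bool {
        let est_mem_bytes = num_users.saturating_mul(EST_BYTES_PER_USER);
        est_mem_bytes < available_mem_bytes && est_duration_secs < MAX_BENCH_DURATION_SECS
    }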