FeatureBaseDB / tools

Tools for development and ops
BSD 3-Clause "New" or "Revised" License

Benchmarking Overview #10

Closed jaffee closed 7 years ago

jaffee commented 7 years ago

@jaffee commented on Thu Dec 08 2016

We've got a pretty good amount of benchmarking code, but there are still quite a number of open questions about how it's all going to get tied together. I'd like to keep an overview in this ticket, break out the individual bits into other tickets, and reference them here. Please comment and we can make edits as we find consensus.

The (likely to be renamed) bspawn command is the main entry point for running benchmarks. It handles cluster creation+teardown, agent creation+teardown, running benchmarks, aggregating results from all the agents (and the cluster if necessary), and storing or publishing them.

Rough Order of Operations for bspawn:

  1. Generate a run_uuid. The run_uuid will be associated with the output of all the different pieces of this benchmark run. Each agent's output, the output of cluster creation, agent creation, and any stats from the cluster may be stored separately, but as long as they include the run_uuid, it will be possible to correlate all data from a single run. #204
  2. Create the cluster. Cluster creation is managed by the pilosactl create command, but its output should include the run_uuid, the configuration parameters given, and information about the actual cluster created, e.g. hostnames and detailed hardware info. Information about the version of pilosa running and any build parameters should also be included. #171
  3. Create agents. This is similar to cluster creation, and should report similar information. #205
  4. Run the various benchmarks specified in the bspawn config file. There may be a way to specify whether groups of benchmarks should be run in series or in parallel. The format #168 for the output of benchmarks should be specified well enough that it can be consumed automatically by further tools (e.g. visualization, anomaly detection, alerting).
  5. Store all the output somewhere - this will probably have some configurability. #203
  6. Tear down cluster, cluster infrastructure, and agent infrastructure - with the option to leave any of it in place (as an optimization for further use, to verify data, etc.)

Some general notes:


@codysoyland commented on Tue Mar 14 2017

Hey @jaffee - should we move this overview doc to the tools repo? Thanks!

jaffee commented 7 years ago

good call @codysoyland - done!