At the moment the primary way to inspect microbenchmark regressions from the microbenchmark weekly job [1] is to look at the Google Sheets produced by the job. However, these sheets do not always contain enough information to make an informed decision about a regression.
To improve on this, we should find a way to display microbenchmark history alongside the results. There are a few options:
1. Something similar to roachperf [2]
2. Export to an existing system like Prometheus / Grafana / DataDog
3. Adopt a 3rd-party example like Golang's perf / microbenchmarks dashboard [3]
Regarding option 1, there are plans to eventually deprecate roachperf. Option 2 was considered at one point, but tools like Grafana and DataDog are not well suited to granular, single-point metrics compared against a baseline, nor do they offer good ways of consolidating regressions and displaying the information in an easily consumable form.
That makes option 3 currently the most attractive, and since much of its implementation is publicly available, it is a good starting point.
Golang's perf dashboards use a combination of cloud storage and InfluxDB (a timeseries database) to store their metrics. Cloud storage, combined with a SQL database that serves as a data index, stores the source of truth for the comparisons (the raw logs from the microbenchmarks). The comparisons themselves (processed results comparing two revisions) are stored in InfluxDB to make them readily available for display on the dashboards.
This issue mainly deals with the way in which we plan to store the metrics in InfluxDB for our own purposes. The format in which Golang currently inserts microbenchmark data points is fairly simple and something that we can reuse.
| Field | Description |
| --- | --- |
| low | lower bound of the summary |
| center | center of the summary |
| high | upper bound of the summary |
| upload-time | timestamp of the run |
| baseline-commit | latest stable version of cockroach |
| experiment-commit | revision whose performance is evaluated against the baseline |
| benchmarks-commit | revision of the microbenchmark framework used (usually the same as experiment) |
| Tag | Description |
| --- | --- |
| name | name of the microbenchmark |
| unit | unit of performance measured (e.g., sec/op) |
| pkg | package the benchmark is from |
| repository | cockroach |
| branch | branch the benchmarks are from |
| goarch | machine architecture |
| goos | machine operating system |
| machine-type | the cloud vendor's type name for the machine (e.g., n2-standard-32) |
InfluxDB supports both fields and tags for a datapoint. Fields are not indexed, whereas tags are. Tags are useful for searching and filtering results.
A datapoint (measurement) will typically contain the commit SHAs for both the experiment and the baseline, where the experiment is the version being evaluated and the baseline is the stable version to evaluate against. In addition to storing the comparison, we also capture metadata about the conditions and environment in which the microbenchmark was run.
In order to be compatible with Golang's dashboard, it is necessary to consider the queries that will be run against InfluxDB [4]. We can still optionally add our own metadata or other information to the datapoints written to InfluxDB.
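For a sense of what such queries look like, a dashboard fetching the history of one benchmark over this schema might issue a Flux query along these lines; the bucket name, measurement name, and filter values here are assumptions for illustration, not the dashboard's actual queries:

```flux
from(bucket: "benchmarks")
  |> range(start: -90d)
  |> filter(fn: (r) => r._measurement == "benchmark-result")
  |> filter(fn: (r) => r.name == "BenchmarkSysbench" and r.unit == "sec/op")
  |> filter(fn: (r) => r.branch == "master" and r.goos == "linux")
  |> filter(fn: (r) => r._field == "center")
  |> sort(columns: ["_time"])
```

Because only tags are indexed, any attribute the dashboard filters on (name, unit, branch, and so on) needs to be a tag rather than a field.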
[1] https://github.com/cockroachdb/cockroach/blob/master/build/teamcity/cockroach/nightlies/microbenchmark_weekly.sh
[2] https://roachperf.crdb.dev/
[3] https://perf.golang.org/dashboard/
[4] https://github.com/golang/build/blob/master/perf/app/dashboard.go#L300
Jira issue: CRDB-41257