linhvo closed this issue 7 years ago
I was talking to @linhvo about this and wanted to capture some thoughts here. @jaffee please feel free to chime in.
Some recent discussions have shown a need for understanding performance in terms of three variables: slice width, slice count, and data density. I'm interested in comparing all three of these at once, so I'd like to collect performance data in a "parameter grid". For example, we could benchmark the performance of a big read query on all combinations of:
Varying sliceWidth directly with the benchmark tools is blocked right now, but we should be able to build a benchmark that can handle the other two parameters easily. We can then re-run that single benchmark on a series of servers, for the sake of getting this data sooner.
For simplicity, we can start with, for example,
With the result data, we could produce benchmark tables/graphs with different views, for example query time vs number of slices, with a different line plot for different slice widths.
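As a rough illustration of the parameter grid and the "one line per slice width" view, here is a minimal Go sketch. It is not taken from the existing benchmark tools: `Params`, `Grid`, `Run`, and `SeriesBySliceWidth` are hypothetical names, the slice counts and densities in `main` are placeholder values, and the no-op query stands in for a real read query. Slice width is treated as fixed per run, since it cannot be varied from the benchmark tools yet.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Params is one cell of the parameter grid (hypothetical type).
type Params struct {
	SliceWidth uint64  // fixed per server for now
	SliceCount int     // number of slices the data spans
	Density    float64 // fraction of bits set
}

// Result pairs a grid cell with the measured query time.
type Result struct {
	Params
	QueryTime time.Duration
}

// Grid returns every combination of slice count and density
// for a single slice width.
func Grid(sliceWidth uint64, sliceCounts []int, densities []float64) []Params {
	grid := make([]Params, 0, len(sliceCounts)*len(densities))
	for _, n := range sliceCounts {
		for _, d := range densities {
			grid = append(grid, Params{SliceWidth: sliceWidth, SliceCount: n, Density: d})
		}
	}
	return grid
}

// Run times the supplied query once for each grid cell.
func Run(grid []Params, query func(Params)) []Result {
	results := make([]Result, 0, len(grid))
	for _, p := range grid {
		start := time.Now()
		query(p)
		results = append(results, Result{Params: p, QueryTime: time.Since(start)})
	}
	return results
}

// SeriesBySliceWidth groups results into one series per slice width,
// each sorted by slice count, ready to plot as separate lines.
func SeriesBySliceWidth(results []Result) map[uint64][]Result {
	series := make(map[uint64][]Result)
	for _, r := range results {
		series[r.SliceWidth] = append(series[r.SliceWidth], r)
	}
	for _, s := range series {
		sort.SliceStable(s, func(i, j int) bool { return s[i].SliceCount < s[j].SliceCount })
	}
	return series
}

func main() {
	var results []Result
	// Placeholder grid values; in practice each width would run on its own server.
	for _, w := range []uint64{1 << 12, 1 << 16, 1 << 20, 1 << 22} {
		grid := Grid(w, []int{1, 10, 100}, []float64{0.01, 0.1})
		results = append(results, Run(grid, func(Params) {})...) // no-op stand-in query
	}
	for w, s := range SeriesBySliceWidth(results) {
		fmt.Printf("sliceWidth=%d points=%d\n", w, len(s))
	}
}
```

Keeping the slice width inside each result means the per-server runs can be concatenated later and still be split into separate lines for plotting.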
@jaffee commented on Thu Dec 01 2016
- Load the same data into multiple 1-node clusters, each with a different slice width (2^12, 2^16, 2^20, 2^22).
- Perform the same set of queries on each cluster and compare times.
- Run the same queries with an increasing number of cores (GOMAXPROCS).
@codysoyland commented on Fri Dec 02 2016
Depends on #171 in order to rebuild cluster.