FeatureBaseDB / tools

Tools for development and ops
BSD 3-Clause "New" or "Revised" License

Benchmark the effect of slice width on performance #35

Closed linhvo closed 7 years ago

linhvo commented 7 years ago

@jaffee commented on Thu Dec 01 2016

- Load the same data into multiple 1-node clusters, each with a different slice width (2^12, 2^16, 2^20, 2^22).
- Perform the same set of queries on each cluster and compare times.
- Run the same queries with an increasing number of cores (GOMAXPROCS).


@codysoyland commented on Fri Dec 02 2016

Depends on #171 in order to rebuild the cluster.

alanbernstein commented 7 years ago

I was talking to @linhvo about this and wanted to capture some thoughts here. @jaffee please feel free to chime in.

Some recent discussions have shown a need for understanding performance in terms of three variables: slice width, slice count, and data density. I'm interested in comparing all three of these at once, so I'd like to collect performance data in a "parameter grid". For example, we could benchmark the performance of a big read query on all combinations of:

Varying sliceWidth directly with the benchmark tools is blocked right now, but we should be able to build a benchmark that can handle the other two parameters easily. We can then re-run that single benchmark on a series of servers, for the sake of getting this data sooner.
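One way to enumerate such a parameter grid is sketched below. The `Params` type and `Grid` function are hypothetical, not taken from the benchmark tools. The slice widths come from the original proposal; the slice counts and densities are placeholder values chosen only for illustration.

```go
package main

import "fmt"

// Params is one cell of the benchmark grid. The field names are
// illustrative and do not come from the benchmark tools.
type Params struct {
	SliceWidth uint64
	SliceCount int
	Density    float64
}

// Grid returns every combination of the given values (a Cartesian product),
// with slice width varying slowest and density varying fastest.
func Grid(widths []uint64, counts []int, densities []float64) []Params {
	out := make([]Params, 0, len(widths)*len(counts)*len(densities))
	for _, w := range widths {
		for _, c := range counts {
			for _, d := range densities {
				out = append(out, Params{SliceWidth: w, SliceCount: c, Density: d})
			}
		}
	}
	return out
}

func main() {
	widths := []uint64{1 << 12, 1 << 16, 1 << 20, 1 << 22} // from the original proposal
	counts := []int{1, 4, 16}                              // placeholder values
	densities := []float64{0.01, 0.1}                      // placeholder values

	for _, p := range Grid(widths, counts, densities) {
		fmt.Printf("width=%d slices=%d density=%g\n", p.SliceWidth, p.SliceCount, p.Density)
	}
}
```

While slice width cannot be varied by the tools, the same function could be called with a single width per server, so that each server covers the slice count and density combinations for its own width.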

For simplicity, we can start with, for example,

With the result data, we could produce benchmark tables and graphs with different views: for example, query time vs. number of slices, with a separate line for each slice width.
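A minimal sketch of preparing the data for that kind of plot, assuming each measurement records a slice width, a slice count, and a query time. The `Result` type and `SeriesByWidth` function are hypothetical, and the numbers in `main` are made up purely to show the shape of the output.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Result is one benchmark measurement. The fields are hypothetical.
type Result struct {
	SliceWidth uint64
	SliceCount int
	QueryTime  time.Duration
}

// SeriesByWidth groups results by slice width and sorts each group by
// slice count, so that each group can be drawn as one line on a plot of
// query time vs. number of slices.
func SeriesByWidth(results []Result) map[uint64][]Result {
	series := make(map[uint64][]Result)
	for _, r := range results {
		series[r.SliceWidth] = append(series[r.SliceWidth], r)
	}
	for _, s := range series {
		// Sorting in place reorders the backing array shared with the map entry.
		sort.Slice(s, func(i, j int) bool { return s[i].SliceCount < s[j].SliceCount })
	}
	return series
}

func main() {
	// Made-up measurements, not real benchmark data.
	results := []Result{
		{SliceWidth: 1 << 16, SliceCount: 4, QueryTime: 30 * time.Millisecond},
		{SliceWidth: 1 << 20, SliceCount: 1, QueryTime: 12 * time.Millisecond},
		{SliceWidth: 1 << 16, SliceCount: 1, QueryTime: 10 * time.Millisecond},
	}
	for width, s := range SeriesByWidth(results) {
		fmt.Printf("slice width %d:\n", width)
		for _, r := range s {
			fmt.Printf("  slices=%d time=%v\n", r.SliceCount, r.QueryTime)
		}
	}
}
```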