grafana / metrictank

metrics2.0 based, multi-tenant timeseries store for Graphite and friends.

"standard" benchmark #440

Closed: Dieterbe closed this issue 4 years ago

Dieterbe commented 7 years ago

some thoughts:

we should take snapshots of the standard MT dashboard plus a cpu usage graph (via collectd/snap), so we can easily compare cpu, memory, golang GC, and vegeta output across runs
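As a complement to the dashboard snapshots, the Go runtime's own counters could be recorded in-process before and after each run, which makes the memory/GC comparison easy to diff. A minimal sketch, assuming we only want heap usage, GC cycle count and total GC pause per run (the field selection is illustrative, not metrictank's actual instrumentation):

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// snapshot captures the runtime stats we care about when comparing runs:
// heap in use, cumulative GC pause time and number of completed GC cycles.
type snapshot struct {
	when      time.Time
	heapInUse uint64
	pauseNs   uint64
	numGC     uint32
}

func takeSnapshot() snapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return snapshot{time.Now(), m.HeapInuse, m.PauseTotalNs, m.NumGC}
}

func main() {
	before := takeSnapshot()
	// ... run the fixed-length benchmark here ...
	after := takeSnapshot()

	fmt.Printf("heap in use: %d -> %d bytes\n", before.heapInUse, after.heapInUse)
	fmt.Printf("GC cycles: %d, GC pause: %s\n",
		after.numGC-before.numGC,
		time.Duration(after.pauseNs-before.pauseNs))
}
```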

the other remaining factor that can currently vary randomly between bench runs, as far as I'm aware, is our own GC routine which frees up metrics. we could set gc-interval to 0 to disable it (but that's not realistic). looking at the code, it always runs 1 min after whatever interval boundary the setting defines. this means that to incorporate it, we could set the interval to one minute.
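To make the timing concrete, here is a rough sketch of that kind of interval-aligned scheduling, i.e. firing at a fixed offset past every interval boundary (the interval/offset values and function names are assumptions for illustration, not metrictank's actual code):

```go
package main

import (
	"fmt"
	"time"
)

// runAligned calls fn at a fixed offset past every interval boundary,
// e.g. interval=1h, offset=1m fires at xx:01 of every hour, and
// interval=1m, offset=1m fires once per minute. The firing times depend
// only on the wall clock, not on when the process (or benchmark) started.
func runAligned(interval, offset time.Duration, fn func()) {
	for {
		now := time.Now()
		next := now.Truncate(interval).Add(offset)
		if !next.After(now) {
			next = next.Add(interval)
		}
		time.Sleep(time.Until(next))
		fn()
	}
}

func main() {
	runAligned(time.Minute, time.Minute, func() {
		fmt.Println("gc run at", time.Now().Format("15:04:05"))
	})
}
```

With the interval set to one minute, such a routine fires once per minute regardless of start time, which is what makes a 2 minute benchmark window contain a fixed number of runs.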

if we run benchmarks for exactly 2 minutes, then the number of persist runs as well as GC routine runs should be fixed at two. they may happen at different points during the benchmark depending on when we start it, but I don't think that's a big problem.
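On the load side, pinning the run to exactly 2 minutes at a constant rate is straightforward with vegeta's Go library; a minimal sketch (v12-style API; the target URL, port, rate and metric name are assumptions):

```go
package main

import (
	"fmt"
	"time"

	vegeta "github.com/tsenart/vegeta/v12/lib"
)

func main() {
	// exactly 2 minutes at a constant rate, so every run sees the same
	// number of persist and GC cycles (given a 1 minute interval).
	rate := vegeta.Rate{Freq: 100, Per: time.Second}
	duration := 2 * time.Minute

	// illustrative render query; a real benchmark would use a
	// representative set of targets.
	targeter := vegeta.NewStaticTargeter(vegeta.Target{
		Method: "GET",
		URL:    "http://localhost:6060/render?target=some.id.of.a.metric.1&from=-1h",
	})

	attacker := vegeta.NewAttacker()
	var metrics vegeta.Metrics
	for res := range attacker.Attack(targeter, rate, duration, "standard-bench") {
		metrics.Add(res)
	}
	metrics.Close()

	fmt.Printf("requests: %d  success: %.2f%%  p99: %s\n",
		metrics.Requests, metrics.Success*100, metrics.Latencies.P99)
}
```

The report fields printed at the end (request count, success rate, latency percentiles) are the vegeta output we would snapshot alongside the dashboards.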

replay commented 7 years ago

Having a standardized benchmark sounds great, and that all makes sense. What concerns me at the moment is that, when it comes to the chunk cache, there are quite a few more factors that matter. For example: is the queried data bigger than the chunk cache? If so, metrictank needs to evict entries from the cache, which might slow the whole cache down due to lock contention. Or: to get a realistic workload, shouldn't we occasionally query metrics that have not been queried (and cached) yet? I'm not sure what the best way is to add these factors to a standardized test.
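One possible way to fold the cache-miss factor into a standardized run would be a targeter that mostly re-queries a fixed "hot" set of series (likely cached) but occasionally asks for a series that has never been requested before. A sketch under that assumption (the series names, miss ratio and URL layout are all illustrative, not an agreed design):

```go
package main

import (
	"fmt"
	"math/rand"
	"sync/atomic"

	vegeta "github.com/tsenart/vegeta/v12/lib"
)

// mixedTargeter returns a vegeta Targeter that picks a previously queried
// ("hot", likely cached) series most of the time, and with probability
// missRatio asks for a series that was never requested before, forcing a
// chunk cache miss (and, if the cache is full, an eviction).
func mixedTargeter(baseURL string, hot []string, missRatio float64) vegeta.Targeter {
	var cold uint64
	return func(tgt *vegeta.Target) error {
		var series string
		if rand.Float64() < missRatio {
			n := atomic.AddUint64(&cold, 1)
			series = fmt.Sprintf("bench.cold.%d", n) // never seen before -> cache miss
		} else {
			series = hot[rand.Intn(len(hot))]
		}
		tgt.Method = "GET"
		tgt.URL = fmt.Sprintf("%s/render?target=%s&from=-6h", baseURL, series)
		return nil
	}
}

func main() {
	hot := []string{"some.id.of.a.metric.1", "some.id.of.a.metric.2"}
	t := mixedTargeter("http://localhost:6060", hot, 0.1)

	// print a few sample targets to show the hot/cold mix
	for i := 0; i < 5; i++ {
		var tgt vegeta.Target
		if err := t(&tgt); err != nil {
			panic(err)
		}
		fmt.Println(tgt.URL)
	}
}
```

Running the same attack once with the chunk cache sized above the hot working set and once with it sized below would also exercise the eviction / lock contention case mentioned above.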

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.