grafana / xk6-output-influxdb

k6 extension to output real-time test metrics to an InfluxDB 2.x database.
Apache License 2.0

Benchmarks #3

Closed codebien closed 2 years ago

codebien commented 2 years ago
goos: linux
goarch: amd64
pkg: github.com/grafana/xk6-output-influxdb/pkg/influxdb
cpu: AMD Ryzen 7 4800H with Radeon Graphics
BenchmarkWritePoints-16          192       6670961 ns/op
PASS
ok      github.com/grafana/xk6-output-influxdb/pkg/influxdb 1.930s

With v1 I got ~220 iterations.
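
For context, a minimal sketch of how a write-path benchmark of this shape can be put together with Go's testing package. The actual benchmark in this PR targets a real InfluxDB server; here a local httptest server stands in so the sketch is self-contained, and the endpoint path and payload are illustrative assumptions, not the extension's real code.

package influxdb_test

import (
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"
)

func BenchmarkWritePoints(b *testing.B) {
	// Fake InfluxDB endpoint that accepts every write.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	}))
	defer srv.Close()

	// One line-protocol point per request; illustrative payload only.
	payload := "vus,url=/ value=10 1620000000000000000\n"
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		resp, err := http.Post(srv.URL+"/api/v2/write", "text/plain", strings.NewReader(payload))
		if err != nil {
			b.Fatal(err)
		}
		resp.Body.Close()
	}
}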

codebien commented 2 years ago

LGTM, but I think I'm not fully clear on the goal of the PR: if it's to compare the v2 API to the v1 API, shouldn't the benchmark for v1 be present as well?

@yorugac thanks for your review. The goal is to understand what we can expect in terms of performance from the integration between this extension and a real InfluxDB server. The v1 comparison is additional context, useful for getting a better overview and for knowing whether we are hitting significant performance differences.

We can evaluate whether it makes sense to push (and maintain) the equivalent version of the code for v1, but first I would want a stable and reliable benchmark here.

Regarding reliability, I still have some doubts:

Maybe we should think of a common set of use cases for benchmarking the k6 outputs; that way we could normalize the benchmarks and also compare different outputs against each other.

yorugac commented 2 years ago

Got it, thanks. Yes, checking a larger set of samples might be beneficial too. There is one more thing I noticed: this benchmark basically covers only the InfluxDB API, while batchFromSamples remains unbenchmarked. For example, in the PRW output extension the main performance problem is in the pre-request processing of metrics, and that is also the part we can directly improve on. I.e., we cannot do anything about the limitations imposed by the API of InfluxDB or Prometheus Remote Write, etc.

So perhaps something like this: benchmark our side of the processing, and have some evaluation of what can reasonably be expected with the real server in question?
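
To make the suggestion concrete, a hypothetical sketch of benchmarking only the conversion step, in the spirit of batchFromSamples; the sample type and toLineProtocol function below are illustrative stand-ins, not the extension's real types.

package influxdb_test

import (
	"fmt"
	"strings"
	"testing"
	"time"
)

// sample is an illustrative stand-in for k6's metric sample type.
type sample struct {
	metric string
	tags   map[string]string
	value  float64
	ts     time.Time
}

// toLineProtocol mimics the kind of work batchFromSamples has to do:
// turning a sample into an InfluxDB line-protocol entry.
func toLineProtocol(s sample) string {
	var b strings.Builder
	b.WriteString(s.metric)
	for k, v := range s.tags {
		fmt.Fprintf(&b, ",%s=%s", k, v)
	}
	fmt.Fprintf(&b, " value=%g %d", s.value, s.ts.UnixNano())
	return b.String()
}

func BenchmarkBatchFromSamples(b *testing.B) {
	samples := make([]sample, 1000)
	for i := range samples {
		samples[i] = sample{
			metric: "http_req_duration",
			tags:   map[string]string{"status": "200", "method": "GET"},
			value:  float64(i),
			ts:     time.Now(),
		}
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		for _, s := range samples {
			_ = toLineProtocol(s)
		}
	}
}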

codebien commented 2 years ago

Which is one of the many reasons I think these things should be tested by just running the whole of k6 with a bunch of custom metrics being added each iteration, waiting for k6 to finish, and measuring how many iterations (metric emissions) it was actually possible to send with the given output.

@MStoykov I agree, and that is exactly the attempt with TestOutputThroughput. I did it in Go for just two reasons:

Can we achieve the same using the k6 run --out=... command? I guess we can have a bench.js file in the repo and then report, in a section of the README, the latest value seen when running the master branch. WDYT?

mstoykov commented 2 years ago

Yeah, the idea is that we will likely make a script:

p.s. those were three reasons ;)

yorugac commented 2 years ago

I would also argue that the major problem with outputs currently is that k6 doesn't aggregate anything

Totally agree with this, based on my observations of the PRW output's behavior.

A suggestion: it might also be possible to estimate a "metrics rate" as the number of metric samples processed by the Output in each flush period. That would allow us to reformulate the problem of an output's performance in terms of the metrics rate a given Output can process without errors, dropped samples, etc. For example, the end result would be something like "Output A can handle X samples per second with the standard setup™, and adding 1 custom metric raises the sample rate by Y". (The last part about custom metrics is likely independent of the Output in question.)
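
As a rough illustration of that measurement (hypothetical names throughout; this is not the extension's code), a counter that derives samples per second from each flush period might look like:

package main

import (
	"fmt"
	"time"
)

// rateTracker is a hypothetical helper: it counts the samples seen in a
// flush period and reports them as a per-second rate.
type rateTracker struct {
	flushPeriod time.Duration
	samples     int
}

func (r *rateTracker) AddSamples(n int) { r.samples += n }

// Flush prints the rate for the elapsed period and resets the counter.
func (r *rateTracker) Flush() {
	rate := float64(r.samples) / r.flushPeriod.Seconds()
	fmt.Printf("flush: %d samples (~%.0f samples/s)\n", r.samples, rate)
	r.samples = 0
}

func main() {
	tr := &rateTracker{flushPeriod: time.Second}
	for tick := 0; tick < 3; tick++ {
		tr.AddSamples(5000) // pretend the output received 5000 samples
		tr.Flush()
	}
}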

codebien commented 2 years ago

At the moment we don't have the time to focus on this in the short term, so I'm closing this; we may open a new, better-defined issue in the future. Feel free to reopen or create an issue if you have different opinions and/or ideas.