risingwavelabs / risingwave

SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.

A tool for pinpointing regression commits #9285

Open kwannoel opened 1 year ago

kwannoel commented 1 year ago

Running an e2e benchmark on every commit is expensive. Instead, once #9216 is complete, we will have an automated e2e standalone benchmark in our CI.

We can use that to build a tool that runs git bisect and, at each step, a lightweight benchmark (just 100M records).

That lets us pinpoint the commit where a regression was introduced.

kwannoel commented 1 year ago

More specifically, the tool runs git bisect locally between START and END commits. At each bisect step, it triggers a nexmark benchmark build for that commit on Buildkite via the Buildkite API and reads the throughput from the result.
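To make the bisect step concrete, here is a rough sketch of the search loop only. It assumes a linear, oldest-to-newest list of candidate commits (for example from `git rev-list --reverse START..END`), where the last one is the known-bad END. `find_first_bad` and the `is_bad` callback are hypothetical names; in the real tool `is_bad` would trigger the Buildkite build and compare throughput, which is stubbed out with fake numbers here.

```python
from typing import Callable, Sequence


def find_first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, that the last commit is
    known to be bad, and that every commit after the first bad one is bad too.
    """
    if not commits:
        raise ValueError("need at least one candidate commit")
    lo, hi = 0, len(commits) - 1  # commits[hi] is the known-bad END
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # regression is at mid or earlier
        else:
            lo = mid + 1      # regression is after mid
    return commits[lo]


# Stand-in data: fake throughput per commit (hypothetical numbers).
fake_throughput = {
    "c1": 100, "c2": 101, "c3": 99, "c4": 100,
    "c5": 100, "c6": 70, "c7": 71, "c8": 69,
}
benchmarked = []  # records which commits were actually "benchmarked"


def fake_is_bad(commit: str) -> bool:
    # Placeholder for "trigger benchmark build, fetch throughput, compare".
    benchmarked.append(commit)
    return fake_throughput[commit] < 90


culprit = find_first_bad(list(fake_throughput), fake_is_bad)
print(culprit, benchmarked)
```

With 8 candidates this only benchmarks 3 of them, which is the whole point of bisecting instead of running per commit.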

It should have an upper bound of 8 benchmarked commits, since each benchmark takes a while to complete (~35 min). That means it can cover 2^7, or 128, commits in a day.
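A quick sanity check of those numbers, as a sketch: the worst-case number of bisect steps for n candidate commits is ceil(log2(n)), and the total time is just runs times minutes per run. The helper name `bisect_steps` and the constants are only for illustration.

```python
def bisect_steps(num_commits: int) -> int:
    """Worst-case benchmark runs needed to isolate one bad commit
    among num_commits candidates (the last candidate is known bad)."""
    if num_commits < 1:
        raise ValueError("num_commits must be >= 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float error.
    return (num_commits - 1).bit_length()


MAX_RUNS = 8          # upper bound on benchmarked commits, from the comment above
MINUTES_PER_RUN = 35  # approximate duration of one benchmark

steps_for_128 = bisect_steps(128)
total_minutes = MAX_RUNS * MINUTES_PER_RUN

print(steps_for_128, total_minutes)
```

So 128 commits need 7 bisect steps, which fits inside the 8-run bound, and 8 runs come to about 280 minutes of benchmark time.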

It can be implemented in risedev.

kwannoel commented 1 year ago

> it will trigger a nexmark benchmark build

~Should use nexmark-bench or add a new workflow in the test pipeline instead of the one we have in our CI, since it is not clear whether that one will support performance benchmarks, perhaps just flamegraph generation. See https://github.com/risingwavelabs/risingwave/pull/9216#discussion_r1176293065.~

Edit: Still possible. Since we only need to check whether throughput regressed, we can still use this CI pipeline and set the performance baseline from the "good" commit.
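For illustration, the good/bad decision at each bisect step could look roughly like this, assuming the baseline is the throughput measured on the "good" commit. The function name `is_regressed` and the 5% tolerance are made up for the sketch; the actual threshold would need tuning against benchmark noise.

```python
def is_regressed(throughput: float, baseline: float, tolerance: float = 0.05) -> bool:
    """Treat a commit as "bad" if its throughput falls more than
    `tolerance` (as a fraction) below the baseline from the good commit."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return throughput < baseline * (1.0 - tolerance)


# Hypothetical numbers: baseline of 1000 records/s on the "good" commit.
print(is_regressed(900.0, 1000.0))  # 10% below baseline
print(is_regressed(980.0, 1000.0))  # 2% below baseline, within tolerance
```

Only the relative drop matters here, so absolute numbers from the CI machines do not need to be comparable across pipelines.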