rerun-io / rerun

Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.
https://rerun.io/
Apache License 2.0

Report logging benchmarks for C++/Python/Rust to CI #4100

Open Wumpf opened 1 year ago

Wumpf commented 1 year ago

Keep things super simple and do end-to-end profiling: profile running a benchmark binary that internally has a bunch of test cases. This way we can integrate results from all benchmarks in the same way into our CI-generated benchmark stats.

We execute it with different parameters to cover the different test cases. A basic set of test cases to start with:

In all cases (unless configured otherwise), log to a memory recording. (Profiling other parts of the flow should be part of a separate Rust benchmark.)

Since we simply want to time the process from spawn to end, we must make sure that data generation is super fast. Maybe print out additional timings in each language where appropriate - this is harder to integrate into CI graphs, but nice for debugging.
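
As an illustration of the two points above, a minimal Python sketch of what the inside of one benchmark case could look like; the entity path, the point count, and the use of rr.memory_recording() as the in-memory sink are assumptions for the sketch, not a fixed design:

import numpy as np
import rerun as rr

rr.init("log_benchmark")
rec = rr.memory_recording()  # keep the recording in memory; no file or viewer involved

# Pre-generate the data up front so that data generation stays cheap
# compared to the logging work we actually want to measure.
positions = np.random.rand(1_000_000, 3).astype(np.float32)

rr.log("points", rr.Points3D(positions))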

Ideally we log the same data in all SDKs, though there may be variations in the logging flow that don't map to each of them.
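
On the CI side, the end-to-end timing can then stay as dumb as spawning the benchmark binary once per test case and measuring wall-clock time. A rough sketch, where the binary name, flags, and test-case names are all hypothetical placeholders:

import subprocess
import time

# Hypothetical test cases; each one is passed to the benchmark binary as parameters.
TEST_CASES = [
    ["--test-case", "points3d_large_batch"],
    ["--test-case", "scalars_1m"],
]

for args in TEST_CASES:
    start = time.perf_counter()
    subprocess.run(["./log_benchmark", *args], check=True)  # hypothetical binary name
    elapsed = time.perf_counter() - start
    # One line per case, so the CI job can scrape the numbers into the benchmark stats.
    print(f"log_benchmark {' '.join(args)}: {elapsed:.3f}s")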

emilk commented 1 year ago

Very similar to:

Wumpf commented 10 months ago

We now have benchmarks for Python/Rust/C++, but we still don't upload the results.

emilk commented 9 months ago

For reference, here are our performance targets:

emilk commented 9 months ago

We also want to explicitly benchmark logging scalars, including setting a timeline value for each logged scalar, i.e. something like

import math
import rerun as rr

rr.init("scalar_benchmark")  # or log to a memory recording, as discussed above

for frame_nr in range(1_000_000):
    rr.set_time_sequence("frame", frame_nr)
    rr.log("scalar", rr.TimeSeriesScalar(math.sin(frame_nr / 1000.0)))

emilk commented 9 months ago

just rs-plot-dashboard --num-plots 10 --num-series-per-plot 5 --num-points-per-series 5000 --freq 1000

emilk commented 9 months ago

We also want to check the memory use in the viewer when we have logged 100M scalars or so, to measure the RAM overhead.
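
A rough way to produce that load, mirroring the scalar snippet above; the application id and entity path are illustrative, and logging 100M scalars this way will take a while:

import math
import rerun as rr

rr.init("scalar_memory_overhead", spawn=True)  # spawn a local viewer to receive the data

for frame_nr in range(100_000_000):
    rr.set_time_sequence("frame", frame_nr)
    rr.log("scalar", rr.TimeSeriesScalar(math.sin(frame_nr / 1000.0)))

# The RAM overhead is then read off the running viewer process (e.g. via its memory
# panel or the OS process monitor) once all scalars have been ingested.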

emilk commented 9 months ago

This can be closed when we have an easy way to run the benchmarks for all languages, and those results are published (perhaps manually) somewhere public.