rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
https://rerun.io/
Apache License 2.0

Document e2e logging performance for time series data #4889

Open nikolausWest opened 5 months ago

nikolausWest commented 5 months ago

We want to benchmark logging scalars, including setting a timeline value for each logged scalar, with something like:

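from math import sin
import rerun as rr

# Log one scalar per frame, advancing the "frame" timeline each time.
for frame_nr in range(0, 1_000_000):
    rr.set_time_sequence("frame", frame_nr)
    rr.log("scalar", rr.TimeSeriesScalar(sin(frame_nr / 1000.0)))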

We have the tool for it:

just rs-plot-dashboard --num-plots 10 --num-series-per-plot 5 --num-points-per-series 5000 --freq 1000

For each language (C++, Python, Rust), measure the maximum throughput (scalars per second), end-to-end (logging -> visualization), for both single-threaded/single-plot and multi-threaded logging (so 3 x 2 throughput figures).
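As a rough illustration of the logging side of such a measurement, here is a minimal Python sketch. It is not the `rs-plot-dashboard` tool and does not measure the viewer. The names `measure_throughput` and `log_fn` are hypothetical, and `log_fn` stands in for whatever per-scalar logging call is being benchmarked.

```python
import threading
import time
from typing import Callable


def measure_throughput(
    log_fn: Callable[[int], None],
    num_scalars: int,
    num_threads: int = 1,
) -> float:
    """Return the number of log_fn calls completed per second of wall-clock time.

    log_fn is a hypothetical stand-in for the per-scalar logging call.
    The work is split evenly across num_threads threads.
    """
    per_thread = num_scalars // num_threads

    def worker(thread_idx: int) -> None:
        # Each thread logs its own contiguous range of frame numbers.
        start = thread_idx * per_thread
        for frame_nr in range(start, start + per_thread):
            log_fn(frame_nr)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_threads)
    ]

    t0 = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - t0

    return (per_thread * num_threads) / elapsed
```

For the snippet in this issue, `log_fn` would wrap the `rr.set_time_sequence` and `rr.log` calls. Note that this only times the producer; an end-to-end figure also needs to account for when the data becomes visible in the viewer, and in CPython the GIL may limit how much multi-threaded logging helps.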

We also want to check the memory use in the viewer when we have logged 100M scalars or so, to measure the RAM overhead.
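The RAM overhead can be expressed as bytes per scalar and as a factor over the raw payload. The sketch below assumes a raw payload of 16 bytes per scalar (an 8-byte float value plus an 8-byte timeline value); that figure is an assumption for illustration, not something stated in this issue, and `ram_overhead` is a hypothetical helper.

```python
def ram_overhead(
    viewer_rss_bytes: float,
    num_scalars: int,
    payload_bytes_per_scalar: int = 16,  # assumption: f64 value + i64 time
) -> tuple[float, float]:
    """Return (bytes used per scalar, overhead factor relative to the raw payload)."""
    bytes_per_scalar = viewer_rss_bytes / num_scalars
    overhead_factor = bytes_per_scalar / payload_bytes_per_scalar
    return bytes_per_scalar, overhead_factor


# Placeholder input, NOT a measurement: if the viewer used 4 GB for 100M scalars.
bytes_per_scalar, factor = ram_overhead(4e9, 100_000_000)
print(f"{bytes_per_scalar:.0f} bytes/scalar, {factor:.1f}x the raw payload")
```

The viewer's resident memory would have to be read from the OS or the viewer's own memory panel after the 100M scalars have been ingested.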


Then manually document this somewhere in our docs, e.g.:

On a 2023 MacBook M1:

| Language | Single-threaded | Multi-threaded |
| -------- | --------------- | -------------- |
| C++      | ? kHz           | ? kHz          |
| Python   | ? kHz           | ? kHz          |
| Rust     | ? kHz           | ? kHz          |

Viewing 100M scalars uses ? GB of RAM in the native viewer.

Very rough numbers are fine, e.g. "~10 M scalars / second".

emilk commented 5 months ago

We should link to https://github.com/rerun-io/rerun/issues/4423 too

emilk commented 5 months ago

I know there was some decision to punt on this (and it was moved to Triage), so I'm moving this down in urgency.

It would be nice to have a short comment explaining why we are punting on this, though.