The benchmarks tracking issue

orottier commented 7 months ago

We have currently benchmarked our implementation against two well-known suites:

Results of the padenot suite

Some quick and dirty results from the Spotify suite:

These benchmarks were run at the beginning of March 2023, I guess on revision 3b26ae63e7.

[x] Compare the performance of the lib in that period to the current main to see where we have progressed and regressed (spoiler alert: probably a regression in panning now that we support stereo properly).
[x] Make a new chart comparing to the browser implementations
[ ] Look into more Spotify tests

orottier commented 7 months ago

A very nice overall performance gain compared to our last publication, no relevant regressions!

orottier commented 7 months ago

This is confirmed with our criterion test runner:


     Running benches/my_benchmark.rs (target/release/deps/my_benchmark-7e7f1db1fa6b596a)
bench_ctor              time:   [483.37 µs 484.07 µs 484.96 µs]
                        change: [-31.453% -31.138% -30.832%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

bench_audio_buffer_decode
                        time:   [7.9998 µs 8.0067 µs 8.0151 µs]
                        change: [+6.3822% +6.6569% +6.9091%] (p = 0.00 < 0.05)
                        Performance has regressed.

bench_sine              time:   [3.6357 ms 3.6378 ms 3.6401 ms]
                        change: [-14.156% -14.080% -14.009%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) high mild
  4 (4.00%) high severe

bench_sine_gain         time:   [4.0017 ms 4.0067 ms 4.0115 ms]
                        change: [-21.188% -21.061% -20.949%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  9 (9.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild

bench_sine_gain_delay   time:   [6.1067 ms 6.1107 ms 6.1152 ms]
                        change: [-30.086% -30.030% -29.973%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Benchmarking bench_buffer_src: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.4s, enable flat sampling, or reduce sample count to 60.
bench_buffer_src        time:   [1.0812 ms 1.0816 ms 1.0821 ms]
                        change: [-30.176% -29.510% -28.803%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe

bench_buffer_src_delay  time:   [4.2621 ms 4.2638 ms 4.2657 ms]
                        change: [-16.384% -16.294% -16.216%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

bench_buffer_src_iir    time:   [7.8227 ms 7.8262 ms 7.8303 ms]
                        change: [-6.1397% -6.0782% -6.0064%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

bench_buffer_src_biquad time:   [5.1760 ms 5.1789 ms 5.1820 ms]
                        change: [-13.607% -13.533% -13.460%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

bench_stereo_positional time:   [6.4384 ms 6.4412 ms 6.4441 ms]
                        change: [-13.780% -13.721% -13.664%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

Benchmarking bench_stereo_panning_automation: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.2s, enable flat sampling, or reduce sample count to 50.
bench_stereo_panning_automation
                        time:   [1.4068 ms 1.4090 ms 1.4118 ms]
                        change: [-30.551% -30.381% -30.170%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

bench_analyser_node     time:   [2.0953 ms 2.0969 ms 2.0987 ms]
                        change: [-17.929% -17.818% -17.705%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

bench_hrtf_panners      time:   [27.405 ms 27.414 ms 27.424 ms]
                        change: [-1.7525% -1.5258% -1.3059%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe

b-ma commented 7 months ago

Wow, cool! Can you confirm this is all on the same machine?

b-ma commented 7 months ago

I'm a bit confused about how the "faster?" column is actually computed.

E.g. for granular, the numbers look twice as fast (so 200% in my head), but the column reports 48%... I can imagine there is some underlying factor of 2, but which one?

orottier commented 7 months ago

> I'm a bit confused about how the "faster?" column is actually computed.

Huh very good point. My brain bricked for a second.

In my table, 'faster' is good old (old - new) / old, which is actually 'decrease in time spent'. Your metric, speedup = time_old / time_new, is probably more intuitive (and more impressive).
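To make the difference between the two metrics concrete, here is a minimal sketch. The function names and the timings (100 and 52 time units) are hypothetical, picked only so that the decrease comes out at the 48% mentioned above; they are not taken from the actual table.

```rust
/// Relative decrease in time spent: (old - new) / old.
fn time_decrease(old: f64, new: f64) -> f64 {
    (old - new) / old
}

/// Speedup factor: time_old / time_new.
fn speedup(old: f64, new: f64) -> f64 {
    old / new
}

/// Converts a relative decrease d into the equivalent speedup, 1 / (1 - d).
fn speedup_from_decrease(decrease: f64) -> f64 {
    1.0 / (1.0 - decrease)
}

fn main() {
    // Hypothetical timings, chosen to give a 48% decrease in time spent.
    let (old, new) = (100.0, 52.0);

    println!("decrease: {:.0}%", time_decrease(old, new) * 100.0);
    println!("speedup:  {:.2}x", speedup(old, new));
    println!("from 48%: {:.2}x", speedup_from_decrease(0.48));
}
```

So a 48% decrease in time spent corresponds to a speedup of roughly 1.92x, i.e. close to "twice as fast".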

In any case, it's not the most important thing. I only wanted to check whether we regressed anywhere. I will now focus on the comparison to the web browsers to answer the question "What are the remaining performance bottlenecks in your implementation and how do you plan to overcome them?". Perhaps the synthetic benchmark suite (padenot) will not answer that (it only shows that convolution and granular synthesis are slow) and the real gain is still left in complex graph processing. Let's see!

orottier commented 7 months ago

> Wow, cool! Can you confirm this is all on the same machine?

Yes. The only difference is a more recent Rust compiler, which of course could also account for a fair share of the gain.

orottier commented 7 months ago

The following is a not-so-scientific (only measured once) comparison of our implementation with the browsers:

(yellow is slowest, green is fastest)

It shows we have work to do regarding AudioParams (positional/automation), large graphs (mixing many nodes) and convolution.

orottier commented 7 months ago

@b-ma How did you run the Spotify suite again?

b-ma commented 7 months ago

I had to modify the source code a bit to make it work; my fork is here: https://github.com/b-ma/web-audio-bench (as the original repo is archived, I think this is really OK)

I just cleaned it up a bit and introduced some command-line options. Let me know if anything in the README is unclear.

orottier commented 6 months ago

With the audio param optimizations in #470, things are looking nice. We're now the slowest implementation only for 'mixing' and 'convolution'.

b-ma commented 6 months ago

Cool!

orottier / web-audio-api-rs

The benchmarks tracking issue #458