LilithHafner / Chairmarks.jl

Benchmarks with back support
GNU General Public License v3.0

Add a method to `Statistics.std` for Benchmarks #95

Open asinghvi17 opened 5 months ago

asinghvi17 commented 5 months ago

Pretty simple change, just what it says on the tin.

LilithHafner commented 5 months ago

Out of curiosity, what's the use case?

I'm happy to pass through the `corrected` keyword argument, but the `mean` keyword argument, as written, must be a real number, which doesn't make sense for a benchmark. I would prefer not to accept the `mean` keyword argument at all, though it's also possible to take a `Sample` as a mean; that would require special handling.
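For context, `Statistics.std`'s `corrected` keyword only switches between the sample (n−1 denominator) and population (n denominator) estimators, so it is straightforward to forward; `mean` is the keyword that expects a precomputed real-valued mean, which is what makes it awkward for a `Benchmark`. A quick illustration on a plain vector (the data here is made up):

```julia
using Statistics

xs = [1.0, 2.0, 3.0, 4.0]

# corrected=true (the default) divides by n-1; corrected=false divides by n.
s_corrected   = std(xs; corrected=true)
s_uncorrected = std(xs; corrected=false)

# `mean` takes a precomputed real-valued mean of the data, which is why it
# composes naturally with vectors of numbers but not with a Benchmark object.
s_with_mean = std(xs; mean=2.5)
```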

Also needs tests.

LilithHafner commented 2 weeks ago

The primary reason I haven't merged this as-is is that it will have strange behavior when displaying/interpreting a sample. Even more strange than computing mean or median on a benchmark.

julia> x = @be 1+1
Benchmark: 6326 samples with 10735 evaluations
min    1.133 ns
median 1.292 ns
mean   1.381 ns
max    6.979 ns

julia> Chairmarks.elementwise(std, x)
0.271 ns (without a warmup)

asinghvi17 commented 2 weeks ago

I didn't realize it looks so strange when displayed!

My primary use case was visualizing error bars of benchmarks, where the standard deviation defines the height of the error bar. BenchmarkTools already has this particular integration, and because you can feed Chairmarks benchmarks into BenchmarkGroups, it would be very easy for the plotting end of my benchmarks to be agnostic between Chairmarks and BenchmarkTools.
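For that use case, all the plotting layer needs from a benchmark is a mean and a standard deviation of the sample times. A minimal sketch on a plain vector (the times below are hypothetical values in ns, not taken from a real run):

```julia
using Statistics

times = [1.13, 1.29, 1.31, 1.40, 1.38]  # hypothetical per-sample times (ns)

m = mean(times)
s = std(times)

# Error bar endpoints: mean ± one standard deviation.
lower, upper = m - s, m + s
```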

julia> using BenchmarkTools

julia> be = @benchmark randn(1000)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.692 μs …  4.269 ms  ┊ GC (min … max):  0.00% … 99.87%
 Time  (median):     2.388 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   3.446 μs ± 44.716 μs  ┊ GC (mean ± σ):  23.29% ±  3.28%

         ▅▇█▅▁
  ▂▄▃▃▃▆█████▇▅▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂ ▃
  1.69 μs        Histogram: frequency by time        6.43 μs <

 Memory estimate: 8.00 KiB, allocs estimate: 1.

julia> using Statistics

julia> std(be)
BenchmarkTools.TrialEstimate:
  time:             44.716 μs
  gctime:           44.579 μs (99.69%)
  memory:           8.00 KiB
  allocs:           1

That being said, I should update this and add some tests for sure!

LilithHafner commented 2 weeks ago

It looks like BenchmarkTools is using `std` for `time` and `gctime`, and something like `only(unique(x))` for `allocs` and `memory`. I imagine Chairmarks should do something similar, at least for fields such as `warmup`, if we were to support `std`.
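That pattern can be sketched like this on plain vectors (the field values are illustrative, not Chairmarks internals): varying numeric fields get reduced with `std`, while fields expected to be constant across samples get reduced with `only(unique(...))`, which throws if the assumption is violated.

```julia
using Statistics

# Per-sample measurements, as if extracted from a benchmark's samples.
times  = [2.1, 2.4, 2.3]   # varies sample to sample: reduce with std
allocs = [1, 1, 1]         # expected constant: reduce with only(unique(...))

time_std  = std(times)
alloc_est = only(unique(allocs))  # throws if allocs is not constant
```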