LilithHafner / Chairmarks.jl

Benchmarks with back support

Detect cases where first eval is slower than subsequent evals #102

Open · LilithHafner opened this issue 3 months ago

LilithHafner commented 3 months ago

If I have something like @b rand(1000) sort!, the first eval is much slower than subsequent evals within a given sample: sort! mutates its input, so every eval after the first re-sorts an already-sorted vector. This violates the assumption that evals within a sample are interchangeable and produces misleading results. For example, @b rand(1000) sort! reports an unrealistically fast runtime, while @b rand(100_000) sort! (which runs only one eval per sample) is realistic.

See: https://github.com/withbayes/Tapir.jl/issues/140

julia> @be rand(100_000) sort!
Benchmark: 100 samples with 1 evaluation
min    761.379 μs (6 allocs: 789.438 KiB)
median 871.046 μs (6 allocs: 789.438 KiB)
mean   890.113 μs (6 allocs: 789.438 KiB, 2.74% gc time)
max    1.223 ms (6 allocs: 789.438 KiB, 14.46% gc time)

julia> @be rand(1000) sort!
Benchmark: 2943 samples with 7 evaluations
min    2.345 μs (0.86 allocs: 1.429 KiB)
median 3.208 μs (0.86 allocs: 1.429 KiB)
mean   4.221 μs (0.86 allocs: 1.434 KiB, 0.25% gc time)
max    701.837 μs (0.86 allocs: 1.714 KiB, 98.49% gc time)
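For illustration, here is a minimal sketch (plain Base Julia, not Chairmarks internals) of why evals after the first are cheap in this example: sort! leaves its argument sorted, so re-sorting the same vector is much faster than the first sort.

x = rand(1000)
t_first  = @elapsed sort!(x)   # first eval: sorts random data
t_second = @elapsed sort!(x)   # second eval: the vector is already sorted
@show t_first t_second         # t_second is typically far smaller than t_first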

gdalle commented 3 months ago

I guess most of these cases can be detected by systematically running a second evaluation after the first one? Of course, it's debatable whether the benefit outweighs the cost.
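A hypothetical sketch of that check (first_eval_slower, setup, f, and ratio are illustrative names, not Chairmarks API): time the first evaluation and a second one on the same input, and flag the workload when the first is much slower, since that suggests evals within a sample are not interchangeable.

function first_eval_slower(setup, f; ratio = 2.0)
    x  = setup()
    t1 = @elapsed f(x)      # first eval, on a freshly set-up input
    t2 = @elapsed f(x)      # second eval, on the same (possibly mutated) input
    return t1 > ratio * t2  # true when the first eval is suspiciously slow
end

first_eval_slower(() -> rand(1000), sort!)  # likely returns true for this example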