alfaromartino opened 2 months ago
Thanks for bringing this up! All of these departures are intentional, the rest of this comment is both a response to you and a draft for inclusion in https://chairmarks.lilithhafner.com/stable/explanations for future reference. I'm open to discussion on all of these points.
When there are conflicts between compatibility/alignment with BenchmarkTools and producing the best experience I can for folks who are not coming from BenchmarkTools or using BenchmarkTools simultaneously, I put much more weight on the latter. One reason for this is that folks who want something like BenchmarkTools should use BenchmarkTools; it's a great package that is reliable, mature, and has been stable for a long time. A diversity of design choices lets users pick packages based on their own preferences. Another reason is that I aim to work toward the best long-term benchmarking solution possible (perhaps in some years there will come a time when another package makes both BenchmarkTools.jl and Chairmarks.jl obsolete). To this end, carrying forward design choices I disagree with is not beneficial. All that said, I do not want to break compatibility or change style just to stand out. Almost all of BenchmarkTools' design decisions are solid and worth copying. Things like automatic tuning, the ability to bypass that automatic tuning, a split evals/samples structure, the ability to run untimed setup code before each sample, and many more mundane details we take for granted were once clever design decisions made in BenchmarkTools or its predecessors.
Below, I'll list some specific design departures and why I made them
Chairmarks uses the abbreviated macros `@b` and `@be`. Descriptive names are almost always better than terse one-letter names. However, I maintain that macros defined in packages and designed to be typed repeatedly at the REPL are one of the few exceptions to this "almost always". At the REPL, these macros are often typed once and never read. In this case, concision does matter and readability does not. When naming these macros I anticipated this usage would be much more common than in packages or reused scripts. However, if and as this changes, it may be worth adding longer names for them and possibly restricting the shorter names to interactive use only.
`@be`, like `BenchmarkTools.@benchmark`, returns a `Benchmark` object. `@b`, unlike `BenchmarkTools.@btime`, returns a composite sample formed by computing the minimum statistic over the benchmark, rather than returning the expression result and printing runtime statistics. The reason I originally considered making this decision is that I've typed `@btime sort!(x) setup=(x=rand(1000)) evals=1` into the REPL and seen the whole screen fill with random numbers one too many times. But let's consider the etymology of `@time` to justify this decision further. `@time` is a lovely macro that can be placed around an arbitrary long-running chunk of code or expression to report its runtime to stdout. `@time` is the print statement of profiling. `@btime` and `@b` can very much not fill that role, for three major reasons: first, most long-running code has side effects, and those macros run the code repeatedly, which could break things that rely on those side effects; second, `@btime`, and to a lesser extent `@b`, take ages to run; and third, applying only to `@btime`, `@btime` runs its body in global scope, not the scope of the caller. `@btime` and `@b` are not noninvasive tools to measure the runtime of a portion of an algorithm; they are top-level macros to measure the runtime of an expression or function call. Their primary result is the runtime statistics of the expression under benchmark, and the conventional way to report the primary result of a macro or function call to the calling context is with a return value. Consequently, `@b` returns an aggregated benchmark result rather than following the pattern of `@btime`.
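Because `@b` reports its primary result as a return value, that result can be captured and inspected programmatically. A minimal sketch; the field names `time` and `allocs` on `Sample` are assumptions from Chairmarks' documented API and may differ across versions:

```julia
using Chairmarks

s = @b rand(10)  # s is a Chairmarks.Sample; nothing is printed in a script
s.time           # runtime per evaluation in seconds (assumed field name)
s.allocs         # allocations per evaluation (assumed field name)
```

This is exactly the convention followed by ordinary function calls: compute, return, and let the caller decide what to do with the result.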
If you are writing a script that computes some values and wants to display those values to the user, you generally have to call `display`. Chairmarks is not an exception. If it were possible, I would consider special-casing `@show @b blah`.
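In a script, that means making the display explicit, for example:

```julia
using Chairmarks

result = @b sum(rand(1000))
display(result)  # scripts must display explicitly; the REPL does this automatically
```

The REPL only prints `@b`'s result because it displays the value of every top-level expression, not because `@b` itself prints anything.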
No Julia objects that I know of `display` with a leading space on the first line. `Sample` (returned by `@b`) is no different. See above for why `@b` returns a `Sample` instead of displaying in the style of `@time`.
The fact that `@be` does not have leading spaces on trailing lines is an oversight.
BenchmarkTools.jl's short display mode (`@btime`) displays runtime and allocations. Chairmarks' short display mode (displaying a sample, or simply `@b` at the REPL) follows `Base.@time` instead and captures a wide variety of information, displaying only nonzero values. Here's a selection of the diversity of information Chairmarks makes available to users:
```julia
julia> @b 1+1
1.644 ns

julia> @btime 1+1;
1.083 ns (0 allocations: 0 bytes)

julia> @b rand(10)
110.621 ns (2 allocs: 144 bytes)

julia> @btime rand(10);
45.037 ns (2 allocations: 144 bytes)

julia> @b rand(10_000_000)
11.445 ms (3 allocs: 76.294 MiB, 5.43% gc time)

julia> @btime rand(10_000_000);
8.843 ms (3 allocations: 76.29 MiB)

julia> @b @eval 1+1
88.375 μs (51 allocs: 1.828 KiB)

julia> @btime @eval 1+1;
34.501 μs (47 allocations: 1.72 KiB)

julia> @b @eval begin f(x) = x+1; f(1) end
1.328 ms (863 allocs: 38.766 KiB, 56.93% compile time)

julia> @btime @eval begin f(x) = x+1; f(1) end;
1.402 ms (859 allocations: 38.66 KiB)

julia> @time @b sleep(1)
1.269034 seconds (38.04 k allocations: 1.944 MiB, 20.59% compilation time)
1.002 s (4 allocs: 112 bytes, without a warmup)

julia> @time @btime sleep(1)
1.002 s (4 allocations: 112 bytes)
16.438018 seconds (27.50 k allocations: 1.341 MiB, 24.21% gc time, 2.16% compilation time)
```
It would be a loss to restrict ourselves to only runtime and allocations; it would be distracting to include "0% compilation time" in all of these outputs; and it would be inconsistent to make some fields (e.g. allocation count and amount) always display while others are only displayed when nonzero. Sparse display is the compromise I've chosen to get the best of both worlds.
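The "display only nonzero values" rule can be pictured as follows. This is a hypothetical sketch of the idea, not Chairmarks' actual `show` method:

```julia
# Hypothetical sketch of sparse display: a field is printed only when nonzero.
function show_sample(io::IO, time_ns, allocs, bytes, gc_pct, compile_pct)
    parts = String[]
    allocs > 0 && push!(parts, "$allocs allocs: $bytes bytes")
    gc_pct > 0 && push!(parts, "$gc_pct% gc time")
    compile_pct > 0 && push!(parts, "$compile_pct% compile time")
    print(io, "$time_ns ns", isempty(parts) ? "" : " (" * join(parts, ", ") * ")")
end

show_sample(stdout, 1.644, 0, 0, 0.0, 0.0)       # time only
show_sample(stdout, 110.6, 2, 144, 0.0, 0.0)     # time plus allocations
```

Every tracked quantity is captured either way; only the rendering is sparse.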
Thanks for your quick response! I completely understand there are trade-offs; that's why I labelled them as suggestions.
I'd still insist on the importance of adding "0 allocations". A line like `11.445 ms` could easily be confused with some option that only reports the time, without indicating any other information.
Many thanks for such a great package! It's saving me tons of time in my workflow.
I prefer following `@time` rather than `@btime`, in part because I expect folks to figure out pretty quickly that `@b` tracks allocations (and other things). There is no functionality I know of anywhere in the Julia ecosystem that prints time in that format (with trailing abbreviated units) and does not track allocations.

Nevertheless, I'm going to solicit other opinions on this matter of opinion :)
FWIW, I initially agreed with points 1, 2, and 3 from @alfaromartino. Reconsidering after the great response from @LilithHafner, I totally dropped 1 and 2, but I have to say that `@b` and `@be` really are too short, and I prefer the verbosity of BenchmarkTools for the clarity it generates (also in my head). The problematic aspect of simultaneous use is unfortunate, though. But I think it should not happen that often that both packages are used at the same time, once Chairmarks.jl matures.
What sort of longer names do folks want?

A. `@btime`/`@benchmark` does not make sense because `@b`, unlike `BenchmarkTools.@btime`, neither prints runtime data nor returns the evaluation result.
B. `@benchmark_summary`/`@benchmark_full` works, though it is very verbose, and precludes adding an even more informative version that saves more data.
C. `@bench`/`@benchmark` is fun and silly and concise while still being clear that we're benchmarking, but still relies on the pun: shorter name = shorter results.
D. `@youaskedforalongnameyougetalongnamehahahaha`/`@youaskedforalongnameyougetalongnamehahahaha2` is an option :P
E. `@benchmark_results`/`@min_bench`, an AI-generated idea.
F. `@full_bench` and `@quick_bench`, another AI idea.

I prefer B or C.
I would implement this by adding long names as an option and then maybe deprecating the use of short names in scripts. I do not plan to deprecate the use of short names at the REPL.
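One way such a restriction could be approximated is by checking for an interactive session when the short macro expands. A hypothetical sketch, not an existing Chairmarks feature, and `@bench` here stands in for whichever long name is chosen:

```julia
# Hypothetical: warn when the short name is expanded outside an interactive session.
# Macro expansion happens at load time, so a non-interactive script triggers the warning.
macro b(args...)
    if !isinteractive()
        @warn "@b is intended for interactive use; prefer the long name in scripts"
    end
    esc(:(@bench $(args...)))  # forward to the (hypothetical) long-named macro
end
```

A deprecation path like this would leave REPL workflows untouched while nudging shared code toward the descriptive name.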
I prefer the human-generated C., and I also like the pun ;)
I didn't know that specific names/macros could be exclusive to the REPL. Keeping `@b` for the REPL would be great, as the proposed names would be used only for sharing code.

In that case, I think F. would be a good option (or `@full_bench` and `@brief_bench`).

Some other suggestions, with the help of AI, would be `@full_bench` along with `@speedmark`, `@timetest`, `@marktime`, or `@timetrial`.

In any case, I think C or F are fine.
Just a couple of suggestions for this superb, blazingly fast package! In summary, I propose following BenchmarkTools more closely regarding layout. Thanks for your work!

This would match `@btime` and also be consistent with `@b`'s own output when there is at least one allocation. Example:

```julia
34.694 ns (1 allocation: 896 bytes) # this is @btime
33.846 ns (1 allocs: 896 bytes)     # this is @b
```

What about a more explicit name for `@b`? I can imagine that reading code with `@b` without any context calls for confusion.