Benchmarking in local scopes includes dynamic dispatch to inner function

mbauman commented 9 years ago

Defining an inner @noinline function in the @benchmarkable macro solves quite a few strange issues with inlining and optimizations. However, it also means the @noinline inner function is defined at the @benchmark call site. When that site is in a local scope, the const is ignored and every call to the inner function goes through dynamic dispatch, which may include allocations to box arguments.

I'm not sure if this is a solvable problem. We also can't define new types at local scope, so any fast-anonymous-like tricks won't work, either. And if we eval the inner function at global scope, it's possible that the passed function won't be defined. We may have to choose between @noinline inner functions and benchmarking in for loops.
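As a rough illustration of the cost being described, here is a minimal sketch that reproduces a similar effect at global scope, where a non-constant binding forces a run-time lookup of the callee. It is an analogy only, not the code the macro generates, and every name in it (inner, const_ref, nonconst_ref, call_const, call_nonconst) is hypothetical.

```julia
# Hypothetical names throughout; this is not code from Benchmarks.jl.

@noinline inner(x) = x + 1      # stands in for the benchmarked kernel

const const_ref = inner         # constant binding: the callee is known at compile time
nonconst_ref = inner            # non-constant binding: the callee is looked up at run time

function call_const(n)
    s = 0
    for i in 1:n
        s += const_ref(i)       # statically resolved call
    end
    return s
end

function call_nonconst(n)
    s = 0
    for i in 1:n
        s += nonconst_ref(i)    # dynamic dispatch; the result type is not inferred
    end
    return s
end

call_const(1); call_nonconst(1)           # compile both before measuring
@show @allocated call_const(10_000)
@show @allocated call_nonconst(10_000)
```

Both functions compute the same sum; comparing the two reported allocation counts shows how much overhead the non-constant binding adds on a given Julia version.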

johnmyleswhite commented 9 years ago

What's the main use case of for loops? Benchmarking a function against many inputs?

I also wonder if @JeffBezanson or @vtjnash have any insight into how we should set up the basic benchmarking machinery that records the time it takes to evaluate a single function call. Since they probably have no context, the current machinery that @mbauman is referring to is at https://github.com/johnmyleswhite/Benchmarks.jl/blob/ddd64a35e26b8027c1e746316102c2f901114cae/src/benchmarkable.jl

mbauman commented 9 years ago

The use case is https://github.com/staticfloat/Perftests.jl, for example https://github.com/staticfloat/Perftests.jl/blob/a8d48b6cf4efd86fb81ce0d956fb90d491cf7621/benchmarks/array/perf.jl#L29-L32. The @perf macro wraps @benchmark.

There may be ways to support this by doing setup before the loop.
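One possible shape for "setup before the loop", sketched under the assumption that the kernel can be defined once at top level and that only its input varies per iteration. The names kernel, time_kernel, and results are hypothetical and do not come from Benchmarks.jl or Perftests.jl, and @elapsed stands in for whatever timing machinery the package would use.

```julia
# Hypothetical sketch; none of these names come from Benchmarks.jl or Perftests.jl.

# Setup happens once, at top level, so the kernel is an ordinary method.
@noinline kernel(A) = sum(A)

# Timing helper: the input is passed as an argument, so the call to `kernel`
# is compiled for the concrete type of `A`.
function time_kernel(A, samples)
    kernel(A)                       # warm-up call to exclude compilation time
    best = Inf
    for _ in 1:samples
        t = @elapsed kernel(A)
        best = min(best, t)
    end
    return best
end

# The loop only varies the input; it defines no new functions.
results = Dict{Int,Float64}()
for n in (10, 100, 1000)
    results[n] = time_kernel(rand(n), 100)
end
```

The trade-off is that the benchmarked expression has to be expressible as a function of the loop variables, which may not fit deeply nested loops without some restructuring.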

staticfloat commented 9 years ago

I have much more deeply nested calls than the one shown above; check out the sort benchmarks, they're crazy nested.

johnmyleswhite/Benchmarks.jl, issue #27