johnmyleswhite / Benchmarks.jl

A new benchmarking library for Julia

Use a no-inline inner function #17

Closed: mbauman closed this 9 years ago

mbauman commented 9 years ago

Positive effects:

Negative effects:

Demo. The minimum measurable time is now the cost of one function call:

julia> @benchmark 1
================ Benchmark Results ========================
   Average elapsed time: 3.25 ns
     95% CI for average: [3.18 ns, 3.32 ns]
   Minimum elapsed time: 8.64 ns
                GC time: 0.00%
       Memory allocated: 0 bytes
  Number of allocations: 0 allocations
      Number of samples: 3401
        R² of OLS model: 0.959
Time used for benchmark: 0.10s
            Precompiled: true
       Multiple samples: true
       Search performed: true
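
For readers unfamiliar with the idea in the PR title, here is a minimal sketch of timing an expression through a no-inline inner function. The names `inner_sketch` and `time_per_eval` are hypothetical and are not taken from Benchmarks.jl; the real macro also does sample-size search and OLS fitting, which this sketch omits.

```julia
# Illustrative sketch only: `inner_sketch` and `time_per_eval` are
# hypothetical names, not the package's internals.

# The expression under test is wrapped in its own function. `@noinline` asks
# the compiler not to inline it into the timing loop; the intent is that each
# iteration pays for one real function call.
@noinline inner_sketch(A) = checkbounds(A, 1)

# Time `n` calls of `f(x)` and return the mean cost per call in nanoseconds.
function time_per_eval(f, x, n::Integer)
    f(x)                      # warm-up call so compilation is not timed
    start = time_ns()
    for _ in 1:n
        f(x)
    end
    return (time_ns() - start) / n
end

A = zeros(10, 10)
ns = time_per_eval(inner_sketch, A, 10^6)
println("approx. ", round(ns, digits = 2), " ns per evaluation")
```

Because the argument is passed into `time_per_eval`, the loop sees it as a typed local rather than a non-constant global, which is probably why a wrapper like this gives similar timings for `A` and `const B`.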

And the pathological cases over at #16 are fixed:

julia> A = zeros(10,10);
julia> const B = zeros(10,10);

julia> @benchmark checkbounds(A, 1)
================ Benchmark Results ========================
   Average elapsed time: 3.73 ns
     95% CI for average: [3.65 ns, 3.81 ns]
   Minimum elapsed time: 7.13 ns
                GC time: 0.00%
       Memory allocated: 0 bytes
  Number of allocations: 0 allocations
      Number of samples: 3901
        R² of OLS model: 0.954
Time used for benchmark: 0.03s
            Precompiled: true
       Multiple samples: true
       Search performed: true

julia> @benchmark checkbounds(B, 1)
================ Benchmark Results ========================
   Average elapsed time: 3.87 ns
     95% CI for average: [3.79 ns, 3.94 ns]
   Minimum elapsed time: 6.27 ns
                GC time: 0.00%
       Memory allocated: 0 bytes
  Number of allocations: 0 allocations
      Number of samples: 4201
        R² of OLS model: 0.956
Time used for benchmark: 0.04s
            Precompiled: true
       Multiple samples: true
       Search performed: true

julia> @noinline f(A) = checkbounds(A, 1)
f (generic function with 1 method)

julia> @benchmark f(A)
================ Benchmark Results ========================
   Average elapsed time: 6.39 ns
     95% CI for average: [6.29 ns, 6.49 ns]
   Minimum elapsed time: 6.10 ns
                GC time: 0.00%
       Memory allocated: 0 bytes
  Number of allocations: 0 allocations
      Number of samples: 7701
        R² of OLS model: 0.952
Time used for benchmark: 0.09s
            Precompiled: true
       Multiple samples: true
       Search performed: true

julia> @benchmark f(B)
================ Benchmark Results ========================
   Average elapsed time: 6.44 ns
     95% CI for average: [6.33 ns, 6.55 ns]
   Minimum elapsed time: 6.67 ns
                GC time: 0.00%
       Memory allocated: 0 bytes
  Number of allocations: 0 allocations
      Number of samples: 6501
        R² of OLS model: 0.950
Time used for benchmark: 0.09s
            Precompiled: true
       Multiple samples: true
       Search performed: true

johnmyleswhite commented 9 years ago

Finally have some time to finish this package. I think this approach is reasonable and I'm willing to accept the cost of a function call to minimize weird effects elsewhere. Would be great to get this rebased whenever you have some time.

The one thing I'm worried about is that this seems to remove the possibility of benchmarking expressions that depend on variables that have to be set up in the setup expression, because those variables are now treated as globals by the inner function.

Or is that concern not relevant for a reason that's not obvious to me?
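
One way this concern could show up is sketched below, assuming the inner function is defined at module top level while the setup code runs in a local scope. The names `run_with_local_setup`, `inner_scope_sketch`, and `setup_only_data` are hypothetical and exist only for this illustration.

```julia
# Illustrative sketch only; none of these names come from Benchmarks.jl.

function run_with_local_setup()
    # "Setup": a local variable, visible only inside this function body.
    setup_only_data = zeros(10, 10)

    # Define the inner function at module top level, as a macro emitting a
    # separate no-inline function might do. Inside its body,
    # `setup_only_data` is resolved as a global, not as the local above.
    inner = Core.eval(@__MODULE__,
        :(@noinline inner_scope_sketch() = size(setup_only_data)))

    try
        # `invokelatest` is needed because the method was just defined.
        return Base.invokelatest(inner)
    catch err
        return err
    end
end

result = run_with_local_setup()
if result isa UndefVarError
    println("setup variable not visible to inner function: ", result.var)
else
    println("inner function returned: ", result)
end
```

In a session with no global named `setup_only_data`, the call is expected to fail with an `UndefVarError`, which matches the worry that setup variables are looked up as globals.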

mbauman commented 9 years ago

Ah, no, you're right. I didn't think about the setup expr. In fact, I think benchmarks that require variables from a setup expr will now fail (since they aren't global). I'm not sure how best to deal with that.

One possible alternative would be to only permit benchmarking single functions with the simple @benchmark macro, and then evaluate all arguments to that function as "setup". This is similar to what @Yuyichao proposed in https://github.com/johnmyleswhite/Benchmarks.jl/issues/16#issuecomment-136337077. The constant binding/literal vs mutable binding heuristic I use here is a little subtle.
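
A rough sketch of that alternative follows, under the assumption that only positional arguments are supported. The names `@bench_call_sketch`, `call_noinline`, and `time_call` are hypothetical and are not part of Benchmarks.jl.

```julia
# Illustrative sketch only; none of these names come from Benchmarks.jl.

# Function barrier: the callee and its arguments arrive as locals.
@noinline call_noinline(f, args...) = f(args...)

# Time `n` calls and return the mean cost per call in nanoseconds.
function time_call(f, args::Tuple, n::Integer)
    call_noinline(f, args...)             # warm-up, triggers compilation
    t0 = time_ns()
    for _ in 1:n
        call_noinline(f, args...)
    end
    return (time_ns() - t0) / n
end

macro bench_call_sketch(ex)
    # Accept only a plain call such as `f(a, b)`.
    (ex isa Expr && ex.head === :call) ||
        error("expected a single function call, e.g. f(x, y)")
    # Keyword arguments are out of scope for this sketch.
    any(a -> a isa Expr && a.head in (:kw, :parameters), ex.args) &&
        error("keyword arguments are not supported in this sketch")
    f = esc(ex.args[1])
    args = map(esc, ex.args[2:end])
    # "Setup": the callee and each argument are evaluated exactly once, in
    # the caller's scope, before any timing starts.
    return :(time_call($f, ($(args...),), 10^5))
end

A = zeros(10, 10)
ns = @bench_call_sketch checkbounds(A, 1)
println("approx. ", round(ns, digits = 2), " ns per call")
```

Since the arguments are evaluated where the macro is called, local setup variables remain usable, at the price of restricting what can be benchmarked to a single call.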

johnmyleswhite commented 9 years ago

I feel like getting this totally right is slightly above my current level of understanding of the compiler. I might try to start an e-mail thread with some core compiler folks to figure this out.