johnmyleswhite / Benchmarks.jl

A new benchmarking library for Julia

Benchmarking no-ops #5

Closed sglyon closed 9 years ago

sglyon commented 9 years ago

Accidentally posted on johnmyleswhite/Benchmark.jl#28

Original post

An example:

julia> _f1(x,y) = x > 0.0 && y > 0.0
_f1 (generic function with 1 method)

julia> _f2(x,y) = x*y > 0.0
_f2 (generic function with 1 method)

julia> @benchmark _f1(1.0, 4.134124343)
ERROR: InexactError()
 in trunc at ./float.jl:362
 in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:130
 in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:42

julia> @benchmark _f2(1.0, 4.134124343)
ERROR: InexactError()
 in trunc at ./float.jl:362
 in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:130
 in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:42

sglyon commented 9 years ago

Did some digging. It looks like the problem has nothing to do with storing the UInt64 from time_ns; it actually happens in the loop where we do the OLS fitting to estimate the time per evaluation.

n_evals increases by a factor of 1.1 each iteration, and in the example I gave it ends up reaching 8467170729913085952.0*1.1, where it throws an error because that value overflows Int64.
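The overflow described above can be reproduced in isolation. This is a minimal sketch, not the code in execute.jl: the variable names `growth`, `next_n`, and `overflowed` are made up for illustration, and only the value 8467170729913085952.0 and the factor 1.1 come from the thread.

```julia
# Last evaluation count that still converted cleanly, per the thread.
n_evals = 8467170729913085952.0
# Growth factor applied each iteration, per the thread.
growth = 1.1

# The next candidate count no longer fits in a signed 64-bit integer.
next_n = n_evals * growth
exceeds_int64 = next_n > typemax(Int64)

# Converting the current count works; converting the next one should throw.
current = ceil(Int64, n_evals)
overflowed = try
    ceil(Int64, next_n)
    false
catch err
    err isa InexactError
end

println("next_n exceeds typemax(Int64): ", exceeds_int64)
println("ceil(Int64, next_n) threw InexactError: ", overflowed)
```

The point of the sketch is that the conversion is the last thing to fail: the float keeps growing without complaint, and the error only surfaces when the count is turned back into an integer.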

I tried switching convert(Integer, n_evals) to convert(UInt64, n_evals). This allowed it to go through more iterations, but it ended in the same error.

If you run execute with verbose=true you see that b goes to 0 and the r^2 goes negative -- indicating problems in the regression.

Any ideas?

johnmyleswhite commented 9 years ago

How long did that take to run? It seems like we should never be able to reach anything close to that number of iterations before we need to rethink our analysis strategy.
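To put that number in perspective, here is a rough back-of-the-envelope calculation. The cost of 1 ns per evaluation is an assumption chosen for illustration, not a measurement from the thread; only the evaluation count is taken from the error report.

```julia
# Evaluation count from the error report in this thread.
n_evals = 8467170729913085952.0

# Assumed cost per evaluation (hypothetical): one nanosecond.
ns_per_eval = 1.0

# Total wall-clock time if every evaluation were really executed.
seconds = n_evals * ns_per_eval / 1e9
years = seconds / (365.25 * 24 * 60 * 60)

println("seconds: ", seconds)
println("years:   ", years)
```

Even at this optimistic cost, a single sample with that many evaluations would run for centuries, so reaching that count in practice means the evaluations were not actually being executed.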

sglyon commented 9 years ago

Yeah, it did seem crazy.

Turns out LLVM is the culprit. Consider this example:

julia> _f2(x,y) = x*y > 0.0
_f2 (generic function with 1 method)

julia> Benchmarks.@benchmark _f2(1.0, 4.13432)
num_evals = ceil(Integer,n_evals) = 2
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 4
num_evals = ceil(Integer,n_evals) = 4
num_evals = ceil(Integer,n_evals) = 4
num_evals = ceil(Integer,n_evals) = 5
num_evals = ceil(Integer,n_evals) = 5
num_evals = ceil(Integer,n_evals) = 6
num_evals = ceil(Integer,n_evals) = 6
num_evals = ceil(Integer,n_evals) = 7
num_evals = ceil(Integer,n_evals) = 7
num_evals = ceil(Integer,n_evals) = 8
num_evals = ceil(Integer,n_evals) = 9
...
MANY LINES REMOVED
...
num_evals = ceil(Integer,n_evals) = 8467170729913085952
ERROR: InexactError()
 in trunc at ./float.jl:362
 in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:100
 in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:42

julia> function _goo()
       out = false
       for i=1:8467170729913085952
           out = _f2(1.0, 3.41343)
       end

       out
       end
_goo (generic function with 1 method)

julia> @code_llvm _goo()

define i1 @julia__goo_22383() {
top:
  ret i1 true
}

For my overly simple function, it looks like the compiler was smart enough to realize that the result of the loop body is a compile-time constant, so it eliminates the loop and just returns true straight away.

When I have a less trivial, but equally fast function to benchmark, everything works smoothly.

I think this can be closed if you agree.

johnmyleswhite commented 9 years ago

Maybe we should retitle this issue? I think we need a coherent strategy for dealing with functions that get transformed into no-ops. A lot of the current design was chosen precisely so you could get "ecologically valid" timings like you'd get when the compiler makes its full optimization passes. But this is an edge case that I think we need to solve. Just not clear how yet.

johnmyleswhite commented 9 years ago

Another possible example:

julia> using Benchmarks

julia> @benchmark (1,2,3)

mbauman commented 9 years ago

This is fixed as long as we use a @noinline inner function.
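A minimal sketch of the idea, assuming the fix amounts to calling the benchmarked expression through a function the compiler may not inline. The names `inner` and `run_evals` are hypothetical and are not taken from Benchmarks.jl. Whether a given Julia version still folds such a call is not guaranteed; passing the arguments in at run time rather than as literals makes folding less likely.

```julia
# The function from the thread that was optimized away inside a loop.
_f2(x, y) = x * y > 0.0

# Hypothetical inner function: @noinline asks the compiler to keep this
# as a real call instead of inlining it into the timing loop.
@noinline inner(x, y) = _f2(x, y)

# Hypothetical timing loop: evaluates the inner function n times.
function run_evals(n::Integer, x, y)
    out = false
    for _ in 1:n
        out = inner(x, y)
    end
    return out
end

# Read the inputs from mutable containers so they are run-time values
# rather than literals written into the loop body.
x = Ref(1.0)
y = Ref(4.13432)

result = run_evals(1_000, x[], y[])
println("result: ", result)
```

To check whether the loop survives on a particular Julia version, inspect `@code_llvm run_evals(1_000, 1.0, 4.13432)` and look for a loop and a call rather than a bare `ret i1 true`.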