Closed sglyon closed 9 years ago
Did some digging. It looks like the problem has nothing to do with storing the UInt64 from time_ns, but actually happens in the loop where we do the OLS fitting to estimate the time per evaluation.
The n_evals increases by a factor of 1.1 each iteration, and in the example I gave it ends up getting to 8467170729913085952.0*1.1, where it throws an error because that overflows Int64.
I tried switching convert(Integer, n_evals) to convert(Uint64, n_evals). This allowed it to go through more iterations, but it also ended in the same error.
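The overflow arithmetic described above can be checked directly. This is a minimal sketch, assuming current Julia syntax rather than the v0.4 used in the thread; next_evals and steps_until_overflow are illustrative names, not part of Benchmarks.jl.

```julia
# Last value printed before the error in the thread.
n_evals = 8467170729913085952.0
next_evals = n_evals * 1.1

# The next step no longer fits in an Int64.
exceeds_int64 = next_evals > typemax(Int64)

# Converting it raises an InexactError, matching the reported failure.
threw_inexact = try
    ceil(Int64, next_evals)
    false
catch err
    err isa InexactError
end

# UInt64 only doubles the headroom, so 1.1x growth runs out again soon.
# Counts the multiplications by 1.1 needed to exceed typemax(T).
function steps_until_overflow(start, T)
    n = start
    steps = 0
    while n <= typemax(T)
        n *= 1.1
        steps += 1
    end
    return steps
end

extra_steps = steps_until_overflow(next_evals, UInt64)
```

Because the growth is geometric, a wider integer type only buys a handful of extra iterations, which is consistent with the UInt64 attempt failing in the same way.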
If you run execute with verbose=true, you see that b goes to 0 and the r^2 goes negative, indicating problems in the regression.
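To illustrate why the slope collapses, here is a rough sketch of an ordinary least squares fit of elapsed time against n_evals when the measured time does not depend on the number of evaluations. The ols helper and the constant 40.0 ns timing are assumptions made for illustration; this is not the regression code in Benchmarks.jl, and it does not try to reproduce the negative r^2.

```julia
using Statistics

# Plain OLS with an intercept: y ≈ a + b * x.
function ols(x, y)
    xbar, ybar = mean(x), mean(y)
    b = sum((x .- xbar) .* (y .- ybar)) / sum((x .- xbar) .^ 2)
    a = ybar - b * xbar
    return a, b
end

# Evaluation counts growing by a factor of 1.1 per iteration.
x = [1.1^k for k in 1:50]

# Hypothetical timings that stay flat no matter how many evaluations ran.
y = fill(40.0, 50)

a, b = ols(x, y)
```

With flat timings the estimated per-evaluation cost b is zero, so the fit never reaches a state the stopping rule would accept and n_evals keeps growing.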
Any ideas?
How long did that take to run? It seems like we should never be able to reach anything close to that number of iterations before we need to rethink our analysis strategy.
Yeah, it did seem crazy.
Turns out LLVM is the culprit. Consider this example:
julia> _f2(x,y) = x*y > 0.0
_f2 (generic function with 1 method)
julia> Benchmarks.@benchmark _f2(1.0, 4.13432)
num_evals = ceil(Integer,n_evals) = 2
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 3
num_evals = ceil(Integer,n_evals) = 4
num_evals = ceil(Integer,n_evals) = 4
num_evals = ceil(Integer,n_evals) = 4
num_evals = ceil(Integer,n_evals) = 5
num_evals = ceil(Integer,n_evals) = 5
num_evals = ceil(Integer,n_evals) = 6
num_evals = ceil(Integer,n_evals) = 6
num_evals = ceil(Integer,n_evals) = 7
num_evals = ceil(Integer,n_evals) = 7
num_evals = ceil(Integer,n_evals) = 8
num_evals = ceil(Integer,n_evals) = 9
...
MANY LINES REMOVED
...
num_evals = ceil(Integer,n_evals) = 8467170729913085952
ERROR: InexactError()
in trunc at ./float.jl:362
in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:100
in execute at /Users/sglyon/.julia/v0.4/Benchmarks/src/execute.jl:42
julia> function _goo()
           out = false
           for i=1:8467170729913085952
               out = _f2(1.0, 3.41343)
           end
           out
       end
_goo (generic function with 1 method)
julia> @code_llvm _goo()
define i1 @julia__goo_22383() {
top:
ret i1 true
}
For my overly simple function, it looks like the compiler was smart enough to realize that the result of the code in the loop was deterministic, so it just returns true straight away.
When I have a less trivial, but equally fast function to benchmark, everything works smoothly.
I think this can be closed if you agree.
Maybe we should retitle this issue? I think we need a coherent strategy for dealing with functions that get transformed into no-ops. A lot of the current design was chosen precisely so you could get "ecologically valid" timings like you'd get when the compiler makes its full optimization passes. But this is an edge case that I think we need to solve. Just not clear how yet.
Another possible example:
julia> using Benchmarks
julia> @benchmark (1,2,3)
This is fixed as long as we use a @noinline inner function.
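A minimal sketch of the @noinline idea, adapted from the _goo example earlier in the thread. The names f2_noinline and goo_noinline are hypothetical, and whether the loop actually survives optimization depends on the Julia version, so it is worth inspecting the output of @code_llvm goo_noinline(10) rather than assuming it does.

```julia
# Marking the inner function @noinline asks the compiler not to inline the
# call into the loop body, which is intended to stop the loop from being
# folded into a constant.
@noinline f2_noinline(x, y) = x * y > 0.0

function goo_noinline(n)
    out = false
    for i in 1:n
        out = f2_noinline(1.0, 3.41343)
    end
    return out
end

result = goo_noinline(1_000)
```

The loop bound is passed as an argument here so the example runs quickly; the original example hard-coded 8467170729913085952.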
Accidentally posted on johnmyleswhite/Benchmark.jl#28
Original post
An example: