Closed johnnychen94 closed 4 years ago
If you come up with a more "inspectable" benchmark (e.g., a manually-written loop) that still shows the same weird effect, then we can analyze why it happens and if necessary report it as a Julia issue.
I'm extremely unsure of it being a benchmark noise, in my local macOS
# without T
julia> for _ in 1:10
t= @belapsed getindex($Ap, 1, 1);
println(t)
end
2.473e-9
3.292e-9
3.294e-9
3.289e-9
3.289e-9
3.292e-9
3.298e-9
3.283e-9
3.291e-9
3.291e-9
# with T
julia> for _ in 1:10
t= @belapsed getindex($Ap, 1, 1);
println(t)
end
2.4710000000000003e-9
2.476e-9
2.476e-9
2.4740000000000003e-9
2.473e-9
2.4710000000000003e-9
2.476e-9
2.4780000000000003e-9
2.476e-9
2.476e-9
and in my remote linux server:
julia> versioninfo()
Julia Version 1.5.2
Commit 539f3ce943 (2020-09-23 23:17 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, haswell)
Environment:
JULIA_DEPOT_PATH = /home/jc/local/.julia
JULIA_EDITOR = vim
JULIA_NUM_THREADS = 4
JULIA_PKG_SERVER = https://mirrors.lflab.cn/julia
# without
julia> for _ in 1:10
t= @belapsed getindex($Ap, 1, 1);
println(t)
end
4.037e-9
3.672e-9
3.675e-9
3.675e-9
3.671e-9
3.672e-9
3.674e-9
3.669e-9
3.674e-9
3.907e-9
# with
julia> for _ in 1:10
t= @belapsed getindex($Ap, 1, 1);
println(t)
end
2.7690000000000003e-9
2.762e-9
2.766e-9
2.758e-9
2.774e-9
2.76e-9
2.9409999999999997e-9
2.759e-9
2.768e-9
2.7570000000000003e-9
I've tested it on 1.0.5, 1.1.1, 1.2.0, 1.3.1, 1.4.2, 1.5.2 and 1.6.0-DEV.1177. The only version I can't see a difference is 1.0.5.
I'll try and see if I can make more reproducible results on other examples.
Still unknown to me and is not reproducible in other similar cases.
Running out of time, might just leave it as noise and not investigate anymore.
Sounds good. Perhaps I should have explained that the reason this didn't make sense to me is that a return-type annotation causes Julia's lowering to change foo(args..)
to convert(T, foo(args...))::T
. In cases where foo
is poorly-inferred, this can be very helpful because later code can be specialized on ::T
rather than having to be crafted to allow ::Any
(for example, later calls can be made by compile-time dispatch rather than runtime-dispatch).
But when possible, generally a much better strategy is to make foo
itself inferrable. When foo
is inferred to return ::T
---the same T
as in the return-type annotation---then the optimization phase of the compiler should just elide the convert
and type-assert. Which, of course, is the same as not having the return-type annotation at all.
I'm not that confident about the benchmark result, though. Probably I'm just benchmarking the cached performance here... but I guess it still explains some differences.
Unsure the cause here, adding
T
notation gives a better performance formissing
case: