Open topolarity opened 8 months ago
This leads to another case where using check_allocs(...) can be misleading.
You might expect that check_allocs(sum_args, (Int, Int, Int)) would check allocations for a call like sum_args(1,2,3):
julia> check_allocs(sum_args, (Int, Int, Int))
Any[] # Hooray, no allocations! ...Right?
In reality, the correct signature to check allocations for sum_args(1,2,3) is:
julia> check_allocs(sum_args, (Int, Vararg{Int}))
2-element Vector{Any}:
Allocating runtime call to "jl_f_tuple" in unknown location
Dynamic dispatch to function sum in ./REPL[82]:1
That signature depends on how the Julia compiler chooses to monomorphize though, which is neither stable nor documented. So you can't really anticipate this as a user, except in cases where the compiler is reasonably expected to fully monomorphize.
Could check_allocs somehow be taught how the Julia compiler chooses to monomorphize?
"which is neither stable nor documented"
Can this choice vary between Julia sessions on the same Julia version, or even within the same Julia session?
Yeah, in general it can even depend on nearby methods:
julia> @noinline foo(args...) = error(args)
julia> bar1(x) = foo(x+1,x+2,x+3)
julia> bar1(0)
ERROR: (1, 2, 3)
Stacktrace:
[1] error(s::Tuple{Int64, Int64, Int64})
@ Base ./error.jl:44
[2] foo(::Int64, ::Vararg{Int64})
@ Main ./REPL[1]:1
[3] bar1(x::Int64)
@ Main ./REPL[2]:1
[4] top-level scope
@ REPL[3]:1
julia> foo(x::Int64, y::Int64) = error((x, y)) # Add an explicitly-expanded definition
julia> bar2(x) = foo(x+1,x+2,x+3)
julia> bar2(0)
ERROR: (1, 2, 3)
Stacktrace:
[1] error(s::Tuple{Int64, Int64, Int64})
@ Base ./error.jl:44
[2] foo(::Int64, ::Int64, ::Int64) # foo was more heavily-specialized now
@ Main ./REPL[1]:1
[3] bar2(x::Int64)
@ Main ./REPL[5]:1
[4] top-level scope
@ REPL[6]:1
The explicitly-expanded definition affects specialization, even though it's never called.
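One way to see which signature the compiler actually picked, without reading stack traces, is to list the specializations cached on the method. The sketch below assumes Julia 1.10 or later, where the unexported (and therefore not guaranteed stable) Base.specializations is available. It re-creates foo and bar1 from the transcript above in a fresh session. Judging from the first stack trace, one would expect a Vararg signature to show up here, but the exact output depends on the Julia version and on which other methods of foo exist.

```julia
# Illustrative sketch only; Base.specializations is an internal helper (assumed Julia 1.10+).
@noinline foo(args...) = error(args)
bar1(x) = foo(x + 1, x + 2, x + 3)

# Run the call once so the compiler picks (and caches) a specialization of foo.
try
    bar1(0)
catch err
    err isa ErrorException || rethrow()
end

# Collect the signatures that foo has been specialized for so far.
foo_method = only(methods(foo))
compiled_sigs = [mi.specTypes for mi in Base.specializations(foo_method)]
foreach(println, compiled_sigs)
```

The signatures printed here are the ones that would need to be passed to check_allocs to check the code that actually runs for this call.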
You can call jl_normalize_to_compilable_sig to get the signature that we would compile for.
This code will compile 1000 different (non-allocating) copies of foo, whereas Julia would typically limit the expansion to just one or two extra arguments. For comparison, this code without @check_allocs is many, many times faster: