`@check_allocs` bypasses native Julia monomorphization limits

JuliaLang / AllocCheck.jl

`@check_allocs` bypasses native Julia monomorphization limits #62

Open topolarity opened 8 months ago

topolarity commented 8 months ago

using AllocCheck

@check_allocs sum_args(args...) = sum(args)
for x=1:1000
    v = collect(1:x)
    s = sum_args(v...)
    println("sum(1:$(x)) = ", s)
end

This code compiles 1000 different (non-allocating) copies of sum_args, one per argument count, whereas Julia would typically limit the expansion to just one or two extra arguments.

For comparison, this code without @check_allocs is many, many times faster:

@noinline sum_args(args...) = sum(args)
for x=1:1000
    v = collect(1:x)
    s = sum_args(v...)
    println("sum(1:$(x)) = ", s)
end
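
One way to see the limit in action is to count how many specializations Julia records for the plain version. The following is a minimal sketch, assuming Julia 1.10 or later, where the unexported internal helper `Base.specializations` is available. It is not part of AllocCheck, and both the helper and the counts it reports may change between Julia versions.

```julia
# Plain (un-instrumented) version from the comparison above.
@noinline sum_args(args...) = sum(args)

# Call it with 50 different argument counts.
for x in 1:50
    sum_args(collect(1:x)...)
end

# Assumption: Base.specializations (unexported, Julia 1.10+) iterates over the
# MethodInstances the compiler has recorded for a method.
m = only(methods(sum_args))
specs = collect(Base.specializations(m))

println(length(specs), " specialization(s) recorded for 50 distinct arities")
for mi in specs
    println("  ", mi.specTypes)  # signature each specialization covers
end
```

If the heuristic applies as described, the count should stay far below 50, with a `Vararg` signature covering the longer calls.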

topolarity commented 8 months ago

This leads to another case where using check_allocs(...) can be misleading.

You might expect that check_allocs(sum_args, (Int, Int, Int)) would check allocations for a call like sum_args(1,2,3):

julia> check_allocs(sum_args, (Int, Int, Int))
Any[] # Hooray, no allocations! ...Right?

In reality, the correct signature to check allocations for sum_args(1,2,3) is:

julia> check_allocs(sum_args, (Int, Vararg{Int}))
2-element Vector{Any}:
 Allocating runtime call to "jl_f_tuple" in unknown location

 Dynamic dispatch to function sum in ./REPL[82]:1

That signature depends on how the Julia compiler chooses to monomorphize though, which is neither stable nor documented. So you can't really anticipate this as a user, except in cases where the compiler is reasonably expected to fully monomorphize.
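
The relationship between the two signatures can be checked with ordinary type queries. This sketch uses only Base and makes no assumption about AllocCheck internals. It shows that the widened signature is an abstract tuple type covering every call with one or more `Int` arguments, which is consistent with the report above: code compiled for it cannot know the argument count statically.

```julia
# The signature a user would naturally write for sum_args(1, 2, 3).
concrete = Tuple{Int, Int, Int}

# The signature the compiler chose in the example above.
widened = Tuple{Int, Vararg{Int}}

# The concrete call signature is one instance of the widened one...
@assert concrete <: widened

# ...and so is every other arity of at least one Int.
@assert Tuple{Int} <: widened
@assert Tuple{Int, Int, Int, Int, Int} <: widened
@assert !(Tuple{} <: widened)

# Only the fixed-arity signature is a concrete type.
@assert isconcretetype(concrete)
@assert !isconcretetype(widened)

println("widened signature covers the 3-argument call: ", concrete <: widened)
```

Checking the fixed-arity signature therefore answers a question about a specialization that the compiler may never produce for that call.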

baggepinnen commented 8 months ago

Could check_allocs somehow be taught "how the Julia compiler chooses to monomorphize"?

"which is neither stable nor documented"

Can this choice vary between Julia sessions on the same Julia version, or even within the same Julia session?

topolarity commented 8 months ago

"Can this choice vary between Julia sessions on the same Julia version, or even within the same Julia session?"

Yeah, in general it can even depend on nearby methods:

julia> @noinline foo(args...) = error(args)

julia> bar1(x) = foo(x+1,x+2,x+3)
julia> bar1(0)
ERROR: (1, 2, 3)
Stacktrace:
 [1] error(s::Tuple{Int64, Int64, Int64})
   @ Base ./error.jl:44
 [2] foo(::Int64, ::Vararg{Int64})
   @ Main ./REPL[1]:1
 [3] bar1(x::Int64)
   @ Main ./REPL[2]:1
 [4] top-level scope
   @ REPL[3]:1

julia> foo(x::Int64, y::Int64) = error((x, y)) # Add an explicitly-expanded definition

julia> bar2(x) = foo(x+1,x+2,x+3)
julia> bar2(0)
ERROR: (1, 2, 3)
Stacktrace:
 [1] error(s::Tuple{Int64, Int64, Int64})
   @ Base ./error.jl:44
 [2] foo(::Int64, ::Int64, ::Int64) # foo was more heavily-specialized now
   @ Main ./REPL[1]:1
 [3] bar2(x::Int64)
   @ Main ./REPL[5]:1
 [4] top-level scope
   @ REPL[6]:1

The explicitly-expanded definition affects specialization, even though it's never called.

JeffBezanson commented 8 months ago

You can call jl_normalize_to_compilable_sig to get the signature that we would compile for.