JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.52k stars 5.46k forks source link

Common code not included in optionally-generated function #55101

Open martinholters opened 2 months ago

martinholters commented 2 months ago

Consider the example from the metaprogramming docs:

julia> function sub2ind_gen_impl(dims::Type{T}, I...) where T <: NTuple{N,Any} where N
           ex = :(I[$N] - 1)
           for i = (N - 1):-1:1
               ex = :(I[$i] - 1 + dims[$i] * $ex)
           end
           return :($ex + 1)
       end;

julia> function sub2ind_gen_fallback(dims::NTuple{N}, I) where N
           ind = I[N] - 1
           for i = (N - 1):-1:1
               ind = I[i] - 1 + dims[i]*ind
           end
           return ind + 1
       end;

julia> function sub2ind_gen(dims::NTuple{N}, I::Integer...) where N
           length(I) == N || error("partial indexing is unsupported")
           if @generated
               return sub2ind_gen_impl(dims, I...)
           else
               return sub2ind_gen_fallback(dims, I)
           end
       end;

julia> sub2ind_gen((3, 5), 1, 2)
4

With this explanation: https://github.com/JuliaLang/julia/blob/ad407a6d2198c999f8f7b48a85d190694e392eb5/doc/src/manual/metaprogramming.md?plain=1#L1482-L1484

But that seems to be wrong. The error check is not run (in the generated case):

julia> sub2ind_gen((3, 5), 1)
ERROR: BoundsError: attempt to access Tuple{Int64} at index [2]
Stacktrace:
 [1] getindex
   @ ./tuple.jl:31 [inlined]
 [2] macro expansion
   @ ./REPL[3]:0 [inlined]
 [3] sub2ind_gen(dims::Tuple{Int64, Int64}, I::Int64)
   @ Main ./REPL[3]:1
 [4] top-level scope
   @ REPL[5]:1

julia> sub2ind_gen((3, 5), 1, 2, 3)
4

Interestingly, it seems this hasn't regressed, but has just never worked. It's somewhat surprising this has gone unnoticed for so long - or it hasn't and this can be closed as a duplicate of an issue I've failed to find.

martinholters commented 2 months ago

A couple of cases where base includes such apparently ineffective error-checks: EDIT: These are all ok and work as intended it seems, see next comment. https://github.com/JuliaLang/julia/blob/ad407a6d2198c999f8f7b48a85d190694e392eb5/base/combinatorics.jl#L46-L48 https://github.com/JuliaLang/julia/blob/ad407a6d2198c999f8f7b48a85d190694e392eb5/base/multidimensional.jl#L1184-L1198 https://github.com/JuliaLang/julia/blob/ad407a6d2198c999f8f7b48a85d190694e392eb5/base/ntuple.jl#L69-L72 https://github.com/JuliaLang/julia/blob/ad407a6d2198c999f8f7b48a85d190694e392eb5/base/ntuple.jl#L80-L84 The one multidimensional.jl seems particularly problematic as the @generated branch applies @inbounds, but the bounds checks are not actually included in the code. Fortunately, the @generated branch also accesses the variables defined in to common code which fails, causing fall-back to the non-@generated branch.

martinholters commented 2 months ago

Ah, it's a matter of the return in the @generated branch:

julia> function foo()
           error("oops")
           if @generated
               return :(nothing)
           else
               nothing
           end
       end
foo (generic function with 1 method)

julia> function bar()
           error("oops")
           if @generated
               :(nothing)
           else
               nothing
           end
       end
bar (generic function with 1 method)

julia> foo()

julia> bar()
ERROR: oops

So maybe the behavior is correct, it's just the documentation that needs adjustment? But having the explicit return prevent inclusion of the common block appears to be a massive foot-gun. Could that be a feature, or is it a pure mis-feature? (Good news, none of the uses I cited above have this return, so they should all be ok.)

Relatedly, if there are multiple if @generated, only the first returned expression will be included in the generated code. Only if no expression is explicitly returned, all of them are concatenated. So this is consistent, but to me looks neither intuitive nor useful.

MilesCranmer commented 2 months ago

I think this a bug. The presence of an explicit return on the last expression should not affect the behavior, since it doesn't affect behavior anywhere else. Some Julia packages even assume that using return on the last line is a stylistic choice – for example, JuliaFormatter even has a setting to rewrite this https://domluna.github.io/JuliaFormatter.jl/dev/#always_use_return

adienes commented 2 months ago

there is an enormous number of footguns and wats on return. see also https://github.com/JuliaLang/julia/issues/50415

martinholters commented 2 months ago

What happens, I believe, is this: Consider code like

function f()
    common_code
    if @generated
        if_generated_code
    else
        if_non_generated_code
    end
end

The generator basically becomes Expr(:block, :(common_code), if_generated_code). Now if the if_generated_code contains (unquoted!) return, that returns straight out of the generator. E.g. for foo and bar as above, we have

julia> code_lowered(first(methods(foo)).generator.gen)
1-element Vector{Core.CodeInfo}:
 CodeInfo(
1 ─      nothing
│   %2 = $(Expr(:copyast, :($(QuoteNode(:(error("oops")))))))
└──      return :nothing
2 ─ %4 = Core._expr(:block, $(QuoteNode(:(#= REPL[1]:2 =#))), %2, $(QuoteNode(:(#= REPL[1]:3 =#))), nothing)
└──      return %4
)

julia> code_lowered(first(methods(bar)).generator.gen)
1-element Vector{Core.CodeInfo}:
 CodeInfo(
1 ─      nothing
│   %2 = $(Expr(:copyast, :($(QuoteNode(:(error("oops")))))))
│   %3 = Core._expr(:block, $(QuoteNode(:(#= REPL[2]:2 =#))), %2, $(QuoteNode(:(#= REPL[2]:3 =#))), :nothing)
└──      return %3
)

I think the final nothing in %4 in the former is actually %3, i.e. the result of the return, so yes, this is closely related to #50415 indeed.

adienes commented 1 month ago

surely this is a bug? why remove the labels? the docs explicitly say

Notice that we added an error check to the top of the function. This code will be common to both versions, and is run-time code in both versions (it will be quoted and returned as an expression from the generated version).

which is not matched by the implementation

vtjnash commented 1 month ago

The @generated is just a macro, and like other macros, this is just a common gotcha that return can skip followup code

adienes commented 1 month ago

I hear you --- but it's not really "followup code" imo --- the assertions here lexically precede the return.