Feature request: concrete-only dispatch

timholy commented 2 years ago

Invalidation is the major threat to useful precompilation, and aside from outright piracy (which is rare) it overwhelmingly comes from poorly-inferred calls that are handled without using runtime dispatch. For example, for method definitions

f(::Int) = 1
f(::String) = "hello"
g(c) = f(c[1])

a check with @code_typed reveals that g(Any[0]) performs "world-splitting" to look up the callees in advance. The problem arises for a package that defines a new method of f, forcing invalidation of the previously-compiled code for g. There are analogs of this behavior for packages that define new subtypes of abstract types, too.

Currently, we can use @invokelatest to force runtime dispatch at the call site. However, it would be nice to be able to say "use runtime dispatch unless the argument types are known concretely, in which case you can use ordinary dispatch." Moreover, it might be nice to be able to mark certain methods as being inferrable only for concrete arg types, so that you avoid any kind of abstract inference (which is itself slow as well as introducing vulnerability to invalidation). A possible demo of syntax:

To mark a call as requiring runtime dispatch if any of the arguments are not concretely-inferred:

x, y = @concrete(f(a, b, c))

To mark a method as dispatchable only with concretely-inferred argtypes:

@concrete f(a, b, c) = ...

Discussed with @aviatesk

aviatesk commented 2 years ago

Maybe better to begin with a definition site annnotation, as this is somewhat very similar to @nospecialize. Some implementation ideas:

widen the return type inference to Any and avoid adding backedge if the method is annotated as @concrete but the signature isn't fully concrete
prevents inlining entirely of non-concrete callsites of a @concrete-method (even if there is only a single matching method)
we also need to think about how to deal with @nospecialize-arguments?
```
@concrete f(a, @nospecialize b) = ...
```
a is required to be concrete, but we should also require b to be concrete for this case?

aviatesk commented 2 years ago

We talked about this idea on a call and wondered if we really need new annotation for this as we already have some tools for suppressing inference (like module-wide Base.Experimental.@optlevel and method-wise Base.Experimental.@max_methods annotations). As for #45051, Jameson said we may want to tweak method lookup algorithm instead to fix the invalidations reported there.

timholy commented 2 years ago

Wouldn't that have to be done at the time of g definition? What happens, then, if a package adds a new f method but g is defined in Base?

JeffBezanson commented 2 years ago

The method definition annotation makes the most sense to me; you're basically hinting "this code does not benefit from abstract inference", which can reasonably be considered local to that definition. There's a bit of risk of worsening types in callers, but it's not much worse than module-level max_methods.

The call site annotation is somewhat ok, since it refers to the arguments (which are owned by the caller), but I think harder to use and worse from a code-uglification viewpoint.

a is required to be concrete, but we should also require b to be concrete for this case?

Strictly speaking I think yes, since nospecialize has always referred to compilation instead of inference.

StefanKarpinski commented 2 years ago

What about generating code that wouldn't be invalidated by additional methods? Ie it works for the concrete types that exist at compilation time but has a branch for other types of they happen? Or am I misunderstanding the problem?

timholy commented 2 years ago

We kind of do that now, but we aggressively recompile methods to "re-split the world" when new methods are defined. There's a walk-through of this in https://julialang.org/blog/2020/08/invalidations/#an_example_compilation_for_containers_with_unknown_element_types and the "Triggering method invalidation" section thereafter. (In that example all the added methods return an Int; it's instructive to modify that demo by defining a method that doesn't, like f(::Bool) = "hello", and watch how clever our codegen is for such cases.) We do this to enhance runtime performance, and I don't doubt there are cases where it's hugely helpful.

Once there are >3 applicable methods we stop trying to apply such optimizations. At that point, the compiled code indeed becomes resistant to invalidation, as you suggest. The only real problem is that we don't currently have a way of signaling that we should just generate this form of code at the outset, and here I'm proposing that we really need this if precompilation is to be successful in the way I think we'd like it to be.

To be clear, while it would take me some digging to figure out how to implement this (I'm grateful for @aviatesk's self-assignment!), for someone with the right expertise I think/hope this won't be a hugely-difficult thing to implement. But because we already have a growing stable of manual annotations (@inline, @noinline @constprop, @nospecialize, ...), before adding yet more it's worth pondering over whether there are potential alternatives.

timholy commented 2 years ago

@StefanKarpinski's point makes me think: how different would this be in practice from having a per-method MAX_METHODS setting?

aviatesk commented 2 years ago

Having spent a bit more time on this, I think a method-wise max_method configuration might be a best current option for us to have a reasonable performance for well-inferred code while avoiding invalidations in poorly-inferred situations. A detailed explanation follows.

For the sake of explanations below, I'd like to use this utility AbstractInterpreter, which behaves almost same as the NativeInterpreter but reports invalidations verbosely:

[CLICKME] InvalidationExplorer.jl

```julia const CC = Core.Compiler import Core: MethodInstance, CodeInstance import .CC: WorldRange, WorldView function add_callback!(linfo, dict) invalidate! = geninvalidate!(dict) if !isdefined(linfo, :callbacks) linfo.callbacks = Any[invalidate!] else callbacks = linfo.callbacks::Vector{Any} if !any(function (@nospecialize(cb),) cb === invalidate! end, callbacks) push!(callbacks, invalidate!) end end return nothing end function geninvalidate!(interp) return function invalidate!(replaced, max_world, depth = 0) delete!(interp.cache.dict, replaced) name = nameof(typeof(interp)) Core.println('[', name, ']', " invalidated: ", replaced) if isdefined(replaced, :backedges) for mi in replaced.backedges mi = mi::MethodInstance invalidate!(mi, max_world, depth+1) end end return nothing end end struct InvalidationExplorerCache dict::IdDict{MethodInstance,CodeInstance} end const GLOBAL_CACHE = InvalidationExplorerCache(IdDict{MethodInstance,CodeInstance}()) struct InvalidationExplorer <: CC.AbstractInterpreter interp::CC.NativeInterpreter cache::InvalidationExplorerCache InvalidationExplorer(world = Base.get_world_counter(); interp = CC.NativeInterpreter(world), cache = GLOBAL_CACHE, ) = new(interp, cache) end CC.InferenceParams(interp::InvalidationExplorer) = CC.InferenceParams(interp.interp) CC.OptimizationParams(interp::InvalidationExplorer) = CC.OptimizationParams(interp.interp) CC.get_world_counter(interp::InvalidationExplorer) = CC.get_world_counter(interp.interp) CC.get_inference_cache(interp::InvalidationExplorer) = CC.get_inference_cache(interp.interp) CC.code_cache(interp::InvalidationExplorer) = WorldView(interp.cache, WorldRange(CC.get_world_counter(interp))) CC.get(wvc::WorldView{<:InvalidationExplorerCache}, mi::MethodInstance, default) = get(wvc.cache.dict, mi, default) CC.getindex(wvc::WorldView{<:InvalidationExplorerCache}, mi::MethodInstance) = getindex(wvc.cache.dict, mi) CC.haskey(wvc::WorldView{<:InvalidationExplorerCache}, mi::MethodInstance) = haskey(wvc.cache.dict, mi) CC.setindex!(wvc::WorldView{<:InvalidationExplorerCache}, ci::CodeInstance, mi::MethodInstance) = setindex!(wvc.cache.dict, ci, mi) function CC.cache_result!(interp::InvalidationExplorer, result::CC.InferenceResult) add_callback!(result.linfo, interp) Base.@invoke CC.cache_result!(interp::CC.AbstractInterpreter, result::CC.InferenceResult) end ```

[CLICKME] A showcase of the usage

The examples were adapted from [the blog post](https://julialang.org/blog/2020/08/invalidations/#triggering_method_invalidation): ```julia julia> f(x::Int) = 1 f (generic function with 1 method) julia> applyf(container) = f(container[1]) applyf (generic function with 1 method) julia> @code_typed interp=InvalidationExplorer() applyf(Any[100]) CodeInfo( 1 ─ %1 = Base.arrayref(true, container, 1)::Any │ %2 = (isa)(%1, Int64)::Bool └── goto #3 if not %2 2 ─ goto #4 3 ─ %5 = Main.f(%1)::Int64 └── goto #4 4 ┄ %7 = φ (#2 => 1, #3 => %5)::Int64 └── return %7 ) => Int64 julia> @code_typed interp=InvalidationExplorer() applyf([true]) CodeInfo( 1 ─ %1 = Base.arrayref(true, container, 1)::Bool │ Main.f(%1)::Union{} └── unreachable ) => Union{} julia> f(::Bool) = 2 [:InvalidationExplorer] invalidated: applyf(Array{Any, 1}) from applyf(Any) [:InvalidationExplorer] invalidated: applyf(Array{Bool, 1}) from applyf(Any) f (generic function with 2 methods) julia> @code_typed interp=InvalidationExplorer() applyf([true]) CodeInfo( 1 ─ Base.arrayref(true, container, 1)::Bool └── return 2 ) => Int64 julia> @code_typed interp=InvalidationExplorer() applyf(Any[true]) CodeInfo( 1 ─ %1 = Base.arrayref(true, container, 1)::Any │ %2 = (isa)(%1, Int64)::Bool └── goto #3 if not %2 2 ─ goto #6 3 ─ %5 = (isa)(%1, Bool)::Bool └── goto #5 if not %5 4 ─ goto #6 5 ─ %8 = Main.f(%1)::Int64 └── goto #6 6 ┄ %10 = φ (#2 => 1, #4 => 2, #5 => %8)::Int64 └── return %10 ) => Int64 julia> f(::String) = 3 [:InvalidationExplorer] invalidated: applyf(Array{Any, 1}) from applyf(Any) [:InvalidationExplorer] invalidated: applyf(Array{Any, 1}) from applyf(Any) f (generic function with 3 methods) julia> f(::Dict) = 4 f (generic function with 4 methods) julia> @code_typed interp=InvalidationExplorer() applyf(Any[true]) CodeInfo( 1 ─ %1 = Base.arrayref(true, container, 1)::Any │ %2 = Main.f(%1)::Any └── return %2 ) => Any julia> f(::Nothing) = 5 # no more invalidation f (generic function with 5 methods) ```

In order to discuss the situation, it might be helpful to first understand that there are two different places where we need to account for invalidations:

(IPO context) when abstract interpretation figures out return type of a call: new method can refine return type inference
(local context) when the optimizers inlines callees: new method can make union-split transformation invalid

And since the local optimization (2) really relies on the preceded interprocedural abstract interpretation (1), it is most easier to suppress (1) when trying o avoid the risk of invalidation[^1]. [^1]: Technically, we can encounter a situation where we need to account for invalidations imposed (2) but not for (1), e.g. when abstract interpretation figures out the the precise return type is Any but still also figures out there are only few numbers of methods. But I think this kind of situation happens very rarely and even non-deterministically.

Ie it works for the concrete types that exist at compilation time but has a branch for other types of they happen?

We kind of do this already to some extent. E.g. we don't need to account for invalidations from (1) when abstract interpretation figures out a return value isn't used at all and so it would be okay to discard the return type information so that no refinement won't happen in the future. In the example below the inlining optimization is still enabled using the few concrete information available at the time of the first compilation. Note that union-split generates a fallback branch for dynamic dispatch and thus the invalidation risks from (2) is resilient against a new method definition:

julia> @noinline onlyunionsplit(::Int) = println(:Int)
onlyunionsplit (generic function with 1 method)

julia> @noinline onlyunionsplit(::String) = println("string")
onlyunionsplit (generic function with 2 methods)

julia> @inline call_onlyunionsplit(xs) = (onlyunionsplit(xs[1]); nothing) # the result of `onlyunionsplit(xs[1])` isn't used, and so IPO abstract interpretation can just mark it as `::Any`-typed (and so no possible refinement in the future)
call_onlyunionsplit (generic function with 1 method)

julia> code_typed((Vector{Any},); interp=InvalidationExplorer()) do xs
           call_onlyunionsplit(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arrayref(true, xs, 1)::Any
│   %2  = Main.onlyunionsplit::typeof(onlyunionsplit)
│   %3  = (isa)(%1, Int64)::Bool
└──       goto #3 if not %3
2 ─ %5  = π (%1, Int64)
│         invoke %2(%5::Int64)::Any
└──       goto #6
3 ─ %8  = (isa)(%1, String)::Bool
└──       goto #5 if not %8
4 ─ %10 = π (%1, String)
│         invoke %2(%10::String)::Any
└──       goto #6
5 ─       Main.onlyunionsplit(%1)::Any
└──       goto #6
6 ┄ %15 = Main.nothing::Nothing
└──       goto #7
7 ─       return %15
) => Nothing

julia> @noinline onlyunionsplit(::Float64) = println(:Float64) # no invalidation happens here
onlyunionsplit (generic function with 3 methods)

julia> code_typed((Vector{Any},); interp=InvalidationExplorer()) do xs # we get the same output as before
           call_onlyunionsplit(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arrayref(true, xs, 1)::Any
│   %2  = Main.onlyunionsplit::typeof(onlyunionsplit)
│   %3  = (isa)(%1, Int64)::Bool
└──       goto #3 if not %3
2 ─ %5  = π (%1, Int64)
│         invoke %2(%5::Int64)::Any
└──       goto #6
3 ─ %8  = (isa)(%1, String)::Bool
└──       goto #5 if not %8
4 ─ %10 = π (%1, String)
│         invoke %2(%10::String)::Any
└──       goto #6
5 ─       Main.onlyunionsplit(%1)::Any
└──       goto #6
6 ┄ %15 = Main.nothing::Nothing
└──       goto #7
7 ─       return %15
) => Nothing

@StefanKarpinski's point makes me think: how different would this be in practice from having a per-method MAX_METHODS setting?

The per-method MAX_METHODS directly configures IPO abstract interpretation (1) and so would be useful to enable abstract interpretation/optimizations only for well-inferred code and avoid invalidations for other poorly-inferred code. And I think Base.Experimental.@max_methods 1 function f end could achieve most parts of the semantics for @concrete function f end discussed above:

julia> function f3 end # default configuration

julia> f3(::Int) = 1
f3 (generic function with 1 method)

julia> f3(::String) = "hello"
f3 (generic function with 2 methods)

julia> @inline g3(xs) = f3(xs[1])
g3 (generic function with 1 method)

julia> code_typed((Vector{Int},); interp=InvalidationExplorer()) do xs # fully optimized for fully-inferred code
           g3(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─     Base.arrayref(true, xs, 1)::Int64
└──     return 1
) => Int64

julia> code_typed((Vector{Any},); interp=InvalidationExplorer()) do xs # still optimized for poorly-inferred code
           g3(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arrayref(true, xs, 1)::Any
│   %2  = (isa)(%1, Int64)::Bool
└──       goto #3 if not %2
2 ─       goto #6
3 ─ %5  = (isa)(%1, String)::Bool
└──       goto #5 if not %5
4 ─       goto #6
5 ─ %8  = Main.f3(%1)::Union{Int64, String}
└──       goto #6
6 ┄ %10 = φ (#2 => 1, #4 => "hello", #5 => %8)::Union{Int64, String}
└──       goto #7
7 ─       return %10
) => Union{Int64, String}

julia> f3(::Float64) = 2. # invalidation happens here
[:InvalidationExplorer] invalidated: g3(Array{Any, 1}) from g3(Any)
[:InvalidationExplorer] invalidated: #8(Array{Any, 1}) from #8(Any)
f3 (generic function with 3 methods)

julia> code_typed((Vector{Any},); interp=InvalidationExplorer()) do xs
           g3(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arrayref(true, xs, 1)::Any
│   %2  = (isa)(%1, Int64)::Bool
└──       goto #3 if not %2
2 ─       goto #8
3 ─ %5  = (isa)(%1, String)::Bool
└──       goto #5 if not %5
4 ─       goto #8
5 ─ %8  = (isa)(%1, Float64)::Bool
└──       goto #7 if not %8
6 ─       goto #8
7 ─ %11 = Main.f3(%1)::Union{Float64, Int64, String}
└──       goto #8
8 ┄ %13 = φ (#2 => 1, #4 => "hello", #6 => 2.0, #7 => %11)::Union{Float64, Int64, String}
└──       goto #9
9 ─       return %13
) => Union{Float64, Int64, String}

julia> Base.Experimental.@max_methods 1 function f1 end # @concrete configuration
0x01

julia> f1(::Int) = 1
f1 (generic function with 1 method)

julia> f1(::String) = "hello"
f1 (generic function with 2 methods)

julia> @inline g1(xs) = f1(xs[1])
g1 (generic function with 1 method)

julia> code_typed((Vector{Int},); interp=InvalidationExplorer()) do xs # fully optimized for fully-inferred code
           g1(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─     Base.arrayref(true, xs, 1)::Int64
└──     return 1
) => Int64

julia> code_typed((Vector{Union{Int,String}},); interp=InvalidationExplorer()) do xs # fully optimized for concrete-union-split code
           g1(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arrayref(true, xs, 1)::Union{Int64, String}
│   %2  = (isa)(%1, Int64)::Bool
└──       goto #3 if not %2
2 ─       goto #6
3 ─ %5  = (isa)(%1, String)::Bool
└──       goto #5 if not %5
4 ─       goto #6
5 ─       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──       unreachable
6 ┄ %10 = φ (#2 => 1, #4 => "hello")::Union{Int64, String}
└──       goto #7
7 ─       return %10
) => Union{Int64, String}

julia> code_typed((Vector{Any},); interp=InvalidationExplorer()) do xs # not optimized for poorly-inferred code
           g1(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.arrayref(true, xs, 1)::Any
│   %2 = Main.f1(%1)::Any
└──      return %2
) => Any

julia> f1(::Float64) = 2. # no invalidation!
f1 (generic function with 3 methods)

julia> code_typed((Vector{Any},); interp=InvalidationExplorer()) do xs
           g1(xs)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.arrayref(true, xs, 1)::Any
│   %2 = Main.f1(%1)::Any
└──      return %2
) => Any

vtjnash commented 2 years ago

The max_methods setting is already per function type, so we possibly already have that? The main difference being we don't currently consider it a problem to pass abstract types to those methods, as long as we only select one method body. Perhaps max_methods should apply internal to the method (callee vs caller effect)? I think that is what you already said above, but I am just restating it more briefly too.

o314 commented 1 year ago

Label "types and dispatch" should be applied there

timholy commented 1 year ago

The more I think about it, the more I think @concrete annotations (and probably even callsite ones as well as definition ones) make sense. It's a way of effectively declaring a kernel barrier and that you think knowing types concretely matters more to the callees than the caller. It basically means "shut off abstract inference for this call" but allows standard invoke if the type all happen to be concretely-inferrable in the caller.

JuliaLang / julia

Feature request: concrete-only dispatch #45099