JuliaLang/julia

Decouple dispatch and specialization #11339

Open · timholy opened this issue 9 years ago

timholy commented 9 years ago

aka "Generalize the ANY mechanism" aka "Completely destroy Jeff's life forever" (continuing the heresy in https://github.com/JuliaLang/julia/issues/4774#issuecomment-103286221 in a direction that might be more profitable, but much harder)

What if we could encode an Array's size among its type parameters, but not pay the compilation price of generating specialized code for each different size?

Here's a candidate set of rules (EDITED):

  1. In the type declaration, you specify the default specialization behavior for the parameters: type Bar{T,K*} means that, by default, functions are specialized on T but not on K. Array might be defined as type Array{T,N,SZ*<:NTuple{N,Int}} (in a future world of triangular dispatch).
  2. In a function signature, if a type's parameter isn't represented as a "TypeVar", it uses the default behavior. So foo(A::Array) specializes the element type and the dimensionality, but not the size.
  3. Specifying a parameter overrides the default behavior. foo{T,K}(b::Bar{T,K}) would specialize on K, for example. Conversely, putting * after a type parameter causes the parameter to be used in dispatch (subtyping and intersection) but prevents specialization. This is much like the ANY hack now. matmul{S,T,m*,k*,n*}(A::Array{S,2,(m,k)}, B::Array{T,2,(k,n)}) would perform the size check at compile time but not specialize on those arguments.

I'm aware that this is more than a little crazy and has some serious downsides (among others, it feels like performance could get finicky). But to go even further out on a limb, one might additionally have matmul{S,T,m<=4,k<=4,n<=4}(...) to generate specialized algorithms (note the absence of *s) for small sizes and use the unspecialized fallbacks for larger sizes. This way one wouldn't need a specialized FixedSizeArray type, and could just implement such algorithms on Array.

A major downside to this proposal is that calls in functions that lack complete specialization might have to be expanded like this:

if size(A,1) <= 4 && size(A,2) <= 4 && size(B,2) <= 4
    C = matmul_specialized(A,B)
else
    C = matmul_unspecialized(A,B)
end
tkelman commented 9 years ago

If a type's parameter isn't represented as a "TypeVar", it doesn't get specialized.

Would this potentially require type parameters to be mandatory for good performance in many more situations than they are now? I worry that this would make Julia's performance characteristics even more complicated for new users to understand. Or maybe this would somehow end up being more intuitive in the end? Not sure. (Is this what you meant by "performance could get finicky"?)

timholy commented 9 years ago

Yes, that was my concern. I edited the rules above so that, I think, we can keep Julia's current behavior while adding the new ones.

JeffBezanson commented 9 years ago

I think specialization has to be a property of functions, not the type itself.

The better version of this is to automatically figure out which parameters the function doesn't depend on, and avoid specializing on those (in other words, discover "parametricity" in the technical sense).
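
For instance (a minimal sketch, in modern syntax, with a made-up helper name), the body below never uses T or N, so a compiler that discovered this parametricity could share one compiled instance across all Arrays:

# Hypothetical helper: the body only calls length, which for Array reads a
# stored size field and does not depend on T or N, so one native-code
# instance could in principle serve every element type and dimensionality.
is_nonempty(A::Array{T,N}) where {T,N} = length(A) > 0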

timholy commented 8 years ago

I just spent a lovely hour+ re-reading #4774. What bliss. Surprisingly, I found myself back at this issue, which I'd forgotten about entirely.

I think @JeffBezanson's comment above is potentially important, and I'm just checking two things; the second is whether parametricity can actually be discovered in practice.

For the latter, consider a simple example:

function fill!(A::AbstractArray{T}, val) where {T}
    valT = convert(T, val)
    for i in eachindex(A)
        A[i] = valT
    end
    A
end

For all LinearFast arrays, the dimensionality of A is an irrelevant parameter, and one might be able to use an unspecialized function (modulo the inlining of that call to setindex!) that can handle all dimensionalities without a performance cost. But when eachindex returns a CartesianRange, dimensionality matters for the generated code, particularly given the demonstrated importance of inlining next and done.
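
A minimal illustration in modern syntax (CartesianRange has since been renamed CartesianIndices):

A = rand(3, 4)
eachindex(A)           # Base.OneTo(12): linear indexing, so N is irrelevant
V = view(A, 1:2, 1:2)  # a non-contiguous view, hence cartesian indexing
eachindex(V)           # CartesianIndices((2, 2)): N now appears in the type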

In other words, for those of us not working on the compiler, is this worth being concerned about as we contemplate different paths forward? I'm guessing the answer is no, but it would be interesting to hear reflections on this. My reasoning is that presumably this has to be solved not at the level of the method, but at the level of individual cached (compiled) instantiations. In that case one can specialize for certain types of inputs but use more general code for others, and that solves the problem I'm worrying about above.

timholy commented 8 years ago

Thinking just a little bit more: presumably the ideal way to implement this is to generate code for Array{Float64, 3}, then perform an extra step (prior to cache insertion) to recognize that the generated code doesn't depend on the 3, and widen the cache entry to match Array{Float64,N}.
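
Sketched as conceptual pseudocode (this is not actual compiler machinery):

# 1. Compile f for the concrete signature Tuple{typeof(f), Array{Float64,3}}.
# 2. Analyze the generated code and find that it never uses N = 3.
# 3. Insert it into the method cache under the widened key
#    Tuple{typeof(f), Array{Float64,N} where N},
#    so later calls with any dimensionality hit the same compiled instance.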

JeffBezanson commented 7 years ago

I think a good item for 1.0 here is to deprecate ANY and replace it with @nospecialize args... metadata at the top of a method.
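
For reference, a minimal sketch of the replacement (describe is a made-up name):

# Pre-1.0 style:  describe(x::ANY) = ...
# 1.0 style: @nospecialize marks arguments the compiler should not specialize
# on; dispatch still uses the declared type (Any here), but one generic
# compilation is shared across all argument types.
function describe(@nospecialize(x))
    return string(typeof(x))
end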

timholy commented 7 years ago

Would args be able to be a type parameter, too? Or just one of the arguments to the function?

JeffBezanson commented 7 years ago

Eventually, yes.

andyferris commented 7 years ago

I have to say, it would be pretty amazing to have built-in static arrays like this. We'd need a non-resizable Array, of course...

One problem to solve is that you'd only ever want to dispatch to a static-sized method if the size typevar was known to the compiler. To avoid dynamic dispatch, you'd need another annotation on methods so that the method is only considered when inference can infer the relevant information. The annotation would imply a promise to the compiler that the method is functionally identical to the one it specializes, or else we'd have determinism problems.

We'd also need to store such abstract types unboxed, for performance.

This could solve a range of problems: for example, a Val that is as fast as possible whether its value is static or dynamic, or a replacement for the literal_pow mechanism...
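
For context, a minimal sketch of the existing Val mechanism, which specializes only when the value is a compile-time constant at the call site:

# N lives in the type domain, so each N gets its own specialization.
f(::Val{N}) where {N} = N + 1

f(Val(3))       # fast: N = 3 is known statically at this call site
n = rand(1:10)
f(Val(n))       # slow: Val{n} is only known at runtime, so dispatch is dynamic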

JeffBezanson commented 7 years ago

The 1.0 part of this is done.

AriMKatz commented 2 years ago

This would be very useful for shape-checking ML programs, where runtime shape errors are notoriously vexing.

See https://github.com/aviatesk/JET.jl/issues/82#issuecomment-775325146 for an example where it "just works" with JET.jl and sized arrays, though again compilation cost is an issue.

It would also be helpful for IPO escape analysis, which according to the docs would do better with array shape information propagated. See also https://github.com/JuliaLang/julia/issues/43642.

Is it possible to have this be opt-in, so that it isn't a breaking change and can be made available during 1.x?

vtjnash commented 2 years ago

@nospecialize now does this (using exactly the type given in the signature for inference, and usually dispatch also, if possible)

AriMKatz commented 2 years ago

@nospecialize now does this (using exactly the type given in the signature for inference, and usually dispatch also, if possible)

Is it possible to specialize on just specific type parameters?

Edit: people want to be able to check array and table shapes with JET (or at runtime) without paying the extra specialization cost.

What if I want to include the whole shape of an array but specialize only on the rank and element type?

I'm not sure if that even makes sense in principle though.
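
To make the trade-off concrete, here is a minimal hypothetical sized-array wrapper; because the shape is part of the type, every method that touches it recompiles once per shape, which is exactly the compilation cost at issue:

# Hypothetical wrapper: the shape (S1, S2) is encoded in the type.
struct SizedMatrix{S1,S2,T} <: AbstractMatrix{T}
    data::Matrix{T}
end
Base.size(::SizedMatrix{S1,S2}) where {S1,S2} = (S1, S2)
Base.getindex(A::SizedMatrix, i::Int, j::Int) = A.data[i, j]

# sum(::SizedMatrix{3,4,Float64}) and sum(::SizedMatrix{3,5,Float64}) compile
# separately, even though the generated code would be identical.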

JeffBezanson commented 2 years ago

I don't think this should be closed since there could still be much more control over what to specialize on.