Type Tags for Extendable Type Information

ChrisRackauckas commented 3 years ago

Wrapper types are a mess. The classic example is that TrackedArray{Adjoint{Array}} is no longer seen as an Adjoint and Tracker.jl needed special overloads for handling adjoint arrays. Two deep you can start to special case: three deep is a nightmare. Why?

The issue is that wrapper types aren't exactly new types, they are extended pieces of information. A lot of information stays the same, like length, but only some changes. In the parlance of Julia, this is the kind of behavior that you normally have with type parameters, where Array{T,1} shares dispatches with Array{T,2} but then overrides what changes when you change dimensions. So, we could in theory do Array{T,1,Adjoint}, but now you can see the issue: it's an extendability problem since these tags need to all be pre-specified in the type definition! What if new packages add new tag verbs? SOL, redefine/pirate the type if you want to use it! This is why people use wrapper types: you can always extend, but of course you get the dispatch problem.

What about traits you say? They extend the way things dispatch, but only do so at compile time for a given type. isadjoint(::Array) can't be a thing, unless you have a type parameter for it.

So you need some extra information. That's why I am proposing type tags. A type tag is an extension like Array{Float64,1}{Adjoint}, where the tag adds information to an existing type. It could be like, tag(A,IsAdjoint()). By default, any tags would have the default behavior unless tagged, so Array{Float64,1} would have tag(Adjoint) == DefaultTag(). Then dispatches could be written on the tags, allowing for extension while keeping dispatches.

The one thing to really think about though is ambiguity handling...

longemen3000 commented 3 years ago

That would work if TaggedArray was a Tuple and instead of an isa, <::

julia> Tuple{Float64, 2, Union{Adjoint, Symmetric}} <: Tuple{T, 2, Union{Adjoint, Other}} where {T, Other}  
true
julia> typeof(rand(5)) isa Array #basically the same
false

covariant vs invariant, i guess

DilumAluthge commented 3 years ago

I'm not sure if invariance is the issue here. It specifically seems to be the UnionAll that is the issue. Other abstract types work fine without needing any <: annotation.

julia> g(::Ref{Union{Int, Other}}) where {Other} = 1
g (generic function with 1 method)

julia> g(Ref{Union{Int, Float64}}(1))
1

julia> g(Ref{Union{Integer, AbstractFloat}}(1))
1

julia> g(Ref{Union{Int, Array}}(1))
ERROR: MethodError: no method matching g(::Base.RefValue{Union{Int64, Array}})
Closest candidates are:
  g(::Ref{Union{Int64, Other}}) where Other at REPL[1]:1
Stacktrace:
 [1] top-level scope at REPL[4]:1

julia> g(Ref{Union{Int, Array{Float64, 100}}}(1))
1

julia> g(Ref{Union{Int, Array{AbstractFloat, 100}}}(1))
1

julia> g(Ref{Union{Integer, Array{AbstractFloat, 100}}}(1))
1

julia> g(Ref{Union{Number, Array{Real, 100}}}(1))
1

DilumAluthge commented 3 years ago

@MasonProtter for un-adjointing, we could try this:

@generated function unionsubtract(::Type{U}, ::Type{T}) where {U, T}
    return Union{setdiff(Core.Compiler.uniontypes(U), [T])...}
end

In the REPL:

julia> using LinearAlgebra

julia> struct Foo end

julia> struct Bar end

julia> unionsubtract(Union{Foo, Adjoint, Bar}, Adjoint)
Union{Bar, Foo}

DilumAluthge commented 3 years ago

Or this might be better:

@generated function unionsubtract(::Type{U}, ::Type{T}) where {U, T}
    return Core.Compiler.typesubtract(U, T, typemax(Int))
end

In the REPL:

julia> using LinearAlgebra

julia> struct Foo end

julia> struct Bar end

julia> unionsubtract(Union{Foo, Adjoint, Bar}, Adjoint)
Union{Bar, Foo}

MasonProtter commented 3 years ago

I think we don't need to use type subtract if we can figure out the UnionAll bug:

struct Foo end
struct Bar end

Base.adjoint(x::TaggedArray{T, N, Adjoint}) where {T, N} = TaggedArray{T, N, Union{}}(x.data)
Base.adjoint(x::TaggedArray{T, N, Union{Adjoint, Other}}) where {T, N, Other} = TaggedArray{T, N, Other}(x.data)

julia> TaggedArray(rand(2,2))'
TaggedArray{Float64,2,Adjoint}([0.7337422713379738 0.6323261034343546; 0.6549766230552179 0.0531575103233175])

julia> (TaggedArray(rand(2,2))')'
TaggedArray{Float64,2,Union{}}([0.5639123718675811 0.9175558237352524; 0.01610087829178175 0.632625236637625])

julia> (TaggedArray{Float64, 2, Union{Foo, Bar}}(rand(2,2))')
TaggedArray{Float64,2,Union{Bar, Foo, Adjoint}}([0.699743822169709 0.9959177311945779; 0.5597322929561694 0.8551353920860059])

julia> (TaggedArray{Float64, 2, Union{Foo, Bar}}(rand(2,2))')'
TaggedArray{Float64,2,Union{Bar, Foo}}([0.44928212809483625 0.6453411569592404; 0.5148462355738783 0.9647290574345497])

DilumAluthge commented 3 years ago

~The type subtract is for converting an adjoint back to an non-adjoint. I.e. if you have something with Union{Adjoint}, and then you call ', you need to unadjoint it, right? Because presumably (x')' is always exactly equal to x.~

DilumAluthge commented 3 years ago

Hmmm wait it seems you already got it working.

DilumAluthge commented 3 years ago

Ahh I see what you mean. If we can get the Other trick to work, then no need for subtraction.

DilumAluthge commented 3 years ago

Will that work for multiple things? E.g. suppose I have Union{Foo, Adjoint, Bar} and I want to unadjoint it. Does the Other trick still work?

MasonProtter commented 3 years ago

Will that work for multiple things? E.g. suppose I have Union{Foo, Adjoint, Bar} and I want to unadjoint it. Does the Other trick still work?

It's a bit on the illegible side, but see the final two lines of my example above. Covers exactly that case with even those type names :smile:

DilumAluthge commented 3 years ago

That's what I get for being on my phone 😂

Hard to scroll horizontally

DilumAluthge commented 3 years ago

So really #37793 is the only blocker, right?

Want to create a package with this prototype? Then @ChrisRackauckas can stress-test it and see how it does in his various use cases.

vtjnash commented 3 years ago

FWIW, #37793 seems likely to eventually be fixed by ensuring that Other is left undefined in that function

DilumAluthge commented 3 years ago

Hmmm. That is unfortunate news for us, because IIUC Mason is using Other in the body of that function above.

So @MasonProtter we may need to have the restriction that tags cannot be UnionAlls? That seems fine, right? We can just do struct SymmetricTag end and use that in the tag.

vtjnash commented 3 years ago

I mean that all types should behave like UnionAll currently does. Currently, it gets the answer wrong when presented with a concrete type.

DilumAluthge commented 3 years ago

Hmmm. In that case, we can't use the Other trick at all.

timholy commented 3 years ago

Late to the party. I've certainly seen the problem reported in the OP with wrapper types, but I've sometimes wondered if we could get a lot of mileage by defining iswrapper(::Type{<:MyNewArray}) = true and then having the fallback for most traits be

sometrait(::MyNewArray) = Awesome()
sometrait(A::AbstractAray) = iswrapper(A) ? sometrait(parent(A)) : Mundane()

Tokazama commented 3 years ago

If you take a look at ArrayInterface this is done a lot with parent_type. If a trait isn't defined on an array then we see if there is a parent and try digging until we find something.

Fore example, known_length uses this strategy.

yurivish commented 3 years ago

Re: Tim’s idea, does it happen in practice that a wrapper type is a wrapper with respect to some traits but not others?

JeffBezanson commented 3 years ago

f(x::TaggedArray{T, 2, Tag}) where {T, Tag, Union{Adjoint, Symmetric} <: Tag} = 3

Do you mean

f(x::TaggedArray{T, 2, Tag}) where {T, Tag >: Union{Adjoint, Symmetric}} = 3

?

JeffBezanson commented 3 years ago

Good discussion. I see this as basically moving type parameters from being positional to being like keyword arguments, which is something I've thought about. That would make parameters more meaningful as well as more extensible. A lot of things to think about here. For example, Vector{Int} could become an abstract type in some sense --- we know the layout, and we know enough to do compile-time dispatch for most functions, but there could be more parameters. Or we could insist that types written as they are now are exact, and you need something like Vector{Int; ...} to mean the abstract type that includes types with additional parameters. That's very explicit but seems less useful.

MasonProtter commented 3 years ago

```julia
f(x::TaggedArray{T, 2, Tag}) where {T, Tag, Union{Adjoint, Symmetric} <: Tag} = 3

Do you mean

f(x::TaggedArray{T, 2, Tag}) where {T, Tag >: Union{Adjoint, Symmetric}} = 3

?

Huh, interesting I had definitely tried the angry-face operator there and failed to make it work for me, but I probably just did something silly. Thanks for pointing that out. As I said earlier today on Zulip,

Sometimes you just gotta do something badly and then hope someone smart comes and tells you why it's bad.

Tokazama commented 3 years ago

I think that this is a step in the right direction for traits in general. For example, an issue brought up on Zulip was that someone wanted to add the table interface (i.e. istable) to a bunch of types that did not have a common supertype but had another related trait. A simple solution would be doing istable(x) = istable(tags(x)) instead.

martinholters commented 3 years ago

This is an interesting discussion and being able to attach extra tags to types seems like it might be quite useful, but I'm not convinced it solves the problem it's intended to here. What made me skeptical is the appearance of Adjoint and Sorted in the same example somewhere above. That makes sense for vectors, but if you think about higher-dimensional arrays, you see the problem coming. Imagine (and assume for the sake of discussion) the Sorted tag is also defined for higher-dimensional arrays, meaning the array is sorted in iteration order. Now one could add the Sorted tag when verifying the contents are sorted, and adjoint would attach the Adjoint tag. But wait! the adjoint of a sorted matrix is in general no longer sorted, so adjoint would need to be aware of the Sorted tag and remove it. Would it also need to remove other tags from other packages? Also the order of operations matters: Taking the adjoint first, and then verifying the result is sorted, one could safely attach the Sorted tag. But that's more or less where we're at now: Sorted{<:Adjoint} is sorted, Adjoint{<:Sorted} is not (necessarily) sorted, for SomeOtherWrapper{<:Sorted} we don't know.

So I'm afraid for the problem of array wrappers, we replace the combinatorial explosion of wrapper combinations with the same explosion of tag combinations. Or can someone give a brief example, how, say, Sorted, Adjoint, TrackedArray, and SizedArray, all worked together more easily using tags? Ideally with no code using definition from TrackedArray and SizedArrayat once.

pablosanjose commented 3 years ago

@martinholters: good points. Let me rephrase your comments to digest their consequences. You are essentially saying that if tags are commutative (i.e. their order doesn't matter, so we encode them with a Union instead of a Tuple) they need to encode orthogonal information. At the very least they should never conflict with each other. One should be able to add any tag in any order without contradicting (i.e. having to modify) any other existing tag.

If we instead want to keep tags general (so we can have both Sorted and Adjoint tags, which are not orthogonal), then tags should necessarily be non-commutative. Wrapper types are already explicitly non-commutative. What would we gain with tags over wrapper types then? As presented in the OP, the advantage would be to be able to dispatch on a given tag regardless of the order it occupies in the tag list. But that has little value, I think, as the order is crucial. Matrix{Int}{Sorted, Adjoint} does not allow the same algorithms as Matrix{Int}{Adjoint, Sorted}, as @martinholters notes. Or another example: it doesn't make sense to dispatch on an Matrix with the UpperTriangular tag if Adjoint was added after it, because then it is actually LowerTriangular. So, it would seem that if we are adamant about allowing general, non-orthogonal tags, we might just as well stick to wrappers.

What we could perhaps attempt in relation to the problem in the OP is to allow defining equivalence classes of types in terms of commutation rules of wrappers (or tags), such as

Adjoint{UpperTriangular{T}} == LowerTriangular{Adjoint{T}} where {T<:AbstractMatrix}
Adjoint{Diagonal{T}} == Diagonal{Adjoint{T}} where {T<:AbstractMatrix}
Adjoint{Adjoint{T}} == T where {T<:AbstractMatrix}
TrackedArray{Adjoint{T}} == Adjoint{TrackedArray{T}} where {T<:AbstractMatrix}

Then, if we write a dispatch like f(::LowerTriangular{T}) where {T<:AbstractArray}, the type machinery could work out, based on repeated application of the commutation rules, that it also applies to f(::TrackedArray{Adjoint{UpperTriangular{Matrix{Int}}}}). But I wonder whether that is a hopelessly complex task in practice.

pablosanjose commented 3 years ago

Uhm, I guess there is another aspect in which tags as proposed in the OP are prefereable to wrappers: they don't require unwrapping to access the fields of a type. The interface of a tagged type would not change as a result of the added types. So the above stuff on commutation rules should actually be reformulated in terms of non-commutative tags instead of wrappers so that the same code of the f method could work on types with equivalent sets of tags.

martinholters commented 3 years ago

But that only holds if the tag (née wrapper) does not interfere with how the internals are to be interpreted. So most algorithms would have to differentiate whether Adjoint is present or not, which would also mean that those that don't would have to opt-in to ignoring it. (Otherwise, functions unprepared for the Adjoint tag could silently give wrong results.) That doesn't sound that much better than having to peel off a wrapper. So maybe the tag idea shines for additional information, as in a Sorted tag? Not really, the problem persists for mutating functions. And as we don't distinguish mutating/non-mutating functions on the language level, that doesn't really gain us anything.

MasonProtter commented 3 years ago

Yes, I agree this is a bad fit for Adjoint. I think the most attractive use for this is things like Sorted and that it gives people a way to 'alias' existing structs and have a little copy that they can put their own methods on and treat differently without having to do all the extra stuff involved with making wrappers.

MasonProtter commented 3 years ago

Good discussion. I see this as basically moving type parameters from being positional to being like keyword arguments, which is something I've thought about. That would make parameters more meaningful as well as more extensible. A lot of things to think about here. For example, Vector{Int} could become an abstract type in some sense --- we know the layout, and we know enough to do compile-time dispatch for most functions, but there could be more parameters. Or we could insist that types written as they are now are exact, and you need something like Vector{Int; ...} to mean the abstract type that includes types with additional parameters. That's very explicit but seems less useful.

@JeffBezanson I think at least for the purposes of labelling the fields of structs, we'd be forced for sanity reasons to say that e.g.

struct Foo
    a::Vector{Int}
end

can only store the un-tagged version. However, I wonder if in all other circumstances, we could get away with having it be 'abstract'? i.e. people can write

struct Foo{V <: Vector{Int}}
    a::V
end

and then that would be compatible with Vector{Int; MyTag}? This has the downside of being an extra rule that people have to remember about spelling types though.

bramtayl commented 3 years ago

One way to deal with the whole trait/multiple inheritance thing would just be to have a macro for appending to union types. Like if you have

struct BlueRough <: AbstractRough end
struct BlueSmooth <: AbstractSmooth end
const Blue = Union{BlueRough, BlueSmooth}

A user could bestow a trait on a new thing

struct BlueShiny end
@bestow BlueShiny Blue

ParadaCarleton commented 10 months ago

Question; is this proposal related to the tagged unions implemented in other programming languages, and in some Julia packages?

More broadly, in the time since this was first proposed, has anyone written a possible ecosystem or package solution (not in base) that fixes this/provides this feature (like how there are packages fixing the lack of multiple inheritance, traits, etc. in Julia)?

MasonProtter commented 10 months ago

Question; is this proposal related to the tagged unions implemented in other programming languages, and in some Julia packages?

No.

ParadaCarleton commented 10 months ago

Question; is this proposal related to the tagged unions implemented in other programming languages, and in some Julia packages?

No.

To clarify, I'm not asking because of the array--I'm asking because the original example in this thread, Adjoint, looks like it could be implemented by having arrays behave like unions with two basic cases (Adjoint and not Adjoint). @ChrisRackauckas would this solve the original problem?

ChrisRackauckas commented 10 months ago

That addresses the problem of Adjoint, but not Tracked, Symmetric, and every other property someone would want to add.

JuliaLang / julia

Type Tags for Extendable Type Information #37790