JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.73k stars 5.48k forks source link

Allow custom types to get narrow Union type with map (like `promote_typejoin` for `Missing`/`Nothing`) #38241

Open oxinabox opened 4 years ago

oxinabox commented 4 years ago

Consider this code:

julia> bar(::Nothing) = missing
bar (generic function with 1 method)

julia> bar(x) = x;

julia> xs = [1, 2, nothing, 4]
4-element Vector{Union{Nothing, Int64}}:
 1
 2
  nothing
 4

julia> ys = map(bar, xs)
4-element Vector{Union{Missing, Int64}}:
 1
 2
  missing
 4

We can see we get a nice array of small unions in the output.

But if I make my own similar singleton type, then the output is a Array{Any}.

julia> struct Null end

julia> foo(::Nothing) = Null();

julia> foo(x) = x;

julia> xs = [1, 2, nothing, 4]
4-element Vector{Union{Nothing, Int64}}:
 1
 2
  nothing
 4

julia> ys = map(foo, xs)
4-element Vector{Any}:
 1
 2
  Null()
 4

Is there something extra i need to do, or is the compiler cheating for Missing and Nothing?

KristofferC commented 4 years ago

Is there something extra i need to do, or is the compiler cheating for Missing and Nothing?

There's a lot of cheating with those two (but not really by the compiler):

https://github.com/JuliaLang/julia/blob/a80f9038c1f51bc3f831b54a0aa6f5b5f5f8e446/base/missing.jl#L41-L70

Here is another example:

julia> struct Foo
          x::Ref{Union{Nothing, Float64}}
       end

julia> Foo(1)
Foo(Base.RefValue{Union{Nothing, Float64}}(1.0))

julia> struct MyMissing end

julia> struct Bar
          x::Ref{Union{MyMissing, Float64}}
       end

julia> Bar(1)
ERROR: MethodError: Cannot `convert` an object of type 
  Int64 to an object of type 
  Union{MyMissing, Float64}
Closest candidates are:
  convert(::Type{T}, ::T) where T at essentials.jl:171
oxinabox commented 4 years ago

Nice. Just adding Base.promote_rule(T::Type{Null}, S::Type) = Union{S, Null} doesn't seem to make it work. It would be good to workout exactly what is needed and document it.

KristofferC commented 4 years ago

https://github.com/KristofferC/LazilyInitializedFields.jl/pull/1/files#diff-0abed87289859febd0e9c73389d5bb7fae46aa675acf8f1a468de7a3b9dcf0a1R95-R124 got me most of the way there. I still think there was something that was not perfect but don't really recall it.

nalimilan commented 4 years ago

Being extensible wasn't the priority when we implemented this, but maybe it could be made easier and documented. There's also promote_typejoin which isn't part of the public API.

vtjnash commented 4 years ago

This is based on the explicit promote_typejoin rule, and not inferred by the compiler. It's also not extensible, sorry.

oxinabox commented 4 years ago

Doesn't this issue remain open to document how to use promote_typejoin for this?

Or to make it extensible?

oxinabox commented 3 years ago

Doesn't this issue remain open to document how to use promote_typejoin for this?

Or to make it extensible?

bump

vtjnash commented 3 years ago

There's no "cheating" or uncertainty: it's the explicit implementation, and can't be extended, so there's not much to document?

oxinabox commented 3 years ago

can't be extended

but just because it can't be extended now doesn't mean that it is impossible for making it extendable in the future. Which would resolve this issue. Not saying it is easy, but we have done non-easy enhancements to the language before. Feature requests that are are hard are not normally closed, they are normally left open and untouched. Feature requests that make no-sense, or that are a bad idea are normally closed. Maybe this makes no sense or is a bad idea, but i don't understand why it is.

iamed2 commented 3 years ago

I agree with @oxinabox; is there some rewording of this issue that could be reopened?

nickrobinson251 commented 3 years ago

Bump :) Can this be reopened or else a brief explanation be added on why this can't/won't be supported?

cstjean commented 3 years ago

FWIW, we're relying on this behaviour as well in our product. @etpinard used this:

julia> Base.promote_typejoin(::Type{Blag}, ::Type{T}) where {T} =
           isconcretetype(T) || T === Union{} ? Union{T, Blag} : Any
julia> Base.promote_typejoin(::Type{T}, ::Type{Blag}) where {T} =
           isconcretetype(T) || T === Union{} ? Union{T, Blag} : Any

which works for us as end-users, although it's susceptible to method ambiguities, so I wouldn't use it inside of a library. It's both 1.5 and 1.6-compatible.

quinnj commented 3 years ago

We've needed the Unioning behavior as well in a few packages recently, with code like this and this

StefanKarpinski commented 3 years ago

Wouldn't it be possible to allow singleton types to opt into this with a trait?

oscardssmith commented 3 years ago

I think that would be a good half-solution, but the better (and harder) solution would be to make singletons play nicer without having to ask.

oxinabox commented 3 years ago

better (and harder) solution would be to make singletons play nicer without having to ask.

Why? It's different from other type promotion interaction. It seems like it should be opt-in?

Or is it just always preferred?

iamed2 commented 3 years ago

If it's possible to use a trait, is there a reason not to use Base.issingletontype as a default check (i.e. opt-in all singletons)?

ParadaCarleton commented 2 years ago

Mentioned on the #statistics channel in the Slack -- being able to create custom Missing-like structs could be very useful for distinguishing between different kinds of missing data, e.g. data that are missing completely at random vs data that have been censored.

nalimilan commented 2 years ago

This could probably be solved by having an AbstractMissing type in Base, even without making promote_typejoin extensible for non AbstractMissing types. (Note that currently Base hardcodes Missing/missing in quite a few places so the code would have to be made more generic, notably by using ismissing now that it's as good for inference as === missing.)

oxinabox commented 2 years ago

That's not good, as you can only have one supertype. And if you are already using that then can't have another.

Also this behavior is not linked on any way to the notion of missing.

KristofferC commented 2 years ago

Small update, but the special casing of Nothing and Missing basically comes from this line:

https://github.com/JuliaLang/julia/blob/81813164963f38dcd779d65ecd222fad8d7ed437/base/promotion.jl#L174

So if you for example do:

struct MySingleton end
foo(::Nothing) = MySingleton();
foo(x) = x;
xs = [1, 2, nothing, 4]
ys = map(foo, xs)
@show typeof(ys)
@eval Base Base._promote_typesubtract(@nospecialize(a)) = typesplit(a, Union{Nothing, Missing, $MySingleton})
ys = map(foo, xs)
@show typeof(ys)

it will show

Vector{Any} 
Vector{Union{MySingleton, Int64}} 
KristofferC commented 2 years ago

Speaking with Jameson a bit, a possible solution to this would be to just always go to a Union in the case of two types, and then go to Any instead of going directly to Any.

quinnj commented 2 years ago

Yeah, that sounds like a pretty good compromise to me.

cstjean commented 2 years ago

👍 from me too, but it doesn't help with

Mentioned on the #statistics channel in the Slack -- being able to create custom Missing-like structs could be very useful for distinguishing between different kinds of missing data, e.g. data that are missing completely at random vs data that have been censored.

We're also doing something similar (Vector{Floa64, Missing, SomethingLikeMissing), and will need to keep hacking something up until this is resolved...

nalimilan commented 2 years ago

Are you sure you need to mix Missing and SomethingLikeMissing in the same vector? In my TypedMissings.jl experiment, missing is converted/promoted to a default TypedMissing value which doesn't specify the reason for missingness, so that you only ever need Vector{Float64, TypedMissing}.

cstjean commented 2 years ago

That's an interesting approach. Our SomethingLikeMissing is AtLeast(x), so I guess we could use AtLeast(-Inf) as a missing replacement. It's ugly, though, I think we'd keep the hack for the time being.