Closed palday closed 6 months ago
OK, thanks.
Just trying to get my head around everything. Is using code_typed
ultimately less likely to invoke undefined behavior than return_type
? Is there any difference in obsrvable behavior? Does this make something work that doesn't work without it? (The Julia 1.10 issues seem more related to changes to handling of Union{}
than to changes in the type inference algorithm).
Just trying to get my head around everything. Is using
code_typed
ultimately less likely to invoke undefined behavior thanreturn_type
?
Probably not, but return_type
is considered an internal function and there seems to be reluctance to treat the problems with Union{}
as real as long as you're accessing an internal function. So there is a real problem that this package demonstrates (well, via errors that occur in dependents' use of this package) but that problem is considered invalid by some core Julia devs because you're using internals.
That said, using internals is also risky because their behavior can change or be dropped at any point.
Is there any difference in obsrvable behavior? Does this make something work that doesn't work without it?
No(ish) and yes. There are two changes here:
return_type
that does not depend on compiler internals (no change in observable behavior)Index
, but realistically unobservable since it corresponds to the index of an empty collection). This dodges the problems with Union{}
introduced in Julia 1.10.The latter change in particular fixes a major downstream Julia 1.10 compatibility issue: https://github.com/MakieOrg/AlgebraOfGraphics.jl/issues/472
(The Julia 1.10 issues seem more related to changes to handling of
Union{}
than to changes in the type inference algorithm).
I agree, but the IMHO legitimate problem in the new handling of Union{}
is being dismissed because you're touching compiler internals.
The latter change in particular fixes a major downstream Julia 1.10 compatibility issue
Shouldn't this rather be fixed in AoG? Assuming it doesn't get fixed in Julia.
Shouldn't this rather be fixed in AoG? Assuming it doesn't get fixed in Julia.
AoG is calling things, but the actual error occurs in Dictionaries.jl code.
Creating Indices{Union{}}
is a perfectly fine thing to do, and it's not prevented by this PR (neither should it).
julia> Indices(Union{}[])
0-element Indices{Union{}}
This PR adds an explicit check to one special case, where Indices
are constructed from a Tuple
. Is that worth it? I'm hopeful for the Julia team to acknowledge and likely fix the breaking change in some 1.10 version...
The change of return_type
-> code_typed
is orthogonal to that Tuple{}
thing, and doesn't really reduce any dependency on "internals". Inference results are not guaranteed to stay stable, and this influences both return_type
and code_typed
. Note that (according to JuliaHub search) lots of packages use return_type
, but (almost?) none use code_typed
for this purpose.
The change of
return_type
->code_typed
is orthogonal to thatTuple{}
thing, and doesn't really reduce any dependency on "internals".
This is patently incorrect -- Core.Compiler.return_type
is not documented in the manual and is therefore per definition internal.
I do agree that the Union{}
/ Tuple{}
issue is orthogonal, but I suggested this change because the real issue was being dismissed prima facie because of the use of internals. So showing that the issue still arose when there was no use of internals was useful for that.
In terms of the workaround here for Union{}
/Tuple{}
-- I'm not strongly tied to it, but it seemed like a reasonable, straightforward change. If you're constructing an Index
from an empty tuple, you do not know what the original element type is, so Any
IMHO makes as much sense as Union{}
. On the other hand if you're constructing from e.g. an empty Vector
, then you still have the eltype information and so can construct an appropriate empty index. In terms of the type definition, tree, etc., it's fine to explicitly construct Index{Union{}}()
, but that is an explicitly an empty index (because of both the type parameter and the lack of initialization values), which I view as a slightly different case than Index(iter)
where iter just happens to be Tuple{}
.
This is patently incorrect
Sorry, maybe I formulated it too strongly :) It very slightly reduces the internals dependence, because the actual return value of code_typed
is still not guaranteed to be stable and can change when inference changes in Julia. Relying on inference, no matter how exactly it is done, is relying on Julia internals.
This change from return_type
to code_typed
increases overhead a lot:
# Core return_type()
julia> @btime Core.Compiler.return_type(sin, Tuple{Float64})
0.791 ns (0 allocations: 0 bytes)
Float64
# return_type() from this PR
julia> @btime return_type(sin, Tuple{Float64})
48.333 μs (1232 allocations: 69.22 KiB)
Float64
Wouldn't be great to pay this overhead on every Dictionary/Indices construction, especially given that it doesn't remove the internals dependence.
And indeed much more packages use return_type
to probe inference than code_typed
.
I suggested this change because the real issue was being dismissed prima facie because of the use of internals
The Tuple{Union{}}
error in 1.10 is completely independent on internals, and is a breaking change anyway. It just happened that the first manifestations of this issue were found in code that relies on inference.
Luckily, now there's no shortage of internals-free examples that demonstrate this issue (in https://github.com/JuliaLang/julia/issues/52385).
Okay, this has really opened a can of worms :)
There are few things going on here.
Tuple{Union{}}
, where previous versions haven't.And yes, this package relies on Julia internals. It always has - originally this was "a feature not a bug" because it gave great performance + ergonomics. Traditionally we update the package to work well with each new version of Julia (but this hasn't been necessary in recent years).
The plan was always to use something "recommended" once Julia became more mature. I actually have an minimal example for the logic we follow:
x = map(sin, Int64[])
What is the type (concrete, runtime type) of x
? Is it inferred? How does Base
determine the eltype
of the returned array? For many years, it has been known that invoking type inference to figure out a type which then needs to get inferred might expose undefined behavior (I'm even not sure what well-defined nested inference looks like inside recursive loops...). However, when not run as a part of a recursive loop it does in practice seem to work well. In fact, for the example above Base
still uses Core.Compiler.return_type
to figure out the eltype must be Float64
. I think core devs like Jameson are right to raise concern with this pattern, but I still believe Base
needs to lead the way here before we can copy a working pattern.
In general, I feel like swapping Core.Compiler.return_type
to code_typed
ultimately targets a political problem, not a technical problem. In fact it may introduce some regressions here. The discussion at https://github.com/JuliaLang/julia/issues/52385 is progressing - but I think we need to get the attention of a few more devs that can do something in the space so they can design a solution amongst themselves.
We do however need to get this package working well with Julia 1.10, and any improvements to handling tables with zero rows or with zero columns are extremely valuable and welcome. Is there something more minimal we could try that will at least get us unstuck in the short term?
Attention: 3 lines
in your changes are missing coverage. Please review.
Comparison is base (
7b5b7b3
) 79.35% compared to head (c83f300
) 79.15%.
Files | Patch % | Lines |
---|---|---|
src/Dictionary.jl | 0.00% | 2 Missing :warning: |
src/map.jl | 66.66% | 1 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I actually have an minimal example for the logic we follow: x = map(sin, Int64[])
And Base
itself doesn't hesitate to return a collection with Union{}
eltype when it makes sense :)
julia> map(length, Symbol[])
Union{}[]
Also when making an array out of an empty tuple - the exact same scenario as discussed here with Indices
:
julia> collect(())
Union{}[]
So the consistent way, following Base, is to continue returning Indices{Union{}}
. And as @nsajko points out that this is the most accurate type to return. Would be weird to change it and make the result both less consistent and less accurate.
Would be nice if those affected by this 1.10 issue chimed in the https://github.com/JuliaLang/julia/issues/52385 discussion! I think popularity does affect the priority different bugs get (and that's totally sensible).
I think popularity does affect the priority different bugs get (and that's totally sensible).
Note that the issue is labeled with triage
, so I guess a decision might happen either way in less than 24 hours.
It looks like the Tuple{Union{}}
change will be reverted: https://github.com/JuliaLang/julia/issues/52385#issuecomment-1876217885
Thanks for following up on this @palday
see also https://github.com/JuliaLang/julia/issues/52385
There's also another problem that arises on Julia 1.10 when
Indices
are constructed from an empty tuple, the resultantIndices{Union{}}
create all sorts of problems. To get around this, I've added a check for empty tuples and changes the indices eltype to beAny
. See also https://github.com/MakieOrg/AlgebraOfGraphics.jl/issues/472