Open pdeffebach opened 9 months ago
Update, the inference failure happens because x
is a vector which may contain missing
. A simpler MWE:
julia> x = Union{Int, Missing}[1, 2, 3]
3-element Vector{Union{Missing, Int64}}:
1
2
3
julia> @code_warntype mean(x)
MethodInstance for Statistics.mean(::Vector{Union{Missing, Int64}})
from mean(A::AbstractArray; dims) @ Statistics ~/.julia/juliaup/julia-1.9.0+0.x64.linux.gnu/share/julia/stdlib/v1.9/Statistics/src/Statistics.jl:164
Arguments
#self#::Core.Const(Statistics.mean)
A::Vector{Union{Missing, Int64}}
Body::Any
1 ─ %1 = Statistics.:(var"#mean#2")(Statistics.:(:), #self#, A)::Any
└── return %1
I'm bummed
Union{Float64, Missing}
. That would help inference a lotSubArray
trick doesn't work. I thought telling the compiler the eltype
of the SubArray
was Int
would be enough for it realize there is no type instability. But that is not the case. The SubArray
issue seems to be due to the fact that nothing ensures that a value of type T
is returned: getindex
just calls getindex
on the parent array, without a type assertion. I wonder whether it would be acceptable to add ::T
in Base, but I'm afraid it would have a performance impact in the most common case where the eltype is the same as the parent.
mean
being inferred as Any
more generally is annoying and probably related to the promotion mechanism that we now use. This needs investigation to see whether we can find a good solution.
This also affects issue also affects the function var
and std
.
julia> @code_warntype std([1.,2.,missing])
MethodInstance for Statistics.std(::Vector{Union{Missing, Float64}})
from std(A::AbstractArray; corrected, mean, dims) @ Statistics /mnt/data1/abarth/opt/julia-1.9.0/share/julia/stdlib/v1.9/Statistics/src/Statistics.jl:449
Arguments
#self#::Core.Const(Statistics.std)
A::Vector{Union{Missing, Float64}}
Body::Any
1 ─ %1 = Statistics.:(var"#std#17")(true, Statistics.nothing, Statistics.:(:), #self#, A)::Any
└── return %1
I am working on a PR to Missings.jl where I construct a subarray manually. See the PR here
I've been trying to debug inference issues using the
mean
function. I thought that my implementation was wrong. But now I see thatmean
is actually type-unstable in this instance!Is there anything that I should do to recover type stability? Is that I'm doing kosher at all?