JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.6k stars 5.48k forks source link

`LinearAlgebra.dot` fails with `Vector{Union{Float, Missing}}` #40743

Open Datseris opened 3 years ago

Datseris commented 3 years ago

MWE:

julia> using LinearAlgebra

julia> x = [5, 6, missing];

julia> w = [1, 2, 3];

julia> dot(x, w)
ERROR: MethodError: no method matching iterate(::Missing)
Closest candidates are:
  iterate(::Union{LinRange, StepRangeLen}) at range.jl:664
  iterate(::Union{LinRange, StepRangeLen}, ::Int64) at range.jl:664
  iterate(::T) where T<:Union{Base.KeySet{var"#s79", var"#s78"} where {var"#s79", var"#s78"<:Dict}, Base.ValueIterator{var"#s77"} where var"#s77"<:Dict} at dict.jl:693
  ...
Stacktrace:
   [1] ⋮ internal
     @ LinearAlgebra
   [2] dot(x::Vector{Union{Missing, Int64}}, y::Vector{Int64})
     @ LinearAlgebra C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\generic.jl:916
Use `err` to retrieve the full stack trace.

I would have assumed that the dot should yield missing.

For reference, I came across this using StatsBase.mean(x, weights(w)).

pdeffebach commented 3 years ago

Just want to emphasize there is no passmissing solution for this since we call dot recursively. So I think #40769 is the only way forward absent some sort of number-like iteration for missing.

nalimilan commented 3 years ago

I think that's a bug in StatsBase, which should only use dot for element types known to give the expected result. Actually we already have an almost finished PR to fix this: https://github.com/JuliaStats/StatsBase.jl/pull/658.

dkarrasch commented 3 years ago

Does that also fix people's need for a missing-norm? I'm fine with closing my PRs.

nalimilan commented 3 years ago

I don't have a strong opinion about that.

Datseris commented 3 years ago

Isn't the fact that this issue was inspired by StatsBase a bit irrelevant here? Shouldn't dot([1, 2, missing], [1, 2, 3]) return missing? Not sure why we keep arguing about StatsBase instead of addressing the main point. I've updated the original post to not use StatsBase at all, as it is not necessary to reproduce the issue.

pdeffebach commented 3 years ago

Sorry to bring this up again, but is there some way to come to a decision on this? Does it get added to triage?

nalimilan commented 3 years ago

https://github.com/JuliaLang/julia/pull/40769 is marked for linalg triage.

juliohm commented 1 year ago

It would be nice to have this fixed. I am still pirating the dot definition to make it work in applications with missing values.

nalimilan commented 1 year ago

For some reason https://github.com/JuliaLang/julia/pull/40769 hasn't been discussed by linalg triage despite the label. I'm not even sure what "linalg triage" is. :-)

juliohm commented 3 hours ago

Below is the workaround (relies on piracy) we've been using since this issue was opened:

import LinearAlgebra: ⋅
⋅(::Missing, ::Missing) = missing
adienes commented 3 hours ago

I am a card-carrying certified Missing Propagations Hater and I sincerely hope that method is never added to LinearAlgebra