invenia / Impute.jl

Imputation methods for missing data in Julia
https://invenia.github.io/Impute.jl/latest/

Add Tables support for SVD #128

Open ParadaCarleton opened 2 years ago

ParadaCarleton commented 2 years ago
julia> y = DataFrame(randn(10, 10) + Diagonal(vcat(missings(5), zeros(5))), :auto)
10×10 DataFrame
 Row │ x1              x2              x3               x4               ⋯
     │ Float64?        Float64?        Float64?         Float64?         ⋯
─────┼────────────────────────────────────────────────────────────────────
   1 │ missing              -0.405736        0.171563         0.957845   ⋯
   2 │      -1.11445   missing              -1.29022         -0.690164
   3 │       2.3283          0.390413  missing               -1.98893
   4 │      -0.49741         0.745011       -0.423567   missing        
   5 │      -0.886931        0.462199        1.16495          0.398612   ⋯
   6 │      -0.768736        0.285009        0.867999         0.272915
   7 │      -0.256854        0.856902        0.349907         0.343897
   8 │      -0.837532       -0.199018       -0.581194        -1.27162
   9 │       0.486324        0.457107        0.0757129       -1.08212    ⋯
  10 │      -0.323865       -1.02249         0.0142763       -0.753485
                                                         6 columns omitted

julia> Impute.svd!(y)
ERROR: MethodError: no method matching _impute!(::Vector{Union{Missing, Float64}}, ::Impute.SVD)
Closest candidates are:
  _impute!(::AbstractVector{<:Union{Missing, T}}, ::Impute.Interpolate) where T at ~/.julia/packages/Impute/oZAQh/src/imputors/interp.jl:41
  _impute!(::AbstractArray{Union{Missing, T}, 1}, ::Impute.LOCF) where T at ~/.julia/packages/Impute/oZAQh/src/imputors/locf.jl:44
  _impute!(::AbstractArray{Union{Missing, T}, 1}, ::Impute.NOCB) where T at ~/.julia/packages/Impute/oZAQh/src/imputors/nocb.jl:43
  ...
Stacktrace:
 [1] impute!(data::Vector{Union{Missing, Float64}}, imp::Impute.SVD; dims::Function, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Impute ~/.julia/packages/Impute/oZAQh/src/imputors.jl:127
 [2] impute!(data::Vector{Union{Missing, Float64}}, imp::Impute.SVD)
   @ Impute ~/.julia/packages/Impute/oZAQh/src/imputors.jl:127
 [3] impute!(table::DataFrame, imp::Impute.SVD; cols::Nothing)
   @ Impute ~/.julia/packages/Impute/oZAQh/src/imputors.jl:232
 [4] impute!
   @ ~/.julia/packages/Impute/oZAQh/src/imputors.jl:224 [inlined]
 [5] svd!(data::DataFrame; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Impute ~/.julia/packages/Impute/oZAQh/src/functional.jl:80
 [6] svd!(data::DataFrame)
   @ Impute ~/.julia/packages/Impute/oZAQh/src/functional.jl:79
 [7] top-level scope
   @ REPL[273]:1

rofinn commented 2 years ago

That's by design. The docstring explicitly states that the SVD imputor only works on matrices, so vectors and n-dimensional arrays will fail in the same way. I suppose we could add a special case that converts a table to a matrix and back again, but that would still leave a number of edge cases when the columns don't all share the same element type.
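
In the meantime, the table-to-matrix round trip mentioned above can be done by hand. The following is a minimal sketch, not part of Impute.jl: `svd_impute_table` is a hypothetical helper name, and it assumes every column is floating point with at least one column allowing `missing`, so that `Matrix(df)` yields a `Matrix{Union{Missing, Float64}}` (or similar) that `Impute.svd` accepts.

```julia
using DataFrames, Impute

# Hypothetical helper (not provided by Impute.jl): impute a table by
# converting it to a matrix, running SVD imputation, and rebuilding the table.
function svd_impute_table(df::AbstractDataFrame; kwargs...)
    # Matrix(df) promotes all columns to a common element type.
    mat = Matrix(df)

    # Guard against the edge case raised above: columns of mixed,
    # non-numeric types promote to something SVD cannot work with.
    if !(eltype(mat) <: Union{Missing, AbstractFloat})
        throw(ArgumentError("all columns must be floating point (optionally with missing)"))
    end

    # Non-mutating SVD imputation on the matrix; keyword arguments are
    # forwarded unchanged.
    imputed = Impute.svd(mat; kwargs...)

    # Rebuild a DataFrame with the original column names.
    return DataFrame(imputed, names(df))
end
```

With the `y` from the original report, `svd_impute_table(y)` would return a new `DataFrame` and leave `y` untouched. Column element types are not restored individually, which is one of the edge cases a built-in Tables method would have to address.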