rafaqz / DimensionalData.jl

Named dimensions and indexing for julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License
271 stars 38 forks source link

Matrix factorization discards axes #514

Closed ParadaCarleton closed 7 months ago

ParadaCarleton commented 1 year ago
julia> x = randn(3, 3)

julia> x = x * x'

julia> y = DimArray(x, (x=1:3, y=1:3))
3×3 DimArray{Float64,2} with dimensions: 
  Dim{:x} Sampled{Int64} 1:3 ForwardOrdered Regular Points,
  Dim{:y} Sampled{Int64} 1:3 ForwardOrdered Regular Points
    1          2         3
 1  3.34007    1.32409   0.196571
 2  1.32409    1.85992  -1.16631
 3  0.196571  -1.16631   1.27849

julia> cholesky(y)
Cholesky{Float64, Matrix{Float64}}
U factor:
3×3 UpperTriangular{Float64, Matrix{Float64}}:
 1.82758  0.724501   0.107558
  ⋅       1.15543   -1.07686
  ⋅        ⋅         0.327573
rafaqz commented 1 year ago

PR?

(Generally functions that dont call similar need manual intervention to return a AbstractDimArray. Just rebuild may be enough, but not always. I dont use cholesky so I dont know if both the dimension names remain meaningful - so its not implemented)

ParadaCarleton commented 1 year ago

Generally functions that dont call similar need manual intervention to return a AbstractDimArray.

Do you know if there's a good reason cholesky (and perhaps other matrix factorizations) don't call similar? If not, this should maybe be done in LinearAlgebra for greater generality.

I dont use cholesky so I dont know if both the dimension names remain meaningful - so its not implemented

My main use case for cholesky is factorizing covariance matrices, where the dimension names are the same for both axes and stay meaningful. The useful property of cholesky is that multiplying the Cholesky factor of the covariance by IID normal data gives you data with the correct covariance.

sethaxen commented 1 year ago

Might I suggest, before going through the work of overloading cholesky for an AbstractDimArray, can you construct a Cholesky object with the types you want and show how you would then use it?

rafaqz commented 1 year ago

cholesky doesnt call similar because it returns a triangular matrix (cant remember which).

Its possible that wrapping that in a DimArray will lose some dispatch optimisations.

ParadaCarleton commented 1 year ago

cholesky doesnt call similar because it returns a triangular matrix (cant remember which).

It returns a wrapper around a triangular matrix, but the triangular matrix should be a DimArray. (I assume that’s possible? If not I’ll have to implement it when I have time.)

rafaqz commented 11 months ago

@ParadaCarleton this issue will wait forever on your MWE ;)

If it doesn't actually make sense after all, please close it because I'm not qualified to decide that either way.