rafaqz / DimensionalData.jl

Named dimensions and indexing for julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License

cat works but ;; does not - add methods for hvncat #531

Open bjarthur opened 1 year ago

bjarthur commented 1 year ago

specifically, cat returns a DimArray but ;; returns an Array:

julia> A
3×4×5 DimArray{Float64,3} with dimensions: X, Y, Z
[:, :, 1]
 0.0933486  0.8736    0.52288   0.619605
 0.690546   0.190862  0.82748   0.673151
 0.25412    0.307921  0.267134  0.525097
[and 4 more slices...]

julia> B
3×1×5 DimArray{Float64,3} with dimensions: X, Y, Z
[:, :, 1]
 0.8736
 0.190862
 0.307921
[and 4 more slices...]

julia> cat(A, B; dims=2)
3×5×5 DimArray{Float64,3} with dimensions: X, Y, Z
[:, :, 1]
 0.0933486  0.8736    0.52288   0.619605  0.8736
 0.690546   0.190862  0.82748   0.673151  0.190862
 0.25412    0.307921  0.267134  0.525097  0.307921
[and 4 more slices...]

julia> [A;; B]
3×5×5 Array{Float64, 3}:
[:, :, 1] =
 0.0933486  0.8736    0.52288   0.619605  0.8736
 0.690546   0.190862  0.82748   0.673151  0.190862
 0.25412    0.307921  0.267134  0.525097  0.307921

[:, :, 2] =
 0.461316  0.917871  0.853377  0.0952855  0.917871
 0.750445  0.444085  0.904203  0.231573   0.444085
 0.309473  0.140427  0.701603  0.521695   0.140427

[:, :, 3] =
 0.0461241  0.60098    0.231344  0.804608  0.60098
 0.773921   0.692283   0.564171  0.771804  0.692283
 0.444511   0.0428789  0.412942  0.285834  0.0428789

[:, :, 4] =
 0.416985  0.74103   0.0278025   0.449966  0.74103
 0.375376  0.779698  0.674408    0.745035  0.779698
 0.206332  0.70281   0.00972078  0.311566  0.70281

[:, :, 5] =
 0.330682  0.010291  0.915508  0.190581  0.010291
 0.92222   0.934861  0.342756  0.738472  0.934861
 0.242877  0.269638  0.948106  0.437875  0.269638

julia> VERSION
v"1.9.3"

julia> Sys.MACHINE
"arm64-apple-darwin22.4.0"

(shroff) pkg> st
Status `~/shroff/Project.toml`
  [0703355e] DimensionalData v0.24.13

rafaqz commented 1 year ago

OK, looks like we are missing methods for hvncat.

Somehow I didn't even know that existed; it looks like it was only added to Base in 2021.
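For context, here is a minimal sketch of why `cat` and `;;` take different code paths. It uses only Base and plain `Array`s (no DimensionalData), and assumes Julia 1.7 or later, where the `;;` syntax is available. The variable names are just for illustration.

```julia
# `[A;; B]` parses to an `:ncat` expression whose first argument is the
# dimension being concatenated along (2 for `;;`).
ex = Meta.parse("[A;; B]")
@show ex.head      # :ncat
@show ex.args[1]   # 2

# On plain arrays, the literal syntax, an explicit Base.hvncat call, and
# cat(...; dims=2) all give the same result.
A = rand(3, 4, 5)
B = rand(3, 1, 5)

C_syntax = [A;; B]
C_hvncat = Base.hvncat(2, A, B)
C_cat    = cat(A, B; dims=2)

@show size(C_syntax)
@show C_syntax == C_hvncat == C_cat
```

Since the bracket syntax calls `Base.hvncat` rather than `cat`, a wrapper type that only defines `cat` methods ends up in the generic fallback, which would explain the plain `Array` in the report above.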

rafaqz commented 1 year ago

Turns out this is kind of a nightmare to implement if we want the dimensions to be guaranteed to make sense afterwards. hvncat accepts a pretty complex mosaic of n-dimensional arrays as inputs, and it's really a choice between supporting everything here or sticking with the fallback.

Basically calculating the new lookups and making sure they are correct for all arguments in all cases is very difficult.

We would need to:

  1. Create new lookups from the concatenations along the first row, column, etc. in every dimension, making sure they make sense, don't overlap, etc.
  2. Calculate where each component array will end up in the final array.
  3. Check that the lookups of each component array match the new overall lookups for the region it is placed into, probably by comparing them to views of the overall dimensions at the indices it occupies.
  4. If anything fails, just cat the parent arrays. If it succeeds, do the same but rebuild as a DimArray with the new dimensions we made.

Getting that right and testing it thoroughly will take someone a few days. @bjarthur, you're welcome to try, or anyone else who wants a challenge. Half of the code and checks are already written for cat, but this is strictly harder because of the mosaic of differently sized arrays that is allowed.
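To make the four steps concrete, here is a rough sketch for the simplest possible case: two arrays joined along a single dimension. It is not DimensionalData code. `Labeled` and `labeled_cat` are hypothetical stand-ins invented for illustration, lookups are assumed to be plain `Vector{Float64}`s, and "makes sense" is assumed to mean "sorted with no repeated values". The real hvncat problem is the general mosaic, which this does not attempt.

```julia
# Hypothetical stand-in for a DimArray: plain data plus one lookup vector
# per dimension. Not part of DimensionalData.
struct Labeled{T,N}
    data::Array{T,N}
    lookups::NTuple{N,Vector{Float64}}
end

# Concatenate two Labeled arrays along `dim`. Returns a Labeled if the
# lookups line up, otherwise falls back to the plain concatenated Array.
function labeled_cat(a::Labeled{T,N}, b::Labeled{T,N}; dim::Int) where {T,N}
    # Step 4 (fallback half): the parent-array result is needed either way.
    data = cat(a.data, b.data; dims=dim)

    # Step 1: build the new lookup along `dim` and check that it is sorted
    # and has no repeated (overlapping) values.
    newlookup = vcat(a.lookups[dim], b.lookups[dim])
    ok = issorted(newlookup) && allunique(newlookup)

    # Step 2 is trivial here: `a` occupies the first size(a.data, dim)
    # indices along `dim`, and `b` occupies the rest.

    # Step 3: every other dimension must carry identical lookups.
    for d in 1:N
        d == dim && continue
        ok &= a.lookups[d] == b.lookups[d]
    end

    # Step 4: rebuild with the new lookups on success, else return the parent.
    ok || return data
    lookups = ntuple(d -> d == dim ? newlookup : a.lookups[d], N)
    return Labeled(data, lookups)
end

a = Labeled(rand(3, 4), ([1.0, 2.0, 3.0], [10.0, 20.0, 30.0, 40.0]))
b = Labeled(rand(3, 1), ([1.0, 2.0, 3.0], [50.0]))

c = labeled_cat(a, b; dim=2)  # lookups line up, so this stays a Labeled
d = labeled_cat(a, a; dim=2)  # lookup along dim 2 repeats, so this is a plain Array
```

With a mosaic, steps 2 and 3 stop being trivial: each block's position has to be worked out from the shape arguments, and its lookups compared against the matching slice of the overall lookups in every dimension.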