rafaqz / DimensionalData.jl

Named dimensions and indexing for julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License
271 stars 38 forks source link

`cat` no longer works with concrete indices #512

Closed sethaxen closed 11 months ago

sethaxen commented 1 year ago

v0.24.13 seems to have broken this code that previously worked:

julia> using DimensionalData

julia> x = DimArray(randn(2, 3), (Dim{:a}(1:2), Dim{:b}(1:3)))
2×3 DimArray{Float64,2} with dimensions: 
  Dim{:a} Sampled{Int64} 1:2 ForwardOrdered Regular Points,
  Dim{:b} Sampled{Int64} 1:3 ForwardOrdered Regular Points
     1          2          3
 1   0.788356  -0.873793  -0.702195
 2  -0.879859  -0.733255  -0.0725819

julia> cat(x, x; dims=:a)

on v0.24.12:

4×3 DimArray{Float64,2} with dimensions: 
  Dim{:a} Sampled{Int64} Int64[1, 2, 1, 2] ForwardOrdered Regular Points,
  Dim{:b} Sampled{Int64} 1:3 ForwardOrdered Regular Points
     1          2          3
 1   0.788356  -0.873793  -0.702195
 2  -0.879859  -0.733255  -0.0725819
 1   0.788356  -0.873793  -0.702195
 2  -0.879859  -0.733255  -0.0725819

on v0.24.14:

ERROR: Lookups overlap or are not in order at 1 and 2
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] _lookup_index_cat_error(lookup::DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}, xl::Int64)
    @ DimensionalData ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:394
  [3] (::DimensionalData.var"#136#138"{DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}})(lookup::DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata})
    @ DimensionalData ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:377
  [4] #59
    @ ./tuple.jl:602 [inlined]
  [5] BottomRF
    @ ./reduce.jl:81 [inlined]
  [6] afoldl
    @ ./operators.jl:535 [inlined]
  [7] _foldl_impl
    @ ./tuple.jl:329 [inlined]
  [8] foldl_impl
    @ ./reduce.jl:48 [inlined]
  [9] mapfoldl_impl
    @ ./reduce.jl:44 [inlined]
 [10] #mapfoldl#288
    @ ./reduce.jl:170 [inlined]
 [11] mapfoldl
    @ ./reduce.jl:170 [inlined]
 [12] #foldl#289
    @ ./reduce.jl:193 [inlined]
 [13] foldl
    @ ./reduce.jl:193 [inlined]
 [14] foreach
    @ ./tuple.jl:602 [inlined]
 [15] _vcat_index
    @ ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:373 [inlined]
 [16] _vcat_lookups
    @ ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:319 [inlined]
 [17] vcat
    @ ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:309 [inlined]
 [18] BottomRF
    @ ./reduce.jl:81 [inlined]
 [19] afoldl
    @ ./operators.jl:536 [inlined]
 [20] _foldl_impl
    @ ./tuple.jl:329 [inlined]
 [21] foldl_impl(op::Base.BottomRF{typeof(vcat)}, nt::Base._InitialValue, itr::Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}})
    @ Base ./reduce.jl:48
 [22] mapfoldl_impl(f::typeof(identity), op::typeof(vcat), nt::Base._InitialValue, itr::Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}})
    @ Base ./reduce.jl:44
 [23] mapfoldl(f::Function, op::Function, itr::Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}; init::Base._InitialValue)
    @ Base ./reduce.jl:170
 [24] mapfoldl
    @ ./reduce.jl:170 [inlined]
 [25] #mapreduce#292
    @ ./reduce.jl:302 [inlined]
 [26] mapreduce
    @ ./reduce.jl:302 [inlined]
 [27] reduce(op::Function, itr::Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}; kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base ./reduce.jl:485
 [28] reduce
    @ ./reduce.jl:485 [inlined]
 [29] #109
    @ ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:243 [inlined]
 [30] map
    @ ./tuple.jl:273 [inlined]
 [31] _cat(catdims::Tuple{Dim{:a, Colon}}, A1::DimArray{Float64, 2, Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:b, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, Tuple{}, Matrix{Float64}, DimensionalData.NoName, DimensionalData.Dimensions.LookupArrays.NoMetadata}, As::DimArray{Float64, 2, Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:b, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, Tuple{}, Matrix{Float64}, DimensionalData.NoName, DimensionalData.Dimensions.LookupArrays.NoMetadata})
    @ DimensionalData ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:236
 [32] _cat(_catdims::Tuple{Symbol}, A1::DimArray{Float64, 2, Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:b, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, Tuple{}, Matrix{Float64}, DimensionalData.NoName, DimensionalData.Dimensions.LookupArrays.NoMetadata}, As::DimArray{Float64, 2, Tuple{Dim{:a, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:b, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, Tuple{}, Matrix{Float64}, DimensionalData.NoName, DimensionalData.Dimensions.LookupArrays.NoMetadata})
    @ DimensionalData ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:228
 [33] _cat
    @ ~/.julia/packages/DimensionalData/jKcnN/src/array/methods.jl:231 [inlined]
 [34] #cat#153
    @ ./abstractarray.jl:1979 [inlined]
 [35] top-level scope
    @ REPL[11]:1
rafaqz commented 1 year ago

Ahh damn yes thats a bug. Will fix later today.

rafaqz commented 1 year ago

Actually seems that your a dimension was already broken on v0.24.12 ? Its not ForwardOrdered or Regular so selectors wont work at all on it. The error is trying to avoid making broken buggy dimension like that.

I guess it could fall back to NoLookup instead ? (although that wont be type stable)

Without lookups it works:

julia> x = DimArray(randn(2, 3), (:a, :b))
2×3 DimArray{Float64,2} with dimensions: Dim{:a}, Dim{:b}
 0.399932  0.221793  -1.17104
 1.58064   1.2108    -0.836461

julia> cat(x, x; dims=:a)
4×3 DimArray{Float64,2} with dimensions: Dim{:a}, Dim{:b}
 0.399932  0.221793  -1.17104
 1.58064   1.2108    -0.836461
 0.399932  0.221793  -1.17104
 1.58064   1.2108    -0.836461
sethaxen commented 1 year ago

In my use cases, we frequently load data from a netCDF file where default unit range lookups have been added for all dimensions. Now we might want to concatenate two such DimArrays across dimensions. My users would be fine with a convenient syntax for indicating that the existing lookups should be discarded, but all of the following raise the same error:

julia> cat(x, x; dims=Dim{:a});

julia> cat(x, x; dims=Dim{:a}());

julia> cat(x, x; dims=Dim{:a}(LookupArrays.NoLookup()));

Is there a simple syntax I can recommend?

rafaqz commented 1 year ago

Ok great. I just tried something like that too, I thought I had allowed it but it doesnt get through.

The second two should work (but dont), as well as passing in new lookup values when you want that.

Using just a type or Symbol will give the current behavour, but a constructed dim like your example will fix it.

rafaqz commented 1 year ago

@sethaxen it looks like the fix is a breaking change. I think even though the current version broke things that were already bugs, it should have been a breaking change too as they still worked as basic arrays.

So maybe we can yank this version and push a new breaking release with the fix when its done?

sethaxen commented 1 year ago

I agree the fix should have been breaking. I ran into this again in a numerical algorithm that simply vcated 2 arrays, which should always work without error but began erroring with this fix.

rafaqz commented 1 year ago

Ok I'll yank it

rafaqz commented 1 year ago

Sorry for the hassle

sethaxen commented 1 year ago

No problem at all!

sethaxen commented 11 months ago

@rafaqz is there any update on this? v0.24.14 was yanked, but the reported errors when using cat began on v0.24.13, which is still the latest release.

rafaqz commented 11 months ago

Ah sorry I got the versions mixed up. Its fixed on master for a week now Ive just been on holiday and yhere were some CI problems

You could also yank 0.24.13

rafaqz commented 11 months ago

If you can please register the latest version when CI is green on master, I cant comment on the commit from my phone

sethaxen commented 11 months ago

Sure, will do