rafaqz / DimensionalData.jl

Named dimensions and indexing for Julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License

Broadcast on grouped array #640

Closed lazarusA closed 7 months ago

lazarusA commented 7 months ago

The following fails at several levels.

using YAXArrays, DimensionalData, Dates

tempo = (Dim{:time}(Date("2022-01-01"):Day(1):Date("2023-01-30")),)
tem = dims(ds, :Ti)

month_length = YAXArray((tem,), daysinmonth.(tem))
month_length_dim = DimArray(daysinmonth.(tem), (tem,))
# with yax
g_tempo = groupby(month_length, Ti => season(; start=December))
sum_days = sum.(g_tempo, dims=:Ti)
# with dim
g_tempo_dim = groupby(month_length_dim, Ti => season(; start=December))
sum_days_dim = sum.(g_tempo_dim, dims=:Ti)
# no more dropdims available to flatten to a vector, what's the alternative?

This is what the sums look like:

sum_days
╭───────────────────────────────────────╮
│ 4-element DimGroupByArray{YAXArray,1} │
├───────────────────────────────────────┴────────────────────────────────────────────────────────── dims ┐
  ↓ Ti Sampled{Vector{Int64}} [[12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]] Unordered Irregular Points
├──────────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{Symbol, Any} with 1 entry:
  :groupby => :Ti=>CyclicBins(month; cycle=12, step=3, start=12)…
├────────────────────────────────────────────────────────────────────────────────────────── group dims ┤
  ↓ Ti
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
 [12, 1, 2]   1-element YAXArray
 [3, 4, 5]    1-element YAXArray
 [6, 7, 8]    1-element YAXArray
 [9, 10, 11]  1-element YAXArray
sum_days_dim
╭───────────────────────────────────────╮
│ 4-element DimGroupByArray{DimArray,1} │
├───────────────────────────────────────┴────────────────────────────────────────────────────────── dims ┐
  ↓ Ti Sampled{Vector{Int64}} [[12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]] Unordered Irregular Points
├──────────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{Symbol, Any} with 1 entry:
  :groupby => :Ti=>CyclicBins(month; cycle=12, step=3, start=12)…
├────────────────────────────────────────────────────────────────────────────────────────── group dims ┤
  ↓ Ti
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
 [12, 1, 2]   1-element DimArray
 [3, 4, 5]    1-element DimArray
 [6, 7, 8]    1-element DimArray
 [9, 10, 11]  1-element DimArray

And here are the errors:

g_tempo ./ sum_days
ERROR: MethodError: /(::YAXArray{Int64, 1, SubArray{Int64, 1, Vector{Int64}, Tuple{Vector{Int64}}, false}, Tuple{Ti{DimensionalData.Dimensions.LookupArrays.Sampled{CFTime.DateTimeNoLeap, SubArray{CFTime.DateTimeNoLeap, 1, Vector{CFTime.DateTimeNoLeap}, Tuple{Vector{Int64}}, false}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Irregular{Tuple{CFTime.DateTimeNoLeap, CFTime.DateTimeNoLeap}}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}}, ::YAXArray{Int64, 1, Vector{Int64}, Tuple{Ti{DimensionalData.Dimensions.LookupArrays.Sampled{CFTime.DateTimeNoLeap, Vector{CFTime.DateTimeNoLeap}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Irregular{Tuple{CFTime.DateTimeNoLeap, CFTime.DateTimeNoLeap}}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}}) is ambiguous.

Candidates:
  /(c::YAXArray, s)
    @ YAXArrays.Cubes ~/Documents/YAXArrays.jl/src/Cubes/Slices.jl:21
  /(A::AbstractVecOrMat, B::AbstractVecOrMat)
    @ LinearAlgebra ~/.julia/juliaup/julia-1.10.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:1154

Possible fix, define
  /(::Union{YAXArray{T, 1, A} where {T, A<:AbstractVector{T}}, YAXArray{T, 2, A} where {T, A<:AbstractMatrix{T}}}, ::AbstractVecOrMat)

g_tempo_dim ./ sum_days_dim # this one looks closer to a solution
ERROR: ArgumentError: axes of the array (1) do not match number of dimensions (2)
Stacktrace:
  [1] _dimlengtherror(na::Int64, nd::Int64)
    @ DimensionalData ~/.julia/packages/DimensionalData/8RvBF/src/array/array.jl:437
  [2] checkdims(n::Int64, dims::Tuple{Ti{DimensionalData.Dimensions.LookupArrays.Sampled{…}}, DimensionalData.Dimensions.AnonDim{Base.OneTo{…}}})
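
For context on the first error: the candidates listed show that a YAXArray also matches AbstractVecOrMat, so `/(c::YAXArray, s)` wins on the first argument while the LinearAlgebra method wins on the second, and neither is more specific. A minimal sketch of the same situation, using a hypothetical `Wrapped` type and a hypothetical `divide` function (neither exists in YAXArrays or DimensionalData; they are assumptions for illustration only), might look like this:

```julia
# Hypothetical wrapper type that is also an AbstractVector (illustration only).
struct Wrapped{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(w::Wrapped) = size(w.data)
Base.getindex(w::Wrapped, i::Int) = w.data[i]

# Hypothetical function with two overlapping methods:
# the first is specific in its first argument, the second in its second.
divide(c::Wrapped, s) = Wrapped(c.data ./ s)
divide(a::AbstractVector, b::AbstractVector) = a ./ b

# Both methods apply to (Wrapped, Wrapped) and neither is more specific,
# so dispatch throws a MethodError reporting an ambiguity.
err = try
    divide(Wrapped([1, 2]), Wrapped([3, 4]))
catch e
    e
end

# Adding a method that is more specific than both resolves the ambiguity.
divide(a::Wrapped, b::Wrapped) = Wrapped(a.data ./ b.data)
fixed = divide(Wrapped([1, 2]), Wrapped([2, 4]))
```

This mirrors the "Possible fix" hint in the error message, which suggests defining a method that covers the overlapping signature.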

And this one works:

g_tempo_dim ./ [1,2,3,4] # the YAXArray version, g_tempo ./ [1,2,3,4], fails

╭───────────────────────────────────────╮
│ 4-element DimGroupByArray{DimArray,1} │
├───────────────────────────────────────┴────────────────────────────────────────────────────────── dims ┐
  ↓ Ti Sampled{Vector{Int64}} [[12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]] Unordered Irregular Points
├──────────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{Symbol, Any} with 1 entry:
  :groupby => :Ti=>CyclicBins(month; cycle=12, step=3, start=12)…
├────────────────────────────────────────────────────────────────────────────────────────── group dims ┤
  ↓ Ti
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
 [12, 1, 2]   9-element DimArray
 [3, 4, 5]    9-element DimArray
 [6, 7, 8]    9-element DimArray
 [9, 10, 11]  9-element DimArray

rafaqz commented 7 months ago

I don't have ds...

But I think the problem is you need to map the broadcast over the slices:

map(./, g_tempo, sum_days)
# Or depending on what you are broadcasting
map(g -> g ./ sum_days, g_tempo)
# Or double broadcast, which looks kinda weird:
(./).(g_tempo, sum_days)

Because each grouped array in g_tempo actually needs to be broadcast with its matching entry in sum_days.
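
The difference can be sketched with plain nested vectors standing in for the grouped arrays. This is an assumption for illustration: `groups` and `group_sums` are hypothetical names, and no DimensionalData or YAXArrays types are involved. A single `./` over the outer array would call `/` on each pair of whole inner arrays, which is the call shown in the MethodError above, whereas `map(./, ...)` pairs the groups up first and then broadcasts inside each pair:

```julia
# Plain nested vectors stand in for the grouped array (illustration only).
groups = [[31, 31, 28], [31, 30, 31], [30, 31, 31], [30, 31, 30]]

# One single-element array per group, mimicking a per-group sum that keeps
# its dimension instead of dropping it.
group_sums = [[sum(g)] for g in groups]

# Pair each group with its own sum, then broadcast the division within the pair.
weights = map(./, groups, group_sums)

# The "double broadcast" spelling performs the same pairing.
weights_bc = (./).(groups, group_sums)

# Each group's weights add up to one.
normalised = all(w -> sum(w) ≈ 1, weights)
```

Standalone dotted operators such as `./` can be passed as function arguments from Julia 1.6 onwards, which is what both spellings rely on.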

I'm not sure it makes sense to hack broadcast to "make it work"... even in xarray it only works because they hack methods like mean to work like that; custom methods won't broadcast.

lazarusA commented 7 months ago

This one works:

map(./, g_tempo, sum_days)