rafaqz / DimensionalData.jl

Named dimensions and indexing for Julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License

broadcasted_dims on groupby #641

Closed. lazarusA closed this issue 7 months ago

lazarusA commented 7 months ago

Consider the following:

using DimensionalData
using YAXArrays
using Dates
using Statistics

axlist = (
    Dim{:Ti}(Date("2021-12-01"):Day(1):Date("2022-12-31")),
    X(range(1, 10, length=10)),
    Y(range(1, 5, length=15)),
    Dim{:Variable}(["var1", "var2"]))
data = rand(396, 10, 15, 2)
ds = YAXArray(axlist, data)

tempo = dims(ds, Dim{:Ti})  # Dim{:Ti}, not Ti; possibly a YAXArrays quirk.
month_length = YAXArray((tempo,), daysinmonth.(tempo))

g_tempo = groupby(month_length, Dim{:Ti} => season(; start=December))

sum_days = sum.(g_tempo, dims=Dim{:Ti})
weights = map(./, g_tempo, sum_days)

g_ds = groupby(ds, Dim{:Ti} => season(; start=December))

# TODO 
# broadcast_dims.(*, weights, Ref(g_ds))
# g_ds_w = weights .* g_ds # the red (first dimension)
# sum.(g_ds_w, dims = Dim{:Ti})

rafaqz commented 7 months ago

YAX bug? It works if both sides are converted to DimArray first:

julia> g_ds_w = broadcast_dims.(*, DimArray.(weights), DimArray.(g_ds))
╭───────────────────────────────────────╮
│ 4-element DimGroupByArray{DimArray,1} │
├───────────────────────────────────────┴─────────────────────────────────────────────── dims ┐
  ↓ Ti Categorical{Symbol} [:Dec_Jan_Feb, :Mar_Apr_May, :Jun_Jul_Aug, :Sep_Oct_Nov] Unordered
├───────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{Symbol, Any} with 1 entry:
  :groupby => :Ti=>CyclicBins(month; cycle=12, step=3, start=12)…
├─────────────────────────────────────────────────────────────────────────────── group dims ┤
  ↓ Ti, → X, ↗ Y, ⬔ Variable
└───────────────────────────────────────────────────────────────────────────────────────────┘
 :Dec_Jan_Feb  121×10×15×2 DimArray
 :Mar_Apr_May   92×10×15×2 DimArray
 :Jun_Jul_Aug   92×10×15×2 DimArray
 :Sep_Oct_Nov   91×10×15×2 DimArray

@felixcremer, it looks like you need to implement rebuild(::YAXArray; kw...) because you don't have a metadata field.

Oops, you do have one; it just has a restricted type. Maybe loosen it to a type parameter? https://github.com/JuliaDataCubes/YAXArrays.jl/blob/master/src/Cubes/Cubes.jl#L203
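
To illustrate why a restricted field type can produce this kind of `convert` error, here is a minimal, dependency-free sketch. `StrictCube`, `LooseCube`, and `NoMeta` are hypothetical stand-ins, not the actual YAXArrays or DimensionalData definitions. The point is only that a field typed as `Dict{String,Any}` forces a `convert` at construction, while a type parameter stores whatever metadata object it is given.

```julia
# Hypothetical stand-in for a sentinel "no metadata" type.
struct NoMeta end

# Field restricted to a concrete Dict type: the default constructor
# calls convert(Dict{String,Any}, properties) on whatever it receives.
struct StrictCube
    data::Vector{Float64}
    properties::Dict{String,Any}
end

# Field type loosened to a parameter: any metadata object is stored as-is.
struct LooseCube{P}
    data::Vector{Float64}
    properties::P
end

# Works: P is inferred as NoMeta.
loose = LooseCube([1.0, 2.0], NoMeta())

# Fails with a MethodError, because no convert method exists
# from NoMeta to Dict{String,Any}.
strict_failed = try
    StrictCube([1.0, 2.0], NoMeta())
    false
catch err
    err isa MethodError
end
```

Loosening the field to a parameter keeps existing `Dict` properties working while also accepting sentinel types.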

NoMetadata is just like nothing, except that get(nometadata, :key, default) won't fail on it, and unlike Dict() it doesn't allocate. You could catch it and construct a Dict{String,Any}() instead if you always want a Dict.
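
The catch-and-convert suggestion could look roughly like the sketch below. `to_properties` is a hypothetical helper name, not an existing YAXArrays or DimensionalData function, and the sketch assumes `DimensionalData.NoMetadata` resolves in the installed version of the package.

```julia
using DimensionalData

# Hypothetical helper: normalise any metadata object to Dict{String,Any}.
# NoMetadata becomes an empty Dict; other dictionaries get String keys.
to_properties(::DimensionalData.NoMetadata) = Dict{String,Any}()
to_properties(d::AbstractDict) = Dict{String,Any}(string(k) => v for (k, v) in d)

md = DimensionalData.NoMetadata()

# get falls back to the supplied default instead of throwing.
units = get(md, :units, "unknown")

empty_props = to_properties(md)
props = to_properties(Dict(:units => "K"))
```

A `rebuild` method could pass `to_properties(metadata)` on to the constructor, at the cost of allocating a `Dict` for every rebuilt array.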

julia> g_ds_w = broadcast_dims.(*, weights, g_ds)
ERROR: MethodError: Cannot `convert` an object of type 
  DimensionalData.Dimensions.LookupArrays.NoMetadata to an object of type 
  Dict{String}

Closest candidates are:
  convert(::Type{T}, ::T) where T<:AbstractDict
   @ Base abstractdict.jl:565
  convert(::Type{T}, ::T) where T
   @ Base Base.jl:84
  convert(::Type{T}, ::AbstractDict) where T<:AbstractDict
   @ Base abstractdict.jl:567

Stacktrace:
  [1] YAXArray(axes::Tuple{…}, data::Array{…}, properties::DimensionalData.Dimensions.LookupArrays.NoMetadata, chunks::DiskArrays.GridChunks{…}, cleaner::Vector{…})
    @ YAXArrays.Cubes ~/.julia/packages/YAXArrays/vR35N/src/Cubes/Cubes.jl:123
  [2] YAXArray(axes::Tuple{…}, data::Array{…}, properties::DimensionalData.Dimensions.LookupArrays.NoMetadata; cleaner::Vector{…}, chunks::DiskArrays.GridChunks{…})
    @ YAXArrays.Cubes ~/.julia/packages/YAXArrays/vR35N/src/Cubes/Cubes.jl:136
  [3] rebuild(A::YAXArray{…}; data::Array{…}, dims::Tuple{…}, metadata::DimensionalData.Dimensions.LookupArrays.NoMetadata, kw::@Kwargs{…})
    @ YAXArrays.Cubes ~/.julia/packages/YAXArrays/vR35N/src/Cubes/Cubes.jl:203
  [4] similar(A::YAXArray{Float64, 1, Vector{…}, Tuple{…}}, ::Type{Float64}, D::Tuple{Dim{…}, X{…}, Y{…}, Dim{…}})
    @ DimensionalData ~/.julia/dev/DimensionalData/src/array/array.jl:244
  [5] broadcast_dims(::Function, ::YAXArray{Float64, 1, Vector{Float64}, Tuple{Dim{…}}}, ::Vararg{AbstractDimArray})
    @ DimensionalData ~/.julia/dev/DimensionalData/src/utils.jl:117
  [6] _broadcast_getindex_evalf
    @ ./broadcast.jl:709 [inlined]
  [7] _broadcast_getindex
    @ ./broadcast.jl:682 [inlined]
  [8] getindex
    @ ./broadcast.jl:636 [inlined]
  [9] copy
    @ ./broadcast.jl:942 [inlined]
 [10] copy(bc::Base.Broadcast.Broadcasted{DimensionalData.DimensionalStyle{…}, Nothing, typeof(broadcast_dims), Tuple{…}})
    @ DimensionalData ~/.julia/dev/DimensionalData/src/array/broadcast.jl:39
 [11] copy(bc::Base.Broadcast.Broadcasted{DimensionalData.DimensionalStyle{…}, Tuple{…}, typeof(broadcast_dims), Tuple{…}})
    @ DimensionalData ~/.julia/dev/DimensionalData/src/array/broadcast.jl:39
 [12] materialize(bc::Base.Broadcast.Broadcasted{…})
    @ Base.Broadcast ./broadcast.jl:903
 [13] top-level scope
    @ REPL[55]:1
Some type information was truncated. Use `show(err)` to see complete types.