JuliaDataCubes / YAXArrays.jl

Yet Another XArray-like Julia package
https://juliadatacubes.github.io/YAXArrays.jl/

mapslices on a Dataset returns a wrong cube #232

Open Qfl3x opened 1 year ago

Qfl3x commented 1 year ago

Running mapslices on a Dataset should return another Dataset; instead, it returns an odd cube:

using YAXArrays
using Statistics  # provides mean

a = YAXArray(ones(10,10,10))
b = YAXArray(ones(10,10,10))
c = YAXArray(ones(10,10,10))

# Named test_axes to match its use below (and to avoid shadowing Base.axes)
test_axes = Dict(:Dim_1 => a.axes[1], :Dim_2 => a.axes[2], :Dim_3 => a.axes[3])

dataset = Dataset(Dict(:a => a, :b => b, :c => c), test_axes, Dict("test" => "test"))

slice = mapslices(mean, dataset; dims="Dim_1")

println(slice)

I expect a Dataset to be returned with the correct variables (a, b and c); instead, I get this cube:

YAXArray with the following dimensions
OutAxis1            Axis with 10 Elements from 1 to 10
Dim_2               Axis with 10 Elements from 1 to 10
Dim_3               Axis with 10 Elements from 1 to 10
Total size: 7.81 KB

In addition, the following message is logged:

[ Info: Found multiple matching axes for output dimension 1

felixcremer commented 1 year ago

Yes, that is a problem. I would also expect a Dataset with the variables a, b, c and the dimensions Dim_2, Dim_3. Currently, I think we can achieve this behaviour by broadcasting mapslices over the values of the ds.cubes OrderedDict and then stitching everything together into a new Dataset; see the MWE below.

The only open question for me is what we should do if one of the cubes in the Dataset does not have the dimension used in mapslices. I would expect mapslices to be applied to the cubes that have the sliced dimension and the cubes without it to be skipped.

julia> a = YAXArray((RangeAxis("X",1:10), RangeAxis("Y", 1:20)), rand(10,20));

julia> ds = Dataset(;a, b=a)
YAXArray Dataset
Dimensions: 
   X                   Axis with 10 Elements from 1 to 10
   Y                   Axis with 20 Elements from 1 to 20
Variables: a b 

julia> Dataset(;(;zip(keys(ds.cubes), mapslices.(mean, values(ds.cubes); dims=Ref(:X)))...)...)
YAXArray Dataset
Dimensions: 
   Y                   Axis with 20 Elements from 1 to 20
Variables: a b
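
The open question above, what to do with cubes that lack the sliced dimension, can be illustrated with a minimal sketch in plain Julia. This does not use the YAXArrays API: ToyCube, reduce_or_skip and reduce_dataset are hypothetical stand-ins invented for illustration, and "skip" is assumed here to mean "pass the cube through unchanged", which is only one possible reading.

```julia
using Statistics  # provides mean

# Hypothetical stand-in for a cube: dimension names plus plain array data.
# This is NOT the YAXArrays API; it only models the proposed skip policy.
struct ToyCube
    dims::Vector{Symbol}
    data::Array{Float64}
end

# Reduce `cube` over `dim` with `f`; return the cube untouched if it lacks `dim`.
function reduce_or_skip(f, cube::ToyCube, dim::Symbol)
    i = findfirst(==(dim), cube.dims)
    i === nothing && return cube  # skip: the dimension is not present
    # Base.mapslices keeps the reduced dimension with length 1, so drop it.
    reduced = dropdims(mapslices(f, cube.data; dims=i); dims=i)
    return ToyCube(deleteat!(copy(cube.dims), i), reduced)
end

# Apply the reduction variable by variable, keeping names and order.
function reduce_dataset(f, cubes::Vector{Pair{Symbol,ToyCube}}, dim::Symbol)
    return [name => reduce_or_skip(f, cube, dim) for (name, cube) in cubes]
end

cubes = [
    :a => ToyCube([:X, :Y], ones(10, 20)),
    :b => ToyCube([:X, :Y], ones(10, 20)),
    :c => ToyCube([:Y], ones(20)),  # has no X dimension
]

result = reduce_dataset(mean, cubes, :X)
for (name, cube) in result
    println(name, ": dims = ", cube.dims, ", size = ", size(cube.data))
end
```

Under this policy all three variables survive and end up sharing only the Y dimension, so they could still be stitched into a single dataset. Dropping the skipped variables instead, or raising an error, would be equally defensible choices.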