JuliaDataCubes / YAXArrays.jl

Yet Another XArray-like Julia package
https://juliadatacubes.github.io/YAXArrays.jl/

Seasonal or arbitrary subsetting #250

Closed: Balinus closed this issue 7 months ago

Balinus commented 1 year ago

Hello!

Is there a way to subset a cube to keep only a given period of each year? For example, if I have an hourly Cube ranging from 1981-01-01 to 2020-12-31, I'd like to extract only the months of, say, May, June and July, for all years:

The time axis would be something like:

DateTime(1981,5,1)..DateTime(1981,7,31,23,59)
DateTime(1982,5,1)..DateTime(1982,7,31,23,59)
[...]
DateTime(2019,5,1)..DateTime(2019,7,31,23,59)
DateTime(2020,5,1)..DateTime(2020,7,31,23,59)
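
For illustration, the yearly windows listed above can be generated with the Dates standard library alone. This is only a sketch: the names `years`, `windows` and `in_season` are hypothetical, and nothing here touches YAXArrays or the actual cube.

```julia
using Dates

# Hypothetical names; the range of years follows the example in the question.
years = 1981:2020

# One (start, stop) pair per year, covering May 1st 00:00 through July 31st 23:59.
windows = [(DateTime(y, 5, 1), DateTime(y, 7, 31, 23, 59)) for y in years]

# A timestamp belongs to the seasonal subset if it falls inside any window.
in_season(d::DateTime) = any(w -> w[1] <= d <= w[2], windows)

println(length(windows))                        # one window per year
println(in_season(DateTime(1999, 6, 15, 12)))   # mid-June is inside a window
println(in_season(DateTime(1999, 8, 1)))        # August is outside every window
```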

DimensionalData.jl seems to have the appropriate tools:

using Dates
using DimensionalData, YAXArrayBase

t = DateTime(2001,1,1):Day(1):DateTime(2003,12, 31)
x = 10:10:1095

#3D array
A = rand(X(x), Y(x), Ti(t))

109×109×1095 DimArray{Float64,3} with dimensions: 
  X Sampled{Int64} 10:10:1090 ForwardOrdered Regular Points,
  Y Sampled{Int64} 10:10:1090 ForwardOrdered Regular Points,
  Ti Sampled{DateTime} DateTime("2001-01-01T00:00:00"):Day(1):DateTime("2003-12-31T00:00:00") ForwardOrdered Regular Points
[:, :, 1]

# Subsetting
A[Ti(Where(x -> month(x) >= 5 && month(x) <= 7))]

109×109×1095 DimArray{Float64,3} with dimensions: 
  X Sampled{Int64} 10:10:1090 ForwardOrdered Regular Points,
  Y Sampled{Int64} 10:10:1090 ForwardOrdered Regular Points,
  Ti Sampled{DateTime} DateTime("2001-01-01T00:00:00"):Day(1):DateTime("2003-12-31T00:00:00")
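
As a sanity check that does not depend on any array package, the same month predicate can be applied directly to the time axis to get the integer positions it selects. This sketch uses only Dates; `keep` is a hypothetical name, and the small random array stands in for real data. Whether passing such an index vector to a YAXArray avoids loading the data into memory is an assumption that would need to be verified.

```julia
using Dates

# Same daily time axis as in the example above.
t = DateTime(2001, 1, 1):Day(1):DateTime(2003, 12, 31)

# Positions along the time axis whose month is May, June or July.
keep = findall(d -> 5 <= month(d) <= 7, t)

# 92 days per year (31 + 30 + 31) over three years.
println(length(t))     # 1095
println(length(keep))  # 276

# The positions can index the last dimension of any ordinary array.
A = rand(4, 4, length(t))
B = A[:, :, keep]
println(size(B))       # (4, 4, 276)
```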

Just wondering if we can avoid converting to a DimArray with the current syntax: my Dataset is ~170GB and I am not sure this scales well, since the dataset seems to be loaded into memory each time I do a different subset.

Cheers!

lazarusA commented 1 year ago

Indeed, that's an issue. @felixcremer is working on it, so that the intermediate step will no longer be needed. Unfortunately, it is still not functional.

Balinus commented 1 year ago

ok, thanks for the quick feedback. Since it's a script that I will run one-time (saving the result), it's not a big obstacle right now. Cheers!

meggart commented 1 year ago

I agree this will be solved once we have merged #249