Open danlooo opened 1 year ago
data cubes of different spatio-temporal resolutions
Isn't this already the case? You can always pass a bunch of YAXArrays of different dimensions into a dataset that can be saved as a .zarr file, or?
Datasets store multiple variables sampled over the same grid, which is defined by their shared axes. However, axes of different resolutions (e.g. the spatial axes) are not the same. Trying this:
using YAXArrays
using Zarr
high_res_cube = YAXArray(rand(10, 10, 3))
low_res_cube = YAXArray(rand(5, 5, 3))
ds = Dataset(high_res = high_res_cube, low_res = low_res_cube)
savedataset(ds; path = "foo.zarr", driver=:zarr)
fails with an error when the dataset is saved to disk:
ERROR: ArgumentError: Can not construct YAXArray, supplied data size is (10, 10, 3) while axis lenghts are (5, 5, 3)
Stacktrace:
[1] YAXArray(axes::Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, data::ZArray{Float64, 3, Zarr.BloscCompressor, DirectoryStore}, properties::Dict{String, Any}, chunks::DiskArrays.GridChunks{3}, cleaner::Vector{YAXArrays.Cubes.CleanMe})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/R6KY3/src/Cubes/Cubes.jl:110
[2] #YAXArray#5
@ ~/.julia/packages/YAXArrays/R6KY3/src/Cubes/Cubes.jl:129 [inlined]
[3] collectfromhandle(e::NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}, dshandle::YAXArrayBase.ZarrDataset, cleaner::Vector{YAXArrays.Cubes.CleanMe})
@ YAXArrays.Datasets ~/.julia/packages/YAXArrays/R6KY3/src/DatasetAPI/Datasets.jl:403
[4] #102
@ ~/.julia/packages/YAXArrays/R6KY3/src/DatasetAPI/Datasets.jl:564 [inlined]
[5] iterate
@ ./generator.jl:47 [inlined]
[6] collect_to!(dest::Vector{YAXArray{Float64, 3, ZArray{Float64, 3, Zarr.BloscCompressor, DirectoryStore}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}}}, itr::Base.Generator{Vector{NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}}, YAXArrays.Datasets.var"#102#108"{YAXArrayBase.ZarrDataset, Vector{YAXArrays.Cubes.CleanMe}}}, offs::Int64, st::Int64)
@ Base ./array.jl:840
[7] collect_to_with_first!(dest::Vector{YAXArray{Float64, 3, ZArray{Float64, 3, Zarr.BloscCompressor, DirectoryStore}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}}}, v1::YAXArray{Float64, 3, ZArray{Float64, 3, Zarr.BloscCompressor, DirectoryStore}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}}, itr::Base.Generator{Vector{NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}}, YAXArrays.Datasets.var"#102#108"{YAXArrayBase.ZarrDataset, Vector{YAXArrays.Cubes.CleanMe}}}, st::Int64)
@ Base ./array.jl:818
[8] _collect(c::Vector{NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}}, itr::Base.Generator{Vector{NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}}, YAXArrays.Datasets.var"#102#108"{YAXArrayBase.ZarrDataset, Vector{YAXArrays.Cubes.CleanMe}}}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
@ Base ./array.jl:812
[9] collect_similar(cont::Vector{NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}}, itr::Base.Generator{Vector{NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}}, YAXArrays.Datasets.var"#102#108"{YAXArrayBase.ZarrDataset, Vector{YAXArrays.Cubes.CleanMe}}})
@ Base ./array.jl:711
[10] map(f::Function, A::Vector{NamedTuple{(:name, :t, :chunks, :axes, :attr, :subs, :require_CF, :offs), Tuple{String, DataType, Tuple{Int64, Int64, Int64}, Vector{RangeAxis{Int64, _A, Base.OneTo{Int64}} where _A}, Dict{String, Any}, Nothing, Bool, Dict{Symbol, Int64}}}})
@ Base ./abstractarray.jl:3261
[11] savedataset(ds::Dataset; path::String, persist::Nothing, overwrite::Bool, append::Bool, skeleton::Bool, backend::Symbol, driver::Symbol, max_cache::Float64, writefac::Float64, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ YAXArrays.Datasets ~/.julia/packages/YAXArrays/R6KY3/src/DatasetAPI/Datasets.jl:564
[12] top-level scope
@ REPL[20]:1
In the Common Data Model V4, multiple datasets can be stored in the same file. They are organized in (nested) groups, analogous to files in directories and subdirectories.
For example, `xarray.Dataset.to_zarr` has the option `group` to specify the path inside the zarr storage in which the dataset should be stored. Similarly, `zarr.hierarchy.group` has the option `path` to specify the (group) path. The prototype xarray-datatree (also part of the xarray roadmap) uses this to represent a tree of Datasets as its own type. I think it is already implemented in the Zarr.jl function `Zarr.zcreate` via the option `name`.

This is of particular importance when it comes to storing data cubes of different spatio-temporal resolutions in the same store. It'd be great to have an additional `group` option for the functions `savedataset` and `savecube`.
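To illustrate the layout I have in mind, here is a minimal sketch that uses Zarr.jl directly rather than YAXArrays. It assumes that `zgroup` and `zcreate` accept a parent group as shown. The store path and the array name `data` are made up for this example. It only writes raw arrays, so no YAXArrays axis metadata ends up in the store, and it is not a proposal for how `savedataset` should implement the option.

```julia
using Zarr

# Hypothetical store location; a temporary directory keeps the sketch self-contained.
store_path = joinpath(mktempdir(), "foo.zarr")

# Root group of the store, with one subgroup per resolution.
root = zgroup(DirectoryStore(store_path))
high_res_group = zgroup(root, "high_res")
low_res_group = zgroup(root, "low_res")

# Each subgroup holds an array with its own shape, so the two grids never collide.
# The array name "data" is an assumption made for this example.
high_res = zcreate(Float64, high_res_group, "data", 10, 10, 3)
low_res = zcreate(Float64, low_res_group, "data", 5, 5, 3)
high_res[:, :, :] = rand(10, 10, 3)
low_res[:, :, :] = rand(5, 5, 3)

# Reopen the store and look up both resolutions by their group path.
reopened = zopen(store_path)
high_res_size = size(reopened["high_res"]["data"])
low_res_size = size(reopened["low_res"]["data"])
```

With a `group` option, `savedataset` could write each Dataset into its own subgroup in the same way, and the conflicting axis lengths would no longer meet in one group.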