In trying to create a nicer way to access ocean model output (several stacks of netCDF files, where each stack can be concatenated but the stacks can't necessarily be merged into a single Dataset object), I've been able to construct a DataTree object:
```python
hourly = xr.concat([virtualizarr.open_virtual_dataset(path, ...) for path in hourly_paths], dim="time", ...)
daily = xr.concat([virtualizarr.open_virtual_dataset(path, ...) for path in daily_paths], dim="time", ...)
monthly = xr.concat([virtualizarr.open_virtual_dataset(path, ...) for path in monthly_paths], dim="time", ...)
tree = DataTree.from_dict({"/": ..., "/hourly": hourly, "/daily": daily, "/monthly": monthly})
```
but would then need a way to write that tree to disk.
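In the meantime, one could sketch a workaround that walks the tree and writes each non-empty node's dataset with VirtualiZarr's existing per-Dataset `.virtualize.to_kerchunk` accessor. This is only a sketch: the `.subtree` / `.ds` tree-walking attributes and the one-file-per-group naming scheme are assumptions, not a settled design.

```python
def write_tree_as_kerchunk(tree, out_dir):
    """Write one kerchunk JSON file per non-empty group in the tree (sketch)."""
    for node in tree.subtree:  # iterate over every group in the tree
        if node.ds is None or len(node.ds.variables) == 0:
            continue  # skip empty intermediate groups (e.g. a bare root)
        # Derive a flat filename from the group path, e.g. "/hourly" -> "hourly.json"
        name = node.path.strip("/").replace("/", "_") or "root"
        node.ds.virtualize.to_kerchunk(f"{out_dir}/{name}.json", format="json")
```

The obvious downside is that the group hierarchy only survives in the filenames, which is exactly why a real DataTree serializer would be nicer.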
The current file formats (except maybe parquet, though I'm not sure) should definitely support this, since they're based on zarr; we'd just need to create a DataTree accessor and write the code to serialize DataTree objects containing ManifestArrays.
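To illustrate why the zarr-based formats support this naturally: kerchunk-style reference sets already key entries by path, so nested groups just become key prefixes. A serialized tree might look roughly like the following, where all variable names, file paths, and byte ranges are made up for illustration:

```python
# Illustrative shape of a kerchunk-style reference set for a DataTree with
# "hourly" and "daily" groups; every path and byte range here is invented.
refs = {
    "version": 1,
    "refs": {
        ".zgroup": '{"zarr_format": 2}',          # root group metadata
        "hourly/.zgroup": '{"zarr_format": 2}',   # nested group -> key prefix
        "hourly/temp/.zarray": "...",             # array metadata (elided)
        "hourly/temp/0.0.0": ["s3://bucket/hourly_0001.nc", 8000, 48000],
        "daily/.zgroup": '{"zarr_format": 2}',
        "daily/temp/0.0.0": ["s3://bucket/daily_0001.nc", 8000, 48000],
    },
}
```

So the serializer mostly has to prepend each node's tree path to the keys it would already emit for a flat Dataset.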
I was going to say that this is a duplicate of #84, but it's actually not: being able to write from a DataTree is useful even if we haven't yet implemented open_virtual_datatree.
Edit: related to #84 and #11