Open TomNicholas opened 8 months ago
We don't actually need to wait for anything upstream in xarray before making something useful here. We could simply create a new virtualizarr.open_virtual_datatree function, which would detect the filetype, loop over the groups, and use open_virtual_dataset (or kerchunk directly if necessary) to first create the virtual xr.Dataset objects, then put them all into a datatree.DataTree to return. This function could be modelled after how datatree.open_datatree currently works.
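A minimal sketch of the "loop over the groups" step, kept independent of any library so the shape of the idea is clear. The names collect_virtual_groups, list_groups, and open_group are hypothetical and not part of virtualizarr or datatree. In practice open_group might wrap open_virtual_dataset (assuming it can target a single group), and the returned dict could presumably be handed to DataTree.from_dict.

```python
from typing import Callable, Dict, Iterable, TypeVar

DS = TypeVar("DS")  # stand-in for a virtual xr.Dataset


def collect_virtual_groups(
    filepath: str,
    list_groups: Callable[[str], Iterable[str]],
    open_group: Callable[[str, str], DS],
) -> Dict[str, DS]:
    """Open every group in `filepath` and key the results by group path.

    Both callables are hypothetical hooks: `list_groups` returns the group
    names found in the file, `open_group` opens one group as a virtual dataset.
    """
    datasets: Dict[str, DS] = {}
    for group in list_groups(filepath):
        # Normalise to an absolute, slash-separated path such as "/a/b";
        # the root group becomes "/".
        path = "/" + group.strip("/")
        datasets[path] = open_group(filepath, group)
    return datasets
```

Passing the opener in as an argument keeps the filetype-specific logic (HDF5, netCDF, kerchunk references, and so on) out of the loop itself.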
At that point you would have a datatree.DataTree object wrapping lots of ManifestArray objects (let's call it vdt1 for "virtual datatree 1"). You could concatenate two such trees using:
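import xarray as xr
from datatree import map_over_subtree
# map_over_subtree is a decorator, and xr.concat expects a list of objects
concat_along_time = map_over_subtree(lambda a, b: xr.concat([a, b], dim='time'))
combined_virtual_tree = concat_along_time(vdt1, vdt2)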
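To illustrate what a node-wise operation over two trees involves, here is a library-free sketch that represents each tree as a dict of path to data and applies a function to every pair of nodes sharing a path. The helper map_over_matching_nodes is hypothetical, and plain lists stand in for virtual datasets. It only mirrors the general idea of mapping over trees with identical structure, not datatree's actual implementation.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def map_over_matching_nodes(
    func: Callable[[T, T], T],
    tree_a: Dict[str, T],
    tree_b: Dict[str, T],
) -> Dict[str, T]:
    """Apply `func` to each pair of nodes that share a path in both trees."""
    if tree_a.keys() != tree_b.keys():
        # Both trees must have the same set of node paths
        mismatched = sorted(tree_a.keys() ^ tree_b.keys())
        raise ValueError(f"trees differ at paths: {mismatched}")
    return {path: func(tree_a[path], tree_b[path]) for path in tree_a}


# Stand-in data: each "dataset" is just a list of time steps
vdt1 = {"/": [], "/ocean": [0, 1], "/atmos": [0, 1]}
vdt2 = {"/": [], "/ocean": [2, 3], "/atmos": [2, 3]}

# List concatenation plays the role of concatenating along 'time'
combined = map_over_matching_nodes(lambda a, b: a + b, vdt1, vdt2)
```

The structure check matters because concatenating group by group only makes sense when both files expose the same groups.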
(cc @maxrjones, who asked about doing something similar but for nested HDF5 files)
Recently a way of kerchunking GRIB data as a DataTree object was added in https://github.com/fsspec/kerchunk/pull/399. Since the ongoing xarray-datatree integration is adding an open_datatree method to xarray's BackendEntrypoint classes, it's likely that we could make an open_datatree method that understands how to read a GRIB file and return a datatree containing ManifestArray objects.