ecmwf / anemoi-datasets

Apache License 2.0

Specify groups when reading NetCDF files #29

Closed oriolhinojoeum closed 4 weeks ago

oriolhinojoeum commented 1 month ago

Is your feature request related to a problem? Please describe.

I have a NetCDF file in which the relevant data is stored in a group called "measurements". Would it be possible to specify the group in the recipe.yml? If I do not specify it somehow, anemoi fails, reporting that it cannot find any data. That is accurate, because it does not search inside groups.

Describe the solution you'd like

I would like to be able to define it in the recipe.yml files, something like:

dates:
  start: 2022-01-30T00:00:00Z
  end: 2022-02-31T19:00:00Z

input:
  netcdf:
    path: sample_groups.nc
    groups: measurements

Describe alternatives you've considered

No response

Additional context

This is how I would access the data using xarray in Python:

import xarray as xr
path = "sample_groups.nc"
# Open the "measurements" group of the NetCDF file as an xarray Dataset
ds = xr.open_dataset(path, group="measurements")

ds

Here is a sample dataset you can try it with: https://wekeo-files.prod.wekeo2.eu/index.php/s/caTNZZXR2GF6pJY

Organisation

No response

b8raoult commented 1 month ago

Should it be group or groups? Do you expect to have data from several groups simultaneously?

oriolhinojoeum commented 1 month ago

Thanks for your message. All the information relevant to normal users is in the group "measurements", so I think it could be simplified to reading from a single group.

b8raoult commented 1 month ago

I had a look at the "measurements" group. The "time" variable is not a coordinate. It is also missing attributes.
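
For illustration, here is a minimal xarray sketch of how a "time" data variable could be promoted to a coordinate and given attributes before writing the file. The dataset below is an in-memory stand-in, not the actual sample file: the "obs" dimension, the "temperature" variable, and the attribute values are all assumptions, and which attributes anemoi actually needs is not stated in this thread.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the "measurements" group: "time" is stored as a
# plain data variable along an assumed "obs" dimension, with no attributes.
times = np.array(
    ["2022-01-30T00:00", "2022-01-30T01:00", "2022-01-30T02:00"],
    dtype="datetime64[ns]",
)
ds = xr.Dataset(
    {
        "time": ("obs", times),
        "temperature": ("obs", np.array([271.3, 271.9, 272.4])),
    }
)

# Promote "time" from a data variable to a coordinate, then make it the
# dimension coordinate in place of the anonymous "obs" dimension.
fixed = ds.set_coords("time").swap_dims({"obs": "time"})

# Attach descriptive attributes (the values here are illustrative assumptions).
fixed["time"].attrs["standard_name"] = "time"
fixed["time"].attrs["long_name"] = "time"

print(fixed)
```

After this, "time" appears under the dataset's coordinates rather than its data variables, and "temperature" is indexed by "time".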

b8raoult commented 4 weeks ago

This is possible in the latest version:

input:
  netcdf:
    path: ...
    group: ...