ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes
Apache License 2.0

Select dataset/group while opening grib file with xarray #372

Open vinodkatmos opened 8 months ago

vinodkatmos commented 8 months ago

Is your feature request related to a problem? Please describe.

Currently, if I open a heterogeneous GRIB file using xarray, I have to select the appropriate dataset by providing a suitable filter_by_keys. However, this does not always work. In my case, data = xr.open_dataset(filein, engine='cfgrib', backend_kwargs={'indexpath': 'temp/{short_hash}.idx', 'filter_by_keys':{'typeOfLevel':'hybrid'}}) raised an error:

cfgrib.dataset.DatasetBuildError: key present and new value is different: key='hybrid' value=Variable(dimensions=('hybrid',), data=array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39., 40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 50., 51., 52., 53., 54., 55., 56., 57., 58., 59., 60.])) new_value=Variable(dimensions=(), data=1.0)

This happened because my GRIB file contains two datasets on hybrid levels: one on 60 hybrid levels and another on a single hybrid level. Both match typeOfLevel='hybrid', so the filter cannot tell them apart.
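One possible workaround, which is not part of the original report, is to narrow the filter until only one of the two groups matches. The sketch below builds the backend_kwargs dictionary for that. The helper name hybrid_backend_kwargs is hypothetical, and it assumes that adding the ecCodes key shortName to filter_by_keys is enough to isolate a group, which works for lnsp (the only variable on the single hybrid level in this file) but would not isolate the 60-level group as a whole.

```python
from typing import Any, Dict, Optional


def hybrid_backend_kwargs(
    short_name: Optional[str] = None,
    indexpath: str = "temp/{short_hash}.idx",
) -> Dict[str, Any]:
    """Build backend_kwargs for the cfgrib engine (hypothetical helper).

    Always filters on typeOfLevel='hybrid'; optionally narrows the
    selection to a single variable through its GRIB shortName.
    """
    keys: Dict[str, Any] = {"typeOfLevel": "hybrid"}
    if short_name is not None:
        keys["shortName"] = short_name
    return {"indexpath": indexpath, "filter_by_keys": keys}


# Usage (needs xarray, cfgrib and the GRIB file, so it is left commented out):
# import xarray as xr
# lnsp = xr.open_dataset(
#     filein, engine="cfgrib", backend_kwargs=hybrid_backend_kwargs("lnsp")
# )
```

This only helps when the group can be described by GRIB keys known in advance, which is the limitation the request below is about.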

Describe the solution you'd like

If I open the same file with data = cfgrib.open_datasets(filein, indexpath='temp/{short_hash}.idx'), it returns a list of xarray datasets, nicely grouped.

data[0]

    Dimensions:     (hybrid: 60, values: 348528)
    Coordinates:
        time        datetime64[ns] 2007-09-12
        step        timedelta64[ns] 06:00:00
        hybrid      (hybrid) float64 1.0 2.0 3.0 4.0 5.0 ... 57.0 58.0 59.0 60.0
        latitude    (values) float64 89.73 89.73 89.73 ... -89.73 -89.73 -89.73
        longitude   (values) float64 0.0 20.0 40.0 60.0 ... 280.0 300.0 320.0 340.0
        valid_time  datetime64[ns] 2007-09-12T06:00:00
    Dimensions without coordinates: values
    Data variables: (12/19)
        t           (hybrid, values) float32 ...
        q           (hybrid, values) float32 ...
        aermr01     (hybrid, values) float32 ...
        aermr02     (hybrid, values) float32 ...
        aermr03     (hybrid, values) float32 ...
        aermr04     (hybrid, values) float32 ...
        ...          ...
        no2         (hybrid, values) float32 ...
        so2         (hybrid, values) float32 ...
        co          (hybrid, values) float32 ...
        hcho        (hybrid, values) float32 ...
        go3         (hybrid, values) float32 ...
        aerext355   (hybrid, values) float32 ...
    Attributes:
        GRIB_edition:            2
        GRIB_centre:             ecmf
        GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
        GRIB_subCentre:          0
        Conventions:             CF-1.7
        institution:             European Centre for Medium-Range Weather Forecasts

data[1]

    Dimensions:     (values: 348528)
    Coordinates:
        time        datetime64[ns] 2007-09-12
        step        timedelta64[ns] 06:00:00
        hybrid      float64 1.0
        latitude    (values) float64 ...
        longitude   (values) float64 ...
        valid_time  datetime64[ns] ...
    Dimensions without coordinates: values
    Data variables:
        lnsp        (values) float32 ...
    Attributes:
        GRIB_edition:            2
        GRIB_centre:             ecmf
        GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
        GRIB_subCentre:          0
        Conventions:             CF-1.7
        institution:             European Centre for Medium-Range Weather Forecasts

Can we implement similar behaviour in xarray, so that the user can select which data group to load just by providing an index, analogous to the index into the list returned by cfgrib.open_datasets?
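To make the request concrete, here is a minimal sketch of the behaviour on the user side, built only on cfgrib.open_datasets as used above. The names select_group and open_group are hypothetical and not part of cfgrib or xarray. Selecting by variable name is included as an assumption-free alternative to the index, since I am not certain that the order of the returned list is guaranteed to be stable.

```python
from typing import Any, Optional, Sequence


def select_group(
    datasets: Sequence[Any],
    index: Optional[int] = None,
    variable: Optional[str] = None,
) -> Any:
    """Pick one dataset from a list, by position or by a variable it contains.

    Hypothetical helper: `datasets` is expected to be a list of
    xarray.Dataset objects such as the one cfgrib.open_datasets returns.
    """
    # Exactly one of the two selectors must be given.
    if (index is None) == (variable is None):
        raise ValueError("pass exactly one of 'index' or 'variable'")
    if index is not None:
        return datasets[index]
    # Keep only the groups whose data variables include the requested name.
    matches = [ds for ds in datasets if variable in ds.data_vars]
    if len(matches) != 1:
        raise LookupError(f"{len(matches)} groups contain {variable!r}")
    return matches[0]


def open_group(
    path: str,
    index: Optional[int] = None,
    variable: Optional[str] = None,
    **kwargs: Any,
) -> Any:
    """Open every group in a GRIB file and return the selected one."""
    import cfgrib  # imported lazily so select_group works without cfgrib

    datasets = cfgrib.open_datasets(path, **kwargs)
    return select_group(datasets, index=index, variable=variable)
```

With this, open_group(filein, index=1, indexpath='temp/{short_hash}.idx') or open_group(filein, variable='lnsp', indexpath='temp/{short_hash}.idx') would return the single-level dataset shown as data[1]. Note that the sketch still builds every group before discarding the others, so a built-in option could be more efficient.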

Describe alternatives you've considered

No response

Additional context

No response

Organisation

EUMETSAT