ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes
Apache License 2.0

Add option ignore_keys (similar to filter_keys) #285

Closed floriankrb closed 5 months ago

floriankrb commented 2 years ago

When a user opens a GRIB file containing multiple messages with inconsistent values for the same key, cfgrib (rightfully) refuses to merge them and raises an error such as:

DatasetBuildError: multiple values for unique key, try re-open the file with one of:
    filter_by_keys={'dataType': 'pf'}
    filter_by_keys={'dataType': 'cf'}

While this solves some issues, it is not enough when the user does not want to filter but would rather ignore the key altogether. I would like to complement the filter_by_keys option with a new ignore_keys option.

For each key the user lists in ignore_keys, cfgrib should force the value to None when reading that key, so that differing values no longer conflict. The error message above could then become:

DatasetBuildError: multiple values for unique key, try re-open the file with one of:
    filter_by_keys={'dataType': 'pf'}
    filter_by_keys={'dataType': 'cf'}
    ignore_keys=['dataType']
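
To make the proposal concrete, here is a minimal pure-Python sketch of the intended behaviour. It is not cfgrib's code: `read_unique_keys`, the stand-in `DatasetBuildError` class and the dict-based messages are all hypothetical, and only the `ignore_keys` semantics (read the key as None) and the wording of the error message come from the description above.

```python
from typing import Any, Dict, Iterable, List, Sequence


class DatasetBuildError(ValueError):
    """Stand-in for cfgrib's error of the same name (illustrative only)."""


def read_unique_keys(
    messages: Iterable[Dict[str, Any]],
    unique_keys: Sequence[str],
    ignore_keys: Sequence[str] = (),
) -> Dict[str, Any]:
    """Collect one value per unique key across all messages.

    Keys listed in ignore_keys are read as None, so differing values
    never conflict. Any other key with several distinct values raises.
    """
    seen: Dict[str, List[Any]] = {key: [] for key in unique_keys}
    for message in messages:
        for key in unique_keys:
            # Proposed behaviour: an ignored key is always read as None.
            value = None if key in ignore_keys else message.get(key)
            if value not in seen[key]:
                seen[key].append(value)

    for key, values in seen.items():
        if len(values) > 1:
            hints = [f"filter_by_keys={{{key!r}: {value!r}}}" for value in values]
            hints.append(f"ignore_keys=[{key!r}]")
            raise DatasetBuildError(
                "multiple values for unique key, try re-open the file with one of:\n    "
                + "\n    ".join(hints)
            )
    return {key: (values[0] if values else None) for key, values in seen.items()}


# Two hypothetical messages that differ only in dataType.
messages = [
    {"dataType": "pf", "shortName": "2t"},
    {"dataType": "cf", "shortName": "2t"},
]
merged = read_unique_keys(messages, ["dataType", "shortName"], ignore_keys=["dataType"])
```

With `ignore_keys=["dataType"]` the two messages merge and `dataType` is reported as None. Without it, the sketch raises the error shown above, including the extra `ignore_keys` hint.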

This is related to https://github.com/ecmwf/climetlab/issues/33, and possibly to other reports such as https://github.com/ecmwf/cfgrib/issues/263, https://github.com/ecmwf/cfgrib/issues/268 and https://gis.stackexchange.com/questions/372729/unable-to-read-grib-datas-with-xarray.

EddyCMWF commented 5 months ago

Hi @floriankrb, I could also do with this feature, so I've linked a branch I have been working on. If you still have an active use case for this, maybe you could give it a try? https://github.com/ecmwf/cfgrib/tree/ignore_keys

floriankrb commented 5 months ago

@sandorkertesz would ignore_keys still be needed in earthkit.data?