ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes
Apache License 2.0

Reading some variables in GFS files (with heightAboveGround) #263

Closed matteodefelice closed 1 year ago

matteodefelice commented 3 years ago

Perhaps this is linked to https://github.com/ecmwf/cfgrib/issues/75 but I am using the latest version of cfgrib and I cannot read some variables in a GFS file. I am trying to access this: https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.20211014/18/atmos/gfs.t18z.pgrb2.0p25.f027

If I try to read it:

cfgrib.open_dataset('gfs.t18z.pgrb2.0p25.f027')
---------------------------------------------------------------------------
DatasetBuildError                         Traceback (most recent call last)
~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in build_dataset_components(index, errors, encode_cf, squeeze, log, read_keys, time_dims, extra_coords)
    640                 time_dims=time_dims,
--> 641                 extra_coords=extra_coords,
    642             )

~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in build_variable_components(index, encode_cf, filter_by_keys, log, errors, squeeze, read_keys, time_dims, extra_coords)
    471 ) -> T.Tuple[T.Dict[str, int], Variable, T.Dict[str, Variable]]:
--> 472     data_var_attrs = enforce_unique_attributes(index, DATA_ATTRIBUTES_KEYS, filter_by_keys)
    473     grid_type_keys = GRID_TYPE_MAP.get(index.getone("gridType"), [])

~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in enforce_unique_attributes(index, attributes_keys, filter_by_keys)
    272                 fbks.append(fbk)
--> 273             raise DatasetBuildError("multiple values for key %r" % key, key, fbks)
    274         if values and values[0] not in ("undef", "unknown"):

DatasetBuildError: multiple values for key 'typeOfLevel'

During handling of the above exception, another exception occurred:

DatasetBuildError                         Traceback (most recent call last)
<ipython-input-12-8e07db70dea6> in <module>
----> 1 cfgrib.open_dataset('gfs.t18z.pgrb2.0p25.f027')

~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/xarray_store.py in open_dataset(path, **kwargs)
     36         raise ValueError("only engine=='cfgrib' is supported")
     37     kwargs["engine"] = "cfgrib"
---> 38     return xr.open_dataset(path, **kwargs)  # type: ignore
     39
     40

~/miniconda3/envs/pydev/lib/python3.7/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, autoclose, concat_characters, decode_coords, engine, chunks, lock, cache, drop_variables, backend_kwargs, use_cftime, decode_timedelta)
    570
    571         opener = _get_backend_cls(engine)
--> 572         store = opener(filename_or_obj, **extra_kwargs, **backend_kwargs)
    573
    574     with close_on_error(store):

~/miniconda3/envs/pydev/lib/python3.7/site-packages/xarray/backends/cfgrib_.py in __init__(self, filename, lock, **backend_kwargs)
     43             lock = ECCODES_LOCK
     44         self.lock = ensure_lock(lock)
---> 45         self.ds = cfgrib.open_file(filename, **backend_kwargs)
     46
     47     def open_store_variable(self, name, var):

~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in open_file(path, grib_errors, indexpath, filter_by_keys, read_keys, time_dims, extra_coords, **kwargs)
    718     return Dataset(
    719         *build_dataset_components(
--> 720             index, read_keys=read_keys, time_dims=time_dims, extra_coords=extra_coords, **kwargs
    721         )
    722     )

~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in build_dataset_components(index, errors, encode_cf, squeeze, log, read_keys, time_dims, extra_coords)
    652                 fbks.append(fbk)
    653                 error_message += "\n    filter_by_keys=%r" % fbk
--> 654             raise DatasetBuildError(error_message, key, fbks)
    655         short_name = data_var.attributes.get("GRIB_shortName", "paramId_%d" % param_id)
    656         var_name = data_var.attributes.get("GRIB_cfVarName", "unknown")

DatasetBuildError: multiple values for unique key, try re-open the file with one of:
    filter_by_keys={'typeOfLevel': 'meanSea'}
    filter_by_keys={'typeOfLevel': 'hybrid'}
    filter_by_keys={'typeOfLevel': 'atmosphere'}
    filter_by_keys={'typeOfLevel': 'surface'}
    filter_by_keys={'typeOfLevel': 'unknown'}
    filter_by_keys={'typeOfLevel': 'isobaricInPa'}
    filter_by_keys={'typeOfLevel': 'isobaricInhPa'}
    filter_by_keys={'typeOfLevel': 'heightAboveGround'}
    filter_by_keys={'typeOfLevel': 'depthBelowLandLayer'}
    filter_by_keys={'typeOfLevel': 'heightAboveSea'}
    filter_by_keys={'typeOfLevel': 'nominalTop'}
    filter_by_keys={'typeOfLevel': 'heightAboveGroundLayer'}
    filter_by_keys={'typeOfLevel': 'tropopause'}
    filter_by_keys={'typeOfLevel': 'maxWind'}
    filter_by_keys={'typeOfLevel': 'isothermZero'}
    filter_by_keys={'typeOfLevel': 'pressureFromGroundLayer'}
    filter_by_keys={'typeOfLevel': 'sigmaLayer'}
    filter_by_keys={'typeOfLevel': 'sigma'}
    filter_by_keys={'typeOfLevel': 'potentialVorticity'}

And if I try:

d = xr.open_dataset('gfs.t18z.pgrb2.0p25.f027', decode_cf=True, engine='cfgrib', backend_kwargs={'filter_by_keys': {'typeOfLevel': 'heightAboveGround'}})

I get this error:

skipping variable: paramId==167 shortName='t2m'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==174096 shortName='sh2'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==168 shortName='d2m'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==260242 shortName='r2'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==260255 shortName='aptmp'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==3015 shortName='tmax'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==3016 shortName='tmin'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==165 shortName='u10'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=10.0)
skipping variable: paramId==166 shortName='v10'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=10.0)
skipping variable: paramId==131 shortName='u'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=('heightAboveGround',), data=array([20., 30., 40., 50., 80.]))
skipping variable: paramId==132 shortName='v'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=('heightAboveGround',), data=array([20., 30., 40., 50., 80.]))
skipping variable: paramId==130 shortName='t'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=('heightAboveGround',), data=array([ 80., 100.]))
skipping variable: paramId==133 shortName='q'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=80.0)
skipping variable: paramId==54 shortName='pres'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=80.0)
skipping variable: paramId==228246 shortName='u100'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=100.0)
skipping variable: paramId==228247 shortName='v100'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=100.0)

iainrussell commented 3 years ago

Hi @matteodefelice,

I've downloaded a similar GRIB file and can see what the problem is. With a grib_ls, we can see the messages that are being filtered:

grib_ls -ptypeOfLevel,level,shortName ./gfs.t18z.pgrb2.0p25.f027 | grep heightAboveGround
heightAboveGround  4000         refd        
heightAboveGround  1000         refd        
heightAboveGround  2            2t          
heightAboveGround  2            2sh         
heightAboveGround  2            2d          
heightAboveGround  2            2r          
heightAboveGround  2            aptmp       
heightAboveGround  2            tmax        
heightAboveGround  2            tmin        
heightAboveGround  10           10u         
heightAboveGround  10           10v         
heightAboveGroundLayer  3000         hlcy        
heightAboveGroundLayer  6000         ustm        
heightAboveGroundLayer  6000         vstm        
heightAboveGround  20           u           
heightAboveGround  20           v           
heightAboveGround  30           u           
heightAboveGround  30           v           
heightAboveGround  40           u           
heightAboveGround  40           v           
heightAboveGround  50           u           
heightAboveGround  50           v           
heightAboveGround  80           t           
heightAboveGround  80           q           
heightAboveGround  80           pres        
heightAboveGround  80           u           
heightAboveGround  80           v           
heightAboveGround  100          t           
heightAboveGround  100          100u        
heightAboveGround  100          100v  

Now, cfgrib cannot put variables that have different values for the same coordinate into one dataset (see also #13). In this case, it first reads variable 'refd' and takes the heightAboveGround coordinate to be 1000 and 4000. Then it hits variable '2t' with a level of 2, which conflicts with that coordinate, so the variable is skipped. See the note in the readme: https://github.com/ecmwf/cfgrib#filter-heterogeneous-grib-files

So you will need to also filter by variable to get only those that have the same levels. Before you ask, I'm not sure how to handle the awkward case of u/u10/u100 having different names and paramIds!
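To see which variables could share a dataset, one option is to group them by the exact set of levels they appear on. The sketch below is only an illustration: the (level, shortName) pairs are copied by hand from the grib_ls listing above, and group_by_levels, filter_for and open_variable are hypothetical helper names, not part of cfgrib. The xarray call uses the same backend_kwargs/filter_by_keys form that appears earlier in this thread.

```python
from collections import defaultdict

# (level, shortName) pairs copied from the grib_ls output above,
# keeping only the messages with typeOfLevel == "heightAboveGround".
MESSAGES = [
    (4000, "refd"), (1000, "refd"),
    (2, "2t"), (2, "2sh"), (2, "2d"), (2, "2r"),
    (2, "aptmp"), (2, "tmax"), (2, "tmin"),
    (10, "10u"), (10, "10v"),
    (20, "u"), (20, "v"), (30, "u"), (30, "v"),
    (40, "u"), (40, "v"), (50, "u"), (50, "v"),
    (80, "t"), (80, "q"), (80, "pres"), (80, "u"), (80, "v"),
    (100, "t"), (100, "100u"), (100, "100v"),
]


def group_by_levels(messages):
    """Map each distinct tuple of levels to the shortNames that use exactly those levels."""
    levels_per_var = defaultdict(set)
    for level, short_name in messages:
        levels_per_var[short_name].add(level)

    groups = defaultdict(list)
    for short_name, levels in levels_per_var.items():
        groups[tuple(sorted(levels))].append(short_name)
    return dict(groups)


def filter_for(short_name, type_of_level="heightAboveGround"):
    """Build a filter_by_keys dict that selects a single variable (hypothetical helper)."""
    return {"typeOfLevel": type_of_level, "shortName": short_name}


def open_variable(path, short_name):
    """Open one variable on its own, so its level coordinate cannot conflict with others."""
    import xarray as xr  # imported lazily so the grouping above runs without xarray

    return xr.open_dataset(
        path,
        engine="cfgrib",
        backend_kwargs={"filter_by_keys": filter_for(short_name)},
    )


if __name__ == "__main__":
    for levels, names in group_by_levels(MESSAGES).items():
        print(levels, names)
```

Variables that land in the same group have identical level coordinates, which is the condition described above; the groups for 'u'/'v', 't' and 'q'/'pres' match the coordinate arrays reported in the error messages.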

I do hope that this helps though.

Best regards, Iain

matteodefelice commented 3 years ago

Thanks a lot. I actually solved the issue by doing some pre-processing with wgrib2, but I would have loved to use only Python.

iainrussell commented 3 years ago

Ok, no problem. Metview's Python interface can also be used for pre-processing, and its interface for GRIB handling is very good, but it requires some binaries to be installed (usually through conda). We are, however, discussing some equivalent pure Python features that could help in this sort of situation.

I'll close this issue now though.

aguerrero217 commented 1 year ago

Another option is to filter by level. The problem is that the file contains too many different levels, so you can select the data you need with a filter. Here is an example:

For the 2 m height, in Python 2 it looks like this:

data_2maboveground = xarray.open_dataset(local_filename, engine="cfgrib", filter_by_keys={'typeOfLevel': 'heightAboveGround', 'level': 2})

For the 2 m height, in Python 3 it looks like this:

data_2maboveground = xarray.open_dataset(local_filename, engine="cfgrib", backend_kwargs={'filter_by_keys': {'typeOfLevel': 'heightAboveGround', 'level': 2}})

P.S. I added 'typeOfLevel': 'heightAboveGround' because I have more data in the GRIB file.
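The same per-level filter can be repeated for several heights. This is a minimal sketch that assumes the backend_kwargs form shown above works in your environment; level_filter and open_levels are hypothetical helper names introduced here for illustration, and the path and level values in the usage comment are taken from earlier in this thread.

```python
def level_filter(level, type_of_level="heightAboveGround"):
    """Build the backend_kwargs dict for one level (hypothetical helper)."""
    return {"filter_by_keys": {"typeOfLevel": type_of_level, "level": level}}


def open_levels(path, levels):
    """Open one dataset per requested level and return them keyed by level."""
    import xarray as xr  # imported lazily; requires xarray and cfgrib to be installed

    return {
        level: xr.open_dataset(path, engine="cfgrib", backend_kwargs=level_filter(level))
        for level in levels
    }


# Example usage (not executed here, since it needs the GRIB file on disk):
# datasets = open_levels("gfs.t18z.pgrb2.0p25.f027", [2, 10, 100])
# data_2maboveground = datasets[2]
```

Each dataset then holds a single scalar heightAboveGround value, so the coordinate conflict described earlier in the thread should not arise within it.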