meteofrance / meteonet

MeteoNet's toolbox and documentation

Error when reading GRIB files from Météo France #20

Closed claustres closed 3 years ago

claustres commented 3 years ago

We started working with MeteoNet data for weather forecasting, then wanted to buy more archived data from Météo France to cover our use cases. We now get errors when reading those files the way your documentation describes:

import xarray as xr

# Raw string so the backslashes in the Windows path are not read as escape sequences
data = xr.open_dataset(r'E:\Download\T47648_AROME0025.2017010100.grb', engine='cfgrib', backend_kwargs={"indexpath": ""})
data

Here are the errors:

skipping variable: paramId==228228 shortName='tp'
Traceback (most recent call last):
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 653, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 584, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='step' value=Variable(dimensions=('step',), data=array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25.,
       26., 27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38.,
       39., 40., 41., 42.])) new_value=Variable(dimensions=('step',), data=array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
       27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
       40., 41., 42.]))
skipping variable: paramId==228164 shortName='tcc'
Traceback (most recent call last):
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 653, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 584, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='step' value=Variable(dimensions=('step',), data=array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25.,
       26., 27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38.,
       39., 40., 41., 42.])) new_value=Variable(dimensions=('step',), data=array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
       27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
       40., 41., 42.]))
skipping variable: paramId==3099 shortName='p3099'
Traceback (most recent call last):
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 653, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 584, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='step' value=Variable(dimensions=('step',), data=array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25.,
       26., 27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38.,
       39., 40., 41., 42.])) new_value=Variable(dimensions=('step',), data=array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
       27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
       40., 41., 42.]))
skipping variable: paramId==167 shortName='t2m'
Traceback (most recent call last):
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 653, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 584, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=10.0) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==157 shortName='r'
Traceback (most recent call last):
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 653, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 584, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=10.0) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==0 shortName='unknown'
Traceback (most recent call last):
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 653, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "D:\Applications\Anaconda3\envs\tf-gpu\lib\site-packages\cfgrib\dataset.py", line 584, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='step' value=Variable(dimensions=('step',), data=array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25.,
       26., 27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38.,
       39., 40., 41., 42.])) new_value=Variable(dimensions=('step',), data=array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
       27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
       40., 41., 42.]))

We tried this workaround without success. Any help on this is welcome, thanks.

claustres commented 3 years ago

It seems it's possible to read each variable separately like this without errors:

# Raw strings keep the backslashes in the Windows path literal
meteodata = xr.open_dataset(r'E:\Download\T47648_AROME0025.2017010100.grb', engine='cfgrib',
                            backend_kwargs={'filter_by_keys': {'cfVarName': 'tp'}})
meteodata = xr.open_dataset(r'E:\Download\T47648_AROME0025.2017010100.grb', engine='cfgrib',
                            backend_kwargs={'filter_by_keys': {'cfVarName': 'u10'}})
...
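
The per-variable reads above can be wrapped in a loop. This is only a sketch: `build_filters` and `open_per_variable` are hypothetical helper names, and the variable list is assumed to be known in advance. The `indexpath` setting is copied from the first snippet in this thread.

```python
def build_filters(var_names):
    """Return one cfgrib backend_kwargs dict per variable name."""
    return [
        {"filter_by_keys": {"cfVarName": name}, "indexpath": ""}
        for name in var_names
    ]


def open_per_variable(path, var_names):
    """Open each variable as its own dataset (needs xarray and cfgrib installed)."""
    import xarray as xr  # imported lazily so build_filters works without it

    datasets = {}
    for name, kwargs in zip(var_names, build_filters(var_names)):
        datasets[name] = xr.open_dataset(path, engine="cfgrib", backend_kwargs=kwargs)
    return datasets


# Example call (not run here):
# datasets = open_per_variable(r"E:\Download\T47648_AROME0025.2017010100.grb", ["tp", "u10"])
```

Keeping one dataset per variable avoids merging coordinates that differ between variables, at the cost of scanning the file once per variable.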
claustres commented 3 years ago

To get the list of variable names you can use the grib_dump tool.
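
As an alternative to grib_dump, the names can probably be collected from Python with `cfgrib.open_datasets`, which splits a heterogeneous file into several compatible datasets. This is a sketch under that assumption; `list_variables` and `open_all_groups` are hypothetical helper names, and it has not been tried on the AROME file discussed here.

```python
def list_variables(datasets):
    """Collect the distinct data variable names across several datasets, in order."""
    names = []
    for ds in datasets:
        for name in ds.data_vars:
            name = str(name)
            if name not in names:
                names.append(name)
    return names


def open_all_groups(path):
    """Let cfgrib split the file into compatible datasets (needs cfgrib installed)."""
    import cfgrib  # imported lazily so list_variables works without it

    return cfgrib.open_datasets(path)


# Example call (not run here):
# print(list_variables(open_all_groups(r"E:\Download\T47648_AROME0025.2017010100.grb")))
```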

larvorg commented 3 years ago

Hello, thank you for the information. Does this GRIB file come from MeteoNet, or from the Météo France public data website? Their data can have a different structure from ours in MeteoNet, so the way to read it can differ.

claustres commented 3 years ago

These files come from Météo France when you buy weather forecast data (AROME 0.025° in my case).

larvorg commented 3 years ago

OK. Could you send a sample file as an attachment? Maybe the coordinate names differ between parameters. For example, if the vertical levels are in meters, the vertical level name is 'heightAboveGround'; if they are isobaric levels (in hPa), it is 'isobaricInhPa', and so on. It is the same in MeteoNet, but there the files are separated.
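
If the level type is what differs, one option is to filter on it when opening the file. The sketch below assumes the `typeOfLevel` and `level` keys are accepted by cfgrib's `filter_by_keys`; `level_filter` and `open_level` are hypothetical helper names. The 2 m and 10 m values come from the 'heightAboveGround' conflict in the traceback above.

```python
def level_filter(type_of_level, level=None):
    """Build a filter_by_keys dict selecting one level type and, optionally, one level."""
    keys = {"typeOfLevel": type_of_level}
    if level is not None:
        keys["level"] = level
    return keys


def open_level(path, type_of_level, level=None):
    """Open only the messages on the given level (needs xarray and cfgrib installed)."""
    import xarray as xr  # imported lazily so level_filter works without it

    return xr.open_dataset(
        path,
        engine="cfgrib",
        backend_kwargs={"filter_by_keys": level_filter(type_of_level, level)},
    )


# Example calls (not run here): fields at 2 m and at 10 m are opened separately
# ds_2m = open_level(path, "heightAboveGround", 2)
# ds_10m = open_level(path, "heightAboveGround", 10)
```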

claustres commented 3 years ago

The file is big (~300 MB), so I will send you a download link in a private message on your Slack.

benjamingorman commented 3 weeks ago

Almost exactly the same error crops up when using GRIB files from ECMWF's open dataset, available at https://data.ecmwf.int/forecasts/20241014/00z/ifs/0p25/oper/ (a 15-day short-term weather forecast).

I understand, as @claustres said, that reading variables one at a time is an option to suppress the warning, but I don't quite understand why it's happening in the first place.
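
Judging from the traceback in the first comment, the likely cause is that all variables in one dataset must share the same coordinates. Most variables carry steps 0 to 42, while 'tp' carries steps 1 to 42, so the 'step' coordinate cannot be merged and the variable is skipped. The same happens with 'heightAboveGround' (10.0 versus 2.0). The sketch below is a simplified imitation of that check, not cfgrib's actual source; `BuildError` and `merge_coords` are hypothetical names.

```python
class BuildError(ValueError):
    """Stand-in for the DatasetBuildError seen in the traceback."""


def merge_coords(master, new):
    """Add new coordinates to master, refusing any key whose value differs."""
    for key, value in new.items():
        if key not in master:
            master[key] = value
        elif master[key] != value:
            raise BuildError(
                f"key present and new value is different: key={key!r} "
                f"value={master[key]!r} new_value={value!r}"
            )


coords = {}
# First variable read: steps 0..42 at 10 m above ground
merge_coords(coords, {"step": list(range(0, 43)), "heightAboveGround": 10.0})

conflicts = []
for new in ({"step": list(range(1, 43))}, {"heightAboveGround": 2.0}):
    try:
        merge_coords(coords, new)  # like 'tp' (steps 1..42) or 't2m' (2 m)
    except BuildError as err:
        conflicts.append(str(err))  # the variable would be skipped here
```

Reading one variable, or one level, at a time works because each dataset then only has to be consistent with itself.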