Unable to retrieve all variables with cfgrib

ChristopheLRTE commented 2 years ago

Hello,

I'm trying to handle a Grib2 file from which I can extract all the data using directly the command '_grib_getdata -s stepUnits=0 out.grb'.

So it seems I don't need all stepUnits.

In the other hand, when using cfgrib it is impossible to read it fully. Here is the command I tried first :

import cfgrib
cfgrib.open_datasets('out.grb')

And here is the result :

<ipython-input-36-ccea25a1a3be> in <module>
      1 import cfgrib
----> 2 cfgrib.open_datasets('out.grb')

~/.local/lib/python3.6/site-packages/cfgrib/xarray_store.py in open_datasets(path, backend_kwargs, **kwargs)
    100     backend_kwargs = backend_kwargs.copy()
    101     backend_kwargs["squeeze"] = False
--> 102     datasets = open_variable_datasets(path, backend_kwargs=backend_kwargs, **kwargs)
    103 
    104     type_of_level_datasets = {}  # type: T.Dict[str, T.List[xr.Dataset]]

~/.local/lib/python3.6/site-packages/cfgrib/xarray_store.py in open_variable_datasets(path, backend_kwargs, **kwargs)
     88         bk["filter_by_keys"] = backend_kwargs.get("filter_by_keys", {}).copy()
     89         bk["filter_by_keys"]["paramId"] = param_id
---> 90         datasets.extend(raw_open_datasets(path, bk, **kwargs))
     91     return datasets
     92 

~/.local/lib/python3.6/site-packages/cfgrib/xarray_store.py in raw_open_datasets(path, backend_kwargs, **kwargs)
     63     datasets = []
     64     try:
---> 65         datasets.append(open_dataset(path, backend_kwargs=backend_kwargs, **kwargs))
     66     except DatasetBuildError as ex:
     67         fbks.extend(ex.args[2])

~/.local/lib/python3.6/site-packages/cfgrib/xarray_store.py in open_dataset(path, **kwargs)
     36         raise ValueError("only engine=='cfgrib' is supported")
     37     kwargs["engine"] = "cfgrib"
---> 38     return xr.open_dataset(path, **kwargs)  # type: ignore
     39 
     40 

~/.local/lib/python3.6/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, autoclose, concat_characters, decode_coords, engine, chunks, lock, cache, drop_variables, backend_kwargs, use_cftime, decode_timedelta)
    570 
    571         opener = _get_backend_cls(engine)
--> 572         store = opener(filename_or_obj, **extra_kwargs, **backend_kwargs)
    573 
    574     with close_on_error(store):

~/.local/lib/python3.6/site-packages/xarray/backends/cfgrib_.py in __init__(self, filename, lock, **backend_kwargs)
     43             lock = ECCODES_LOCK
     44         self.lock = ensure_lock(lock)
---> 45         self.ds = cfgrib.open_file(filename, **backend_kwargs)
     46 
     47     def open_store_variable(self, name, var):

~/.local/lib/python3.6/site-packages/cfgrib/dataset.py in open_file(path, grib_errors, indexpath, filter_by_keys, read_keys, time_dims, extra_coords, **kwargs)
    718     return Dataset(
    719         *build_dataset_components(
--> 720             index, read_keys=read_keys, time_dims=time_dims, extra_coords=extra_coords, **kwargs
    721         )
    722     )

~/.local/lib/python3.6/site-packages/cfgrib/dataset.py in build_dataset_components(index, errors, encode_cf, squeeze, log, read_keys, time_dims, extra_coords)
    639                 read_keys=read_keys,
    640                 time_dims=time_dims,
--> 641                 extra_coords=extra_coords,
    642             )
    643         except DatasetBuildError as ex:

~/.local/lib/python3.6/site-packages/cfgrib/dataset.py in build_variable_components(index, encode_cf, filter_by_keys, log, errors, squeeze, read_keys, time_dims, extra_coords)
    499         }
    500         attributes.update(COORD_ATTRS.get(coord_name, {}).copy())
--> 501         data = np.array(sorted(values, reverse=attributes.get("stored_direction") == "decreasing"))
    502         dimensions = (coord_name,)  # type: T.Tuple[str, ...]
    503         if squeeze and len(values) == 1:

TypeError: '<' not supported between instances of 'float' and 'str'

After investigating, I tried to filter like that :

cfgrib.open_datasets('out.grb', backend_kwargs={'filter_by_keys': {'stepUnits': 0}})

It works, BUT I do not retrieve all data. Then I tried with :

cfgrib.open_datasets('out.grb', backend_kwargs={'filter_by_keys': {'stepUnits': 1}})

And it fails with the same above error.

So my question is why I would need to get stepUnits=1 (which seems to generate en error) with cfgrib while with eccodes binary grib_get_data there is no problem ?

And how could I do to use cfgrib without making distinction with stepUnits please?

Thanks for your help, Christophe

iainrussell commented 2 years ago

Hello @ChristopheLRTE,

Although I don't have a mixed step type file here to test with, I think the problem is that stepType's 'native type' in ecCodes is string, not a number. And when cfgrib is performing its filtering, it is extracting GRIB keys in their native type and then comparing against your value of zero (see the error message about comparing str with float). My guess is that {'stepUnits': 'instant'} should work. You can check the strings via

grib_ls -p stepType <gribfile>

Good luck! Iain

ChristopheLRTE commented 2 years ago

Hello @iainrussell,

You point to the problem which I simply cannot solve in my opinion. It is precisely on messages for which the stepType is 'accum' (and not 'instant') that there is a problem :

$ grib_ls -p shortName,stepType,stepUnits -w shortName=ssrd out.grb
out.grb
shortName   stepType    stepUnits
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
ssrd        accum       1
24 of 101 messages in out.grb

24 of 101 total messages in 1 files

$ grib_ls -p stepRange,endStep -w shortName=ssrd out.grb
out.grb
stepRange   endStep
ECCODES ERROR   :  unable to convert endStep in stepUnits
ECCODES ERROR   :  unable to get endStep as long (Wrong units for step (step must be integer))
Wrong units for step (step must be integer) stepRange

BUT IF I force stepUnits to 0, it's ok and I get this :

[infoger@PF9SOODESS431 arrow]$ grib_ls -s stepUnits=0 -p stepRange,endStep -w shortName=ssrd out.grb
out.grb
stepRange   endStep
0-15        15
0-30        30
0-45        45
0-60        60
0-75        75
0-90        90
0-105       105
0-120       120
0-135       135
0-150       150
0-165       165
0-180       180
0-195       195
0-210       210
0-225       225
0-240       240
0-255       255
0-270       270
0-285       285
0-300       300
0-315       315
0-330       330
0-345       345
0-360       360
24 of 101 messages in out.grb

24 of 101 total messages in 1 files

I would like to translate this behavior with 'cfgrib' rather than with 'grib_ls', 'grib_get', 'grib_get_data', ...

I don't really know if I'm clear :-)

iainrussell commented 2 years ago

Hi @ChristopheLRTE ,

Would you be able to attach a small subset of this GRIB file (or the whole thing if it's not very large) please? I think we will need to have a closer look!

Cheers, Iain

ChristopheLRTE commented 2 years ago

@iainrussell,

Here it is (only two step from the original) : out.zip

Thanks for your help, Christophe

shahramn commented 2 years ago

By the way stepUnits = 0 is the same as "minutes". You can use "m" which is more readable E.g.

% grib_ls -j -s stepUnits=m -ntime out.grb
  {
    "dataDate": 20211124,
    "dataTime": "0000",
    "stepType": "accum",
    "stepUnits": "m",
    "startStep": 0,
    "endStep": 15,
    "stepRange": "0-15",
    "validityDate": 20211124,
    "validityTime": 15
  },
  {
    "dataDate": 20211124,
    "dataTime": "0000",
    "stepType": "accum",
    "stepUnits": "m",
    "startStep": 0,
    "endStep": 30,
    "stepRange": "0-30",
    "validityDate": 20211124,
    "validityTime": 30
  }

ChristopheLRTE commented 2 years ago

Thanks @shahramn, but I would like to use cfgrib instead of grib_ls or grib_get_data, etc.

Moreover, cfgrib doesn't support (as far as I know ...) forcing reading grib with stepUnits to 0 (_just filtering on it with filter_bykey,s but I do not retrieve all data)

iainrussell commented 2 years ago

For @shahramn (but not on the weekend!!) - it's true that there's something dodgy here, whether it's the GRIB file itself, or ecCodes. Download the GRIB file attached to this issue and see:

grib_ls -pstartStep out.grb 
out.grb
startStep   
0          
0          
2 of 2 messages in out.grb

grib_ls -pendStep out.grb 
out.grb
endStep     
ECCODES ERROR   :  unable to convert endStep in stepUnits
Wrong units for step (step must be integer) endStep

shahramn commented 2 years ago

For @iainrussell The startStep and endStep keys are worked out from the stepRange key. For the 2 messages here we have: stepRange = 0-15 (in minutes) stepRange = 0-30 (in minutes)

And stepRange is made up of: (startStep)-(endStep)

So if you try to evaluate startStep which is just 0, there is no problem because 0 in minutes and hours is the same. But when it comes to evaluating the endStep it fails to convert 15 mins and 30 mins to hours as an integer. The default value of stepUnits being hours and these keys are always integers (not doubles)

Also see here: https://confluence.ecmwf.int/display/UDOC/What+is+the+GRIB+key+stepUnits+-+ecCodes+GRIB+FAQ

I hope this helps.

iainrussell commented 2 years ago

Thanks Shahram, good observation. In this case, I'm not so sure that we can solve it in cfgrib without something additional in the API so that users can specify the units, or maybe we can do it with some extra intelligence when cfgrib grabs the step from the data. In either case, I don't see a way to do it today without a bit of development in cfgrib.

ChristopheLRTE commented 2 years ago

Hello everyone,

Inside my local messages.py cfgrib Python file, if I modified the from_filestream() method from FileIndex class like that, it works fine for me (it could be a parameter we could pass ?) :

@classmethod
def from_filestream(cls, filestream, index_keys):
    # type: (FileStream, T.Sequence[str]) -> FileIndex
    offsets = {}  # type: T.Dict[T.Tuple[T.Any, ...], T.List[OffsetType]]
    index_keys = list(index_keys)
    count_offsets = {}  # type: T.Dict[int, int]
    header_values_cache = {}  # type: T.Dict[T.Tuple[T.Any, type], T.Any]
    for offset_field, message in filestream.items():

        # MY UPDATE START
        message['stepUnits'] = 0
        # MY UPDATE END

        header_values = []
        for key in index_keys:
            try:
                value = message[key]
            except:
                value = "undef"
            if isinstance(value, (np.ndarray, list)):
                value = tuple(value)
            # NOTE: the following ensures that values of the same type that evaluate equal are
            #   exactly the same object. The optimisation is especially useful for strings and
            #   it also reduces the on-disk size of the index in a backward compatible way.
            value = header_values_cache.setdefault((value, type(value)), value)
            header_values.append(value)
        offsets.setdefault(tuple(header_values), []).append(offset_field)
    self = cls(filestream=filestream, index_keys=index_keys, offsets=list(offsets.items()))
    # record the index protocol version in the instance so it is dumped with pickle
    return self

Tell me what you think about it please.

Thank you again, Christophe

shahramn commented 2 years ago

Another solution documented here: https://gist.github.com/erget/dc9ee910a8125af5bf36873c038eb429#file-working-with-gribs-with-stepunits-hours-ipynb

Please give that a try

ChristopheLRTE commented 2 years ago

Marvellous ! thanks !

I had to change the code a bit, but it's great!

#!/usr/bin/bash

def_path=$(codes_info -d | cut -d : -f 2)
expression="s/'stepUnits.table' = defaultStepUnits/'stepUnits.table' = indicatorOfUnitOfTimeRange/g"
#                                 ^^^^^^^^^^^^^^^^
mkdir -p definitions/grib2
templates=$(grep "stepUnits.table" $def_path/grib2/*.def | cut -d':' -f1 | awk -F'/' '{print $NF}')
for template in $templates
do
    new_path=definitions/grib2/$template
    cp $def_path/grib2/$template $new_path
    sed -i -e "$expression" $new_path
done

export ECCODES_DEFINITION_PATH=$(pwd)/definitions:$(codes_info -d)

The result :

(base) python@HP2:~/arome_pi$ grib_ls out.grb
out.grb
edition      centre       date         dataType     gridType     typeOfLevel  level        stepRange    shortName    packingType
2            lfpw         20211124     fc           regular_ll   surface      0            0-15         ssrd         grid_second_order
2            lfpw         20211124     fc           regular_ll   surface      0            0-30         ssrd         grid_second_order
2 of 2 messages in out.grb

2 of 2 total messages in 1 files

I understood the spirit but I did not understand how the templates are chosen by the framework. Would you know ?

Thanks again !

ChristopheLRTE commented 2 years ago

How the templates are chosen by the framework. Would you know ?

Any idea @shahramn, @iainrussell ?

Thanks again, Christophe

iainrussell commented 2 years ago

Hi @ChristopheLRTE,

I just wanted to comment on your modification to the code in cfgrib - I was indeed thinking that this could be passed as a backend argument. The thing that will make it complicated is that we are just finalising a big refactoring of the code that makes the GRIB access more abstract, and in the new code it may not be as clear how to get this working nicely in call cases. So in any case, we will not make any changes to the code until the refactoring is done, because merging will be a nightmare! If your fix works for you (and indeed it's a very reasonable fix just to get that data working) then stick with that, or the solution @shahramn mentioned, whichever you prefer, and we hope to get a proper solution in place in due course.

Cheers, Iain

ChristopheLRTE commented 2 years ago

@iainrussell,

Thank you for your explanation, I understand perfectly your point of view. Do not break all your work 😄 ! Thank you both again for your help. This issue can be closed for me (just a little explanation about Templates functioning would be welcome 😄).

Have a good day! Christophe

shahramn commented 2 years ago

How the templates are chosen by the framework. Would you know ?

Any idea @shahramn, @iainrussell ?

Thanks again, Christophe

@ChristopheLRTE Basically the ecCodes library has a set of definition files (which are ASCII) that define the keys of the GRIB message and some rules. Depending on the content of the message (say a given octet) it loads a given file. So for example if we have a GRIB edition 2 message, the files in the directory "definitions/grib2" are loaded. Also for the various sections there are different files defining the keys for that section, so for GRIB2 section 4 (Product Definition Section) we load either the instantaneous or interval-based files. I am simplifying things a lot here to keep it short. Hope this helps

ecmwf / cfgrib

Unable to retrieve all variables with cfgrib #270