ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes
Apache License 2.0

Value error on CfMessage `step` key #369

Closed rabernat closed 9 months ago

rabernat commented 9 months ago

What happened?

I encountered the GRIB message attached to this issue in a WRF dataset. Parsing this message with cfgrib fails when accessing the step key, because the message's endStep key evaluates to the string 'unavailable'. This in turn triggers an error in Kerchunk.

This seems somewhat similar to #335.

What are the steps to reproduce the bug?

tmp.grib.zip

import eccodes
import cfgrib

with open('tmp.grib', mode='rb') as fp2:
    mid = eccodes.codes_new_from_file(fp2, eccodes.CODES_PRODUCT_GRIB)

m = cfgrib.cfmessage.CfMessage(mid)
m['step']

This produces the following stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 m['step']

File /srv/conda/envs/notebook/lib/python3.11/site-packages/cfgrib/messages.py:213, in ComputedKeysMessage.__getitem__(self, item)
    211 if item in self.computed_keys:
    212     getter, _ = self.computed_keys[item]
--> 213     return getter(self)
    214 else:
    215     return super(ComputedKeysMessage, self).__getitem__(item)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/cfgrib/cfmessage.py:97, in from_grib_step(message, step_key, step_unit_key)
     95     raise ValueError("unsupported stepUnit %r" % step_unit)
     96 assert isinstance(to_seconds, int)  # mypy misses this
---> 97 return int(message[step_key]) * to_seconds / 3600.0

ValueError: invalid literal for int() with base 10: 'unavailable'

Digging deeper, this is because m['endStep'] evaluates to the string 'unavailable' rather than an int, which is what cfgrib expects.
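To illustrate the failure mode, here is a minimal sketch of a defensive step conversion that returns None instead of raising when the step value is not an integer. It is not cfgrib's implementation: the helper name step_in_hours and the STEP_UNIT_TO_SECONDS table are hypothetical, the table covers only a subset of the GRIB2 time-unit codes, and the default key names are assumptions based on the traceback above. It works on any mapping, so a plain dict can stand in for a message.

```python
from typing import Any, Mapping, Optional

# Hypothetical lookup covering a subset of GRIB2 time-unit codes
# (0 = minute, 1 = hour, 2 = day, 10 = 3 hours, 11 = 6 hours,
# 12 = 12 hours, 13 = second). Not cfgrib's own table.
STEP_UNIT_TO_SECONDS = {0: 60, 1: 3600, 2: 86400, 10: 10800, 11: 21600, 12: 43200, 13: 1}


def step_in_hours(
    message: Mapping[str, Any],
    step_key: str = "endStep",
    step_unit_key: str = "stepUnits",
) -> Optional[float]:
    """Return the step in hours, or None if the step value is not an integer."""
    try:
        step = int(message[step_key])
    except (TypeError, ValueError):
        # Covers the 'unavailable' string reported in this issue.
        return None
    unit = message[step_unit_key]
    if unit not in STEP_UNIT_TO_SECONDS:
        raise ValueError("unsupported stepUnit %r" % unit)
    return step * STEP_UNIT_TO_SECONDS[unit] / 3600.0


# A plain dict stands in for the message here.
print(step_in_hours({"endStep": "unavailable", "stepUnits": 1}))
print(step_in_hours({"endStep": 6, "stepUnits": 1}))
```

Returning None only hides the symptom; whether a missing step should be tolerated or rejected is a design decision for the caller.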

Version

0.9.10.4

Platform (OS and architecture)

x86_64 GNU/Linux

Relevant log output

No response

Accompanying data

Attached above

Organisation

Earthmover PBC

shahramn commented 9 months ago

The Product Definition Section (section 4) is badly encoded. It has the key "n" set to zero. This key is at octet 42, and its description (according to WMO) is:

"number of time range specifications describing the time intervals used to calculate the statistically processed field"

This has to be at least 1.
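As a rough sketch of that check, the snippet below reads octet 42 straight from the raw bytes of a section 4. It assumes the section uses product definition template 4.8, which is the layout where octet 42 holds this count; other statistical templates place it elsewhere, so the function refuses them. The function name time_range_count is hypothetical, and the example bytes are synthetic, not taken from the attached file.

```python
def time_range_count(section4: bytes) -> int:
    """Return n (octet 42) from a GRIB2 section 4 encoded with template 4.8."""
    if len(section4) < 42:
        raise ValueError("section is too short to contain octet 42")
    if section4[4] != 4:  # octet 5 holds the section number
        raise ValueError("not a Product Definition Section")
    template = int.from_bytes(section4[7:9], "big")  # octets 8-9
    if template != 8:
        # Assumption: only template 4.8 is handled in this sketch.
        raise ValueError("unsupported product definition template %d" % template)
    return section4[41]  # octet 42, zero-based index 41


# Synthetic section 4: 58 octets, template 4.8, n left at zero.
section = bytearray(58)
section[0:4] = (58).to_bytes(4, "big")
section[4] = 4
section[7:9] = (8).to_bytes(2, "big")

n = time_range_count(bytes(section))
print("n =", n, "(valid)" if n >= 1 else "(invalid: must be at least 1)")
```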

If you try to decode this with "wgrib2", it also fails:

% wgrib2 tmp.grib 
pdt_len: bad stat_proc ranges = 0 set to to 1
** ERROR bad grib message: Statistical Processing bad n=0 **
1:0:d=2022041400:PRATE:surface::
*** FATAL ERROR (delayed): forecast time for tmp.grib

shahramn commented 9 months ago

And Panoply also complains: Cannot invoke ucar.nc2.CalendarPeriod.millisecs() because "from" is null

It cannot even load the message.

rabernat commented 9 months ago

Thanks for your reply. I will try to understand why WRF is producing this invalid message.