Closed kmpaul closed 6 months ago
@emfdavid ?
We added this in https://github.com/fsspec/kerchunk/pull/364. I see these errors when reading HRRR SUB Hourly grib2 files. Are you seeing other CFGRIB errors that are somehow recoverable/fixable in Kerchunk? This is somewhat unique in that we can repair it with care by taking the difference between the runtime and the valid time to get the step.
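As a rough illustration of that repair (a sketch only, not the code from the PR; the helper name `step_from_times` and the sample timestamps are made up for this example), the step is just the valid time minus the run time:

```python
from datetime import datetime, timedelta


def step_from_times(run_time: datetime, valid_time: datetime) -> timedelta:
    """Hypothetical helper: recover the forecast step as valid time minus run time."""
    step = valid_time - run_time
    if step < timedelta(0):
        # A valid time before the run time means the inputs are inconsistent.
        raise ValueError("valid_time precedes run_time")
    return step


# Made-up sub-hourly example: a 12:00 run that is valid at 12:45.
run_time = datetime(2023, 6, 1, 12, 0)
valid_time = datetime(2023, 6, 1, 12, 45)
step = step_from_times(run_time, valid_time)
print(step.total_seconds() / 3600.0)  # step in hours: 0.75
```

The "with care" part is that both timestamps have to come from the same message and be in the same time zone, otherwise the difference is meaningless.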
(Sorry for the delay. I am currently in UTC+2.)
@emfdavid: Thanks for the reply. Let me understand you completely. When you say, "I see these errors when reading HRRR SUB Hourly grib2 files," are you saying you see `ValueError`s or `eccodes.WrongStepUnitError`s?
I am also running with subhourly GRIB2 files, and I see `ValueError`s arising from `cfgrib` because the `step` attribute of the `CfMessage` has a `str` data type, which you expect with subhourly data. In the `m.get("step")` call, this produces a `ValueError` when `cfgrib` tries to compute `int(message[step_key])` in `cfgrib.cfmessage.from_grib_step()`. Full traceback below:
(Note, this is with Python 3.10, cfgrib 0.9.11.0, and eccodes 1.7.0.)
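The failure mode can be reproduced without cfgrib at all. In the sketch below, `parse_step` is a hypothetical stand-in for the `int(message[step_key])` conversion, and the value `"15m"` is an assumed example of a sub-hourly step string, not one copied from a real file:

```python
def parse_step(raw):
    """Hypothetical stand-in for the int(message[step_key]) conversion."""
    return int(raw)


# A whole-number step converts fine, even when it arrives as a string.
print(parse_step("1"))

# An assumed sub-hourly value with a unit suffix cannot be parsed as an int.
try:
    parse_step("15m")
except ValueError as err:
    print(f"ValueError: {err}")
```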
I never see `eccodes.WrongStepUnitError`s.
(Incidentally, I know that there is a PR currently open against `cfgrib` to fix this: https://github.com/ecmwf/cfgrib/pull/371, but I don't know when it will be merged and released. Note that this is not a bug in eccodes, but a feature of the newest version of eccodes that is not fully supported in cfgrib yet.)
Are you seeing other CFGRIB errors that are somehow recoverable/fixable in Kerchunk?
I only ever see the `ValueError`s, but they can be avoided in 2 ways:
1. Hack `cfgrib.cfmessage.from_grib_step()` to change the default value of the `step_key` parameter from `"endStep"` to `"endStep:int"` (and for completeness, do the same for the `step_unit_key` parameter and make the same changes to `cfgrib.cfmessage.to_grib_step()`).
2. Modify `scan_grib` to deal with the special edge case of `coord2 == "step"` and change it to `coord2 = "step:int"`.
The first option pushes the changes to cfgrib (which is coming in ecmwf/cfgrib#371, but who knows when), and the second option hacks `kerchunk` to deal with the issue temporarily while the upstream issues are resolved.
If you like, I can put together a PR that essentially deals with this edge case, something like:
    # Fragment from the coordinate loop in scan_grib.
    # Ask for "step" as an integer ("step:int") so cfgrib never has to parse a string step.
    coord2 = {"latitude": "latitudes", "longitude": "longitudes", "step": "step:int"}.get(coord, coord)
    try:
        x = m.get(coord2)
    except eccodes.WrongStepUnitError as e:
        logger.warning(
            "Ignoring coordinate '%s' for varname '%s', raises: eccodes.WrongStepUnitError(%s)",
            coord2,
            varName,
            e,
        )
        continue
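For anyone unfamiliar with the lookup idiom in the first line of that snippet: `dict.get(key, default)` returns the override when there is one and falls back to the coordinate name itself otherwise. A standalone sketch (the names `KEY_OVERRIDES` and `message_key` are made up here and are not part of kerchunk):

```python
# Coordinate names that must be read from the message under a different key.
KEY_OVERRIDES = {
    "latitude": "latitudes",
    "longitude": "longitudes",
    "step": "step:int",  # request the step as an integer
}


def message_key(coord: str) -> str:
    """Hypothetical helper: return the key to read for a coordinate name."""
    # Fall back to the coordinate name when no override is registered.
    return KEY_OVERRIDES.get(coord, coord)


for coord in ["latitude", "step", "time", "valid_time"]:
    print(coord, "->", message_key(coord))
```

Only the three listed names are rewritten; every other coordinate passes through unchanged.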
That is cool - I poked around for a while and gave up. This looks much better.
If I understand correctly, we can put your patch in place in kerchunk now, and it should be forward compatible with cfgrib after the problem is corrected there. Then we can leave your patch in kerchunk in place until most users are up to date with cfgrib and eccodes updates.
Sounds like a plan!
I just created PR #450.
See comments in #450 for updates.
TL;DR: This bug only comes up because of changes in the most recent versions of `eccodes` (2.34+).
In the `kerchunk.grib2.scan_grib()` function, there is a loop when processing the coordinate variables for a message, and there is a `try`/`except` block in this loop: https://github.com/fsspec/kerchunk/blob/a0c4f3b828d37f6d07995925b324595af68c4a19/kerchunk/grib2.py#L258-L267 that appears to be intended to skip coordinate variables that produce errors when retrieving the coordinate information with `cfgrib` (the `m` object is a `cfgrib.cfmessage.CfMessage` object). However, exceptions produced in this `try` block never get caught because `cfgrib` only ever raises `ValueError` or `AssertionError` exceptions (at least with the recent versions of `cfgrib`).
Should this `try` block be modified to catch `ValueError` exceptions instead?
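To make the question concrete, here is a minimal sketch of what catching `ValueError` in such a loop could look like. It is not the kerchunk code: `FakeMessage` and `read_coords` are hypothetical stand-ins, and the fake message simply raises `ValueError` for `"step"` to imitate the failure.

```python
import logging

logger = logging.getLogger("scan-sketch")


class FakeMessage:
    """Hypothetical stand-in for a CfMessage; raises ValueError for 'step'."""

    def __init__(self, values):
        self._values = values

    def get(self, key):
        if key == "step":
            # Imitate the failing int() conversion on a string step.
            raise ValueError("invalid literal for int() with base 10: '15m'")
        return self._values.get(key)


def read_coords(m, coords):
    """Collect coordinate values, skipping any that raise ValueError."""
    found = {}
    for coord in coords:
        try:
            found[coord] = m.get(coord)
        except ValueError as e:
            logger.warning("Ignoring coordinate '%s', raises: ValueError(%s)", coord, e)
            continue
    return found


m = FakeMessage({"time": 0, "level": 2})
print(read_coords(m, ["time", "step", "level"]))  # {'time': 0, 'level': 2}
```

With this shape the loop keeps going, but the `step` coordinate is silently dropped from the result, so whether skipping is acceptable depends on whether downstream code needs it.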