Open larsbarring opened 12 months ago
While I am not a GRIB expert, I investigated further by comparing with the file for the preceding day (attached below). This sheds some light on where the problem might be. grib_dump -O shows for the non-problematic file:
1-4 section5Length = 21
5 numberOfSection = 5
6-9 numberOfValues = 466641
10-11 dataRepresentationTemplateNumber = 0 [Grid point data - simple packing (grib2/tables/17/5.0.table) ]
12-15 referenceValue = 0
16-17 binaryScaleFactor = -31
18-19 decimalScaleFactor = 0
20 bitsPerValue = 24
21 typeOfOriginalFieldValues = 0 [Floating point (grib2/tables/17/5.1.table) ]
and for the problematic file:
1-4 section5Length = 21
5 numberOfSection = 5
6-9 numberOfValues = 466641
10-11 dataRepresentationTemplateNumber = 0 [Grid point data - simple packing (grib2/tables/17/5.0.table) ]
12-15 referenceValue = 0
16-17 binaryScaleFactor = -28
18-19 decimalScaleFactor = 0
20 bitsPerValue = 0
21 typeOfOriginalFieldValues = 0 [Floating point (grib2/tables/17/5.1.table) ]
As far as I understand, if zero bits are used to represent the data then section 7 will be empty, in which case the referenceValue represents the whole field (taking a non-zero decimalScaleFactor into account). This was commented by @greenlaw here.
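For readers unfamiliar with the packing, here is a minimal sketch of the GRIB2 simple-packing relation Y = (R + X * 2^E) / 10^D, where R is referenceValue, E is binaryScaleFactor, D is decimalScaleFactor and X are the packed integers. The function name `unpack_simple` and its argument names are hypothetical and are not part of iris-grib or eccodes; the sketch only illustrates why an empty section 7 reduces the field to R / 10^D.

```python
def unpack_simple(coded_values, reference_value, binary_scale_factor,
                  decimal_scale_factor, number_of_values):
    """Decode GRIB2 simple packing: Y = (R + X * 2**E) / 10**D.

    Hypothetical helper for illustration only.
    """
    scale = 2.0 ** binary_scale_factor
    divisor = 10.0 ** decimal_scale_factor
    if not coded_values:
        # bitsPerValue == 0: nothing is stored in section 7, so every X is
        # implicitly 0 and each grid point equals R / 10**D.
        return [reference_value / divisor] * number_of_values
    return [(reference_value + x * scale) / divisor for x in coded_values]


# A constant field with zero bits per value, as in the problematic file.
constant_field = unpack_simple([], 0.0, -28, 0, 4)
```

With a non-zero referenceValue the same call would return that constant instead of zeros, which is the case discussed further down the thread.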
We ended up monkey-patching this (and a couple of other special cases) by updating DataProxy.__getitem__ (originally defined in iris_grib/message.py) as follows. Use at your own risk.
import numpy as np
import numpy.ma as ma
from iris.exceptions import TranslationError


def __getitem__(self, keys):
    """
    GRIB message data accessor.

    This handles reconstruction of the data values from packed representations, including
    some special cases:
    1) reconstruction of a message's data when `codedValues` is missing entirely, and
    2) treating 9999 as missing data when `missingValueManagementUsed==1`.

    See:
    https://apps.ecmwf.int/codes/grib/format/grib2/regulations/
    https://apps.ecmwf.int/codes/grib/format/grib2/templates/5/
    """
    # NB. Currently assumes that the validity of this interpretation
    # is checked before this proxy is created.
    message = self.recreate_raw()
    sections = message.sections
    bitmap_section = sections[6]
    bitmap = self._bitmap(bitmap_section)

    if "codedValues" not in sections[7].keys():
        data_rep_template = sections[5]["dataRepresentationTemplateNumber"]
        if data_rep_template not in (0, 40, 41, 42, 50):
            raise TranslationError(
                f"Reconstruction of missing codedValues for dataRepresentationTemplateNumber {data_rep_template} is unsupported"
            )
        reference_value = sections[5]["referenceValue"]
        decimal_scale_factor = sections[5]["decimalScaleFactor"]
        data = np.ones(self.shape) * reference_value / (10**decimal_scale_factor)
    else:
        data = sections[7]["codedValues"]

    if bitmap is not None:
        # Note that bitmap and data are both 1D arrays at this point.
        if np.count_nonzero(bitmap) == data.shape[0]:
            # Only the non-masked values are included in codedValues.
            _data = np.empty(shape=bitmap.shape)
            _data[bitmap.astype(bool)] = data
            # `ma.masked_array` masks where input = 1, the opposite of
            # the behaviour specified by the GRIB spec.
            data = ma.masked_array(_data, mask=np.logical_not(bitmap), fill_value=np.nan)
        else:
            msg = "Shapes of data and bitmap do not match."
            raise TranslationError(msg)
    elif "missingValueManagementUsed" in sections[5].keys() and sections[5]["missingValueManagementUsed"] == 1:
        # This appears to be required for reading certain complex-packing fields.
        # Whether it is caused by a data encoding problem or a bug in eccodes is unclear.
        # The relevant eccodes decoder logic can be found here (note that although the module
        # is misnamed as '2order_packing', it is definitely the complex packing logic):
        # https://github.com/ecmwf/eccodes/blob/ac303936267ae99a9b3ae103e7d2db74674098e9/src/grib_accessor_class_data_g22order_packing.cc#L601
        data = ma.masked_array(data, mask=(data == 9999), fill_value=np.nan)

    data = data.reshape(self.shape)
    return data.__getitem__(keys)
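The bitmap step in the patch above can be easier to follow in isolation. The following is a dependency-free sketch of the same idea, not the iris-grib implementation: a GRIB bitmap has a 1 where a value is present, and codedValues holds only the present values, in order. The name `expand_with_bitmap` and the use of `None` for missing points are assumptions made for illustration (the patch uses a NumPy masked array instead).

```python
def expand_with_bitmap(bitmap, coded_values, missing=None):
    """Spread the present values over the full grid described by `bitmap`.

    Hypothetical helper: `bitmap` is a sequence of 0/1 flags (1 = value
    present), `coded_values` holds one entry per set bit.
    """
    if sum(1 for bit in bitmap if bit) != len(coded_values):
        raise ValueError("Shapes of data and bitmap do not match.")
    values = iter(coded_values)
    # Consume one coded value for each set bit; mark the rest as missing.
    return [next(values) if bit else missing for bit in bitmap]


full_grid = expand_with_bitmap([1, 0, 1, 1], [10.0, 20.0, 30.0])
```

Note that this is the inverse convention of a NumPy mask, where True marks a point as masked, which is why the patch passes `np.logical_not(bitmap)`.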
@larsbarring and @greenlaw This issue should now be resolved thanks to #362.
This has been included in the 0.19.0 release, which is now available on conda-forge and PyPI.
Care to try it out and confirm whether this fixes your problem?
If so, feel free to close this issue 👍
@bjlittle thanks for taking care of this! Here is my test:
$ ipython
Python 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.10.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import iris
In [2]: print(iris.__version__)
3.8.0.dev52
In [3]: import iris_grib
/home/a001257/mambaforge/envs/scitools/lib/python3.11/site-packages/gribapi/__init__.py:23: UserWarning: ecCodes 2.31.0 or higher is recommended. You are running version 2.29.0
warnings.warn(
In [4]: print(iris_grib.__version__)
0.19.dev0
In [5]: d1 = iris.load_cube("sd_an_1961092306.grb").data
In [6]: print(d1.min(), d1.max())
0.0 0.0040847319178283215
In [7]: d2 = iris.load_cube("sd_an_1961092406.grb").data
In [8]: print(d2.min(), d2.max())
0.0 0.0
That is, the all-zero file is read without problems, as expected. :+1: :+1: :+1:
However, when looking at the PR (#362) I see that if the bitsPerValue field is 0 then data is filled with zeros. This is of course right for my particular file (and probably most others). However, as @greenlaw notes, the GRIB documentation allows the same type of "packing" if all values are the same, be they zeros or something else. To be on the safe side you might want to change line 254 in message.py to implement the formula from the GRIB documentation, i.e. something similar to what @greenlaw did (above):
reference_value = sections[5]["referenceValue"]
decimal_scale_factor = sections[5]["decimalScaleFactor"]
data = np.ones(self.shape) * reference_value / (10**decimal_scale_factor)
@greenlaw Do you have an example GRIB2 file with a single message that demonstrates this behaviour?
I'm happy to add this extension as a patch i.e., 0.19.1
@bjlittle @larsbarring Sorry, it was a while ago that I wrote that code, and I can't seem to find a file that demonstrates the non-zero missing codedValues behavior. It's possible that eccodes handles this behind the scenes and the formula I included above is unnecessary. If I'm able to find one I will let you know.
No problem @greenlaw, thanks for getting back to let me know :beers:
I guess that files in which all values in a field are exactly the same non-zero value are not that easy to come by. Hence I have hacked the test file included in my first post:
import struct
import os

with open("sd_an_1961092406.grb", mode='rb') as file:
    fileContent = file.read()

# position in file of referenceValue -- derived using grib_dump -O
refPosition = 16 + 21 + 81 + 34 + 11

# change referenceValue to a "reasonable number",
# I did not bother what it actually meant or how it was packed into grib
ref = struct.pack("f", 1.9999)
new = fileContent[0:refPosition] + ref + fileContent[refPosition+4:]

with open("sd_an_1961092406__NEW.grb", mode='wb') as file:
    file.write(new)

print("\nCheck that the new value landed where is supposed to (actual value is garbled):")
os.system("grib_dump -O sd_an_1961092406__NEW.grb | grep referenceValue")
print("\n\n\nAnd print how eccodes sees the data:")
os.system("grib_dump sd_an_1961092406__NEW.grb | tail -n 36")

import iris
import iris_grib
print(iris.__version__)
print(iris_grib.__version__)
cube = iris.load_cube("sd_an_1961092406__NEW.grb")
print(f"\n\nCUBE min = {cube.data.min()}, and max = {cube.data.max()}")
results in this printout:
Check that the new value landed where is supposed to (actual value is garbled):
12-15 referenceValue = -0.000482554
And print how eccodes sees the data:
# A bit map applies to this product and is specified in this Section (grib2/tables/17/6.0.table)
bitMapIndicator = 0;
bitmapPresent = 1;
values(466641) = {
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554,
-0.000482554, -0.000482554, -0.000482554, -0.000482554, -0.000482554
... 466541 more values
}
#-READ ONLY- maximum = -0.000482554;
#-READ ONLY- minimum = -0.000482554;
#-READ ONLY- average = -0.000482554;
#-READ ONLY- standardDeviation = 0;
#-READ ONLY- skewness = 0;
#-READ ONLY- kurtosis = 0;
#-READ ONLY- isConstant = 1;
#-READ ONLY- numberOfMissing = 0;
#-READ ONLY- getNumberOfValues = 466641;
}
/home/a001257/mambaforge/envs/scitools/lib/python3.11/site-packages/gribapi/__init__.py:23: UserWarning: ecCodes 2.31.0 or higher is recommended. You are running version 2.29.0
warnings.warn(
3.8.0.dev52
0.19.dev0
CUBE min = 0.0, and max = 0.0
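The hard-coded offset `refPosition = 16 + 21 + 81 + 34 + 11` in the script above sums the lengths of the sections preceding section 5, plus the position of referenceValue inside it. As a sketch only, the same offset could be found by walking the section headers: in GRIB edition 2, section 0 is 16 octets long and each following section starts with a 4-octet big-endian length and a 1-octet section number. The function name `find_section_offset` is hypothetical, and the sketch assumes the buffer holds a single GRIB2 message starting at byte 0.

```python
import struct


def find_section_offset(message: bytes, wanted: int) -> int:
    """Return the byte offset of section `wanted` in one GRIB2 message.

    Hypothetical helper; assumes `message` starts at the "GRIB" marker.
    """
    if message[:4] != b"GRIB":
        raise ValueError("buffer does not start with a GRIB message")
    if message[7] != 2:
        raise ValueError("only GRIB edition 2 is handled in this sketch")
    pos = 16  # section 0 (indicator section) has a fixed length of 16 octets
    while message[pos:pos + 4] != b"7777":  # "7777" is the end section
        if pos + 5 > len(message):
            raise ValueError("ran off the end of the message")
        length, number = struct.unpack_from(">IB", message, pos)
        if number == wanted:
            return pos
        if length < 5:
            raise ValueError("implausible section length")
        pos += length
    raise ValueError(f"section {wanted} not found")
```

For the file in this thread, `find_section_offset(fileContent, 5) + 11` would be expected to match the hand-derived `refPosition`, since referenceValue occupies octets 12-15 of section 5.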
Thanks very much @larsbarring! Resource permitting, we now have all we need to work on this
@trexfeathers -- thanks
and if you change my code as follows, the numbers come out as one expects:
# put a "reasonable number", packed big-endian
ref = struct.pack(">f", 1.999)
# zero the two octets that follow referenceValue (binaryScaleFactor, octets 16-17)
bin = struct.pack("h", 0)
new = fileContent[0:refPosition] + ref + bin + fileContent[refPosition+6:]
where the grib_dump output now shows 1.999
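The garbled referenceValue in the first attempt is consistent with a byte-order mix-up: GRIB2 stores referenceValue as a big-endian IEEE 754 32-bit float, whereas `struct.pack("f", ...)` uses the machine's native order, which is little-endian on typical x86 hardware. A small sketch (variable names are illustrative only) that reproduces the effect without touching a GRIB file:

```python
import struct

value = 1.9999

# Explicit little-endian packing, i.e. what "f" yields on a little-endian host.
little = struct.pack("<f", value)
# Big-endian packing, the byte order used inside GRIB2 messages.
big = struct.pack(">f", value)

# Reading the little-endian bytes as big-endian gives the garbled number.
misread = struct.unpack(">f", little)[0]
# Reading the big-endian bytes as big-endian recovers the value
# (to 32-bit float precision).
roundtrip = struct.unpack(">f", big)[0]

print(f"misread   = {misread:.9f}")
print(f"roundtrip = {roundtrip:.4f}")
```

Here `misread` comes out at roughly -0.000482554, the same value grib_dump reported for the first hacked file.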
.... and now I happened to bounce into #265 ....
We have a file where all data is 0. This is encoded as a bitmap in section 6, while section 7 is empty. When reading this into iris we get the error as below. Eccodes is able to read this without problems so it appears that there is no problem with the file. It seems like this error is similar to what was reported in #131. The file is available at the end, although with the ending ".txt" appended to allow it to be uploaded.
sd_an_1961092406.grb.txt