jswhit / pygrib

Python interface for reading and writing GRIB data
https://jswhit.github.io/pygrib
MIT License
321 stars, 95 forks

Weird behavior when writing and reopening gribfiles #253

Open dasarkisov opened 1 week ago

dasarkisov commented 1 week ago

1) Why does pygrib write the value 273.15 as 273.1499939? Here is what I have:

arr = [[273.15 273.15 273.15 273.15 273.15]
       [273.15 273.15 273.15 273.15 273.15]
       [273.15 273.15 273.15 273.15 273.15]
       [273.15 273.15 273.15 273.15 273.15]
       [273.15 273.15 273.15 273.15 273.15]]

Here is what is written to a grib-message in the end:

grb['values'] = arr
print(grb['values'])
>> [[273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
    [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
    [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
    [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
    [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]]

This is such a pain later on.

2) After some computations, I have the following array and again write it to a grib-message:

arr2 = [[254.13378296 254.73378296 252.03378296 250.03378296 248.93378296]
        [253.93378296 254.23378296 252.23378296 249.83378296 249.33378296]
        [252.73378296 252.93378296 249.93378296 248.73378296 248.33378296]
        [252.13378296 252.83378296 251.03378296 248.13378296 247.53378296]
        [251.43378296 251.93378296 252.33378296 251.73378296 251.63378296]]
grb2['values'] = arr2

Then I write the grib-message to a grib-file, saving it as a new one, and then reopen the new grib-file only to behold the following values:

print(grb3['values'])
>>[[254.13376465 254.73376465 252.03376465 250.03376465 248.93376465]
   [253.93376465 254.23376465 252.23376465 249.83376465 249.33376465]
   [252.73376465 252.93376465 249.93376465 248.73376465 248.33376465]
   [252.13376465 252.83376465 251.03376465 248.13376465 247.53376465]
   [251.43376465 251.93376465 252.33376465 251.73376465 251.63376465]]

Why on Earth are the values different from those I wrote??

3) And now the most ridiculous thing. When I write an array like this:

[[273.15      273.15      268.7045166 265.8045166 264.8045166]
 [273.15      273.15      271.4045166 267.3045166 266.3045166]
 [273.15      273.15      268.3045166 266.0045166 265.0045166]
 [272.8045166 272.6045166 269.1045166 264.8045166 264.8045166]
 [272.5045166 272.8045166 273.15      273.15      273.15     ]]

then save the message, save the grib-file, and reopen it, here is what I get:

[[273.1045166 273.1045166 268.7045166 265.8045166 264.8045166]
 [273.1045166 273.1045166 271.4045166 267.3045166 266.3045166]
 [273.1045166 273.1045166 268.3045166 266.0045166 265.0045166]
 [272.8045166 272.6045166 269.1045166 264.8045166 264.8045166]
 [272.5045166 272.8045166 273.1045166 273.1045166 273.1045166]]

Value "273.15" against value "273.1045...". What the hell? I'm sure it's me missing something out...

4) What the hell is this??

print(arr)
>>[[273.15000001 273.15000001 273.15000001 273.15000001 273.15000001]
   [273.15000001 273.15000001 273.15000001 273.15000001 273.15000001]
   [273.15000001 273.15000001 273.15000001 273.15000001 273.15000001]
   [273.15000001 273.15000001 273.15000001 273.15000001 273.15000001]
   [273.15000001 273.15000001 273.15000001 273.15000001 273.15000001]]
grb['values'] = arr
print(grb['values'])
>>[[273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
   [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
   [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
   [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]
   [273.1499939 273.1499939 273.1499939 273.1499939 273.1499939]]

jswhit commented 1 week ago

grib2 compression is lossy, rewriting the same data multiple times will progressively lose information
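
The change from 273.15 to 273.1045166 in item 3 is consistent with how GRIB simple packing quantizes a field: each value is stored as an integer offset from a reference value (the field minimum), in steps set by a decimal and a binary scale factor. The sketch below is not pygrib or ecCodes code. `simple_pack_roundtrip` is a hypothetical helper, and the scale factors (decimal 1, binary 0, so steps of 0.1) are assumptions chosen because they reproduce the reported numbers. It also ignores the limited bit width of the packed integers.

```python
def simple_pack_roundtrip(values, decimal_scale=1, binary_scale=0):
    """Quantize values in the style of GRIB simple packing, then decode them.

    Encoding:  X = round((Y * 10**D - R) / 2**E)
    Decoding:  Y = (R + X * 2**E) / 10**D
    The scale factors here are assumed, not read from a real message.
    """
    scale10 = 10.0 ** decimal_scale
    scale2 = 2.0 ** binary_scale
    reference = min(values) * scale10  # R: scaled minimum of the field
    packed = [round((v * scale10 - reference) / scale2) for v in values]
    return [(reference + x * scale2) / scale10 for x in packed]

# Three values taken from the array in item 3 of the report.
field = [273.15, 268.7045166, 264.8045166]
decoded = simple_pack_roundtrip(field)
for original, restored in zip(field, decoded):
    print(f"{original:.7f} -> {restored:.7f}")
```

Under these assumed settings, 273.15 sits 83.45 steps above the minimum, is rounded to 83 steps, and decodes as 273.1045166. Values that already lie a whole number of 0.1 steps above the minimum come back unchanged, which matches the report.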

dasarkisov commented 1 week ago

grib2 compression is lossy, rewriting the same data multiple times will progressively lose information

I follow a procedure where I open a gribfile containing a temperature field, perform some calculations on the field, and write it back. Later I take the result and repeat the same procedure. I cannot afford to lose any information. What would you suggest for this workflow?

jswhit commented 6 days ago

You can turn off compression by setting the dataRepresentationTemplateNumber key to 4 and then re-writing the file (there's a grib_repack utility that does this)

dasarkisov commented 5 days ago

grib_repack

Is it a command-line program only? I wish I could do it in my main script. I tried to set the key this way, but it didn't work...

grbs = pygrib.open('data/gfsT_2024-06-23_00')
for grb in grbs:
    grb['dataRepresentationTemplateNumber'] = 4

print(grb[1]['dataRepresentationTemplateNumber'])
>> 0

jswhit commented 3 days ago

take a look at the source code. Note that this only works for GRIB2 files.

dasarkisov commented 3 days ago

take a look at the source code. Note that this only works for GRIB2 files.

Yep, I had already looked, thank you. At the bottom of the source code, there is just what I did above:

for grb in grbs:
    if grb['editionNumber'] != 2:
        sys.stdout.write('not a GRIB2 message, skipping ..\n')
        continue
    if grb['dataRepresentationTemplateNumber'] == ipack:
        sys.stdout.write('no repacking required, skipping ..\n')
    grb['dataRepresentationTemplateNumber'] = ipack 

But the 'dataRepresentationTemplateNumber' value (which should be 4 in my case) does not change. Should I ignore this and write the grib anyway?
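
For doing this from within a script, here is a minimal sketch along the lines of the grib_repack loop quoted above: set the key on each message, then write that same in-memory message to a new file with `tostring()`. The function name `repack_grib2` and its arguments are hypothetical. One possible reason the earlier check still printed 0 is that indexing the open file reads a fresh message from disk instead of returning the modified object. That is an assumption, not something confirmed in this thread.

```python
def repack_grib2(path_in, path_out, template_number=4):
    """Copy path_in to path_out, switching GRIB2 messages to another
    data representation template (4 is the value suggested in the thread).

    Returns the number of messages written. Hypothetical helper.
    """
    import pygrib  # imported here so the sketch loads without pygrib installed

    grbs = pygrib.open(path_in)
    n_written = 0
    with open(path_out, "wb") as f_out:
        for grb in grbs:
            # Only GRIB2 messages are repacked; others are copied unchanged.
            if grb["editionNumber"] == 2:
                grb["dataRepresentationTemplateNumber"] = template_number
                # Check the key on this same object, not on a re-read message.
                print(grb["dataRepresentationTemplateNumber"])
            f_out.write(grb.tostring())  # encoded bytes of the modified message
            n_written += 1
    grbs.close()
    return n_written
```

A call such as `repack_grib2('data/gfsT_2024-06-23_00', 'data/gfsT_repacked')` (the output name is made up) would leave the original file untouched. Reopening the new file and printing the key there is the check that shows whether the change was stored.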