ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes
Apache License 2.0

Handling of missing values #304

Closed huanglangwen closed 2 years ago

huanglangwen commented 2 years ago

When loading GRIB files that contain values around 9999, those values are masked to np.nan by cfgrib at the following line:

https://github.com/ecmwf/cfgrib/blob/0834b19d56ab2a817cc5a3309149806afdcf0ced/cfgrib/dataset.py#L358
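To illustrate the effect, here is a minimal sketch that assumes the masking is an equality comparison against the default sentinel of 9999. The array contents and the `MISSING` name are made up for illustration and are not taken from cfgrib:

```python
import numpy as np

# Assumption: the default ecCodes "missingValue" sentinel is 9999.
MISSING = 9999.0

# Hypothetical field with no missing points, but one genuine value of 9999.0.
values = np.array([9998.5, 9999.0, 10000.2])

# Sentinel-based masking cannot tell the genuine 9999.0 from a missing point.
masked = np.where(values == MISSING, np.nan, values)

print(masked)
```

The genuine 9999.0 is replaced by NaN even though nothing in the field was actually missing.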

Is there a way to set missing values manually to avoid this case?

Thanks a lot!

shahramn commented 2 years ago

In ecCodes you can set the key "missingValue" to a value other than the default 9999, so that it does not collide with actual field values. For example, 1.0e36.
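A rough sketch of what this might look like with the ecCodes Python bindings, reading directly with ecCodes rather than through cfgrib. The function name `read_values_with_missing` is hypothetical, and the sketch assumes the `eccodes` and `numpy` packages are installed. The key is set on the in-memory message handle only, and nothing is written back to the file:

```python
def read_values_with_missing(path, missing_value=1.0e36):
    """Yield the values of each GRIB message, with missing points set to NaN.

    Hypothetical helper for illustration; not part of cfgrib or ecCodes.
    """
    # Imported lazily so that defining the helper does not require ecCodes.
    import eccodes
    import numpy as np

    with open(path, "rb") as f:
        while True:
            gid = eccodes.codes_grib_new_from_file(f)
            if gid is None:  # end of file
                break
            try:
                # Move the sentinel far away from any plausible field value.
                eccodes.codes_set(gid, "missingValue", missing_value)
                values = eccodes.codes_get_values(gid)
                # Only messages that carry a bitmap can have missing points.
                if eccodes.codes_get(gid, "bitmapPresent"):
                    values = np.where(values == missing_value, np.nan, values)
                yield values
            finally:
                eccodes.codes_release(gid)
```

Because the bitmap check guards the masking, a message without a bitmap is returned unchanged, whatever values it contains.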

shahramn commented 2 years ago

Also see:

https://confluence.ecmwf.int/display/ECC/grib_set_bitmap#Python

https://confluence.ecmwf.int/display/ECC/grib_iterator#Python

huanglangwen commented 2 years ago

Thank you! But setting missingValue in the GRIB files carries a lot of overhead, as we have 100 GB+ of GRIB files and there is no way to set it in place. I'm wondering if cfgrib could provide an option to override the missingValue setting, like the -m option of grib_get_data does.

iainrussell commented 2 years ago

Thank you for this report - indeed, I think we need to make a change in the internals of cfgrib to use an 'out of range' missing value indicator. Stay tuned!
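As a sketch of the idea only: the sentinel below (the largest finite float32) is an assumed choice for illustration, and this thread does not state which value cfgrib actually uses. The helper name `mask_missing` is likewise hypothetical:

```python
import numpy as np

# Assumption: a sentinel no physical field is expected to reach.
OUT_OF_RANGE = float(np.finfo(np.float32).max)


def mask_missing(values, sentinel=OUT_OF_RANGE):
    """Replace points equal to the sentinel with NaN (hypothetical helper)."""
    return np.where(values == sentinel, np.nan, values)


# Hypothetical field: a genuine 9999.0 plus one truly missing point.
field = np.array([9998.5, 9999.0, OUT_OF_RANGE])
result = mask_missing(field)

print(result)
```

With an out-of-range sentinel, the genuine 9999.0 survives and only the missing point becomes NaN.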

iainrussell commented 2 years ago

Fixed - same resolution as for #313