ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes
Apache License 2.0
400 stars 77 forks source link

open_dataset crashes for coordinate grids with a single node #135

Closed fpavogt closed 3 years ago

fpavogt commented 4 years ago

Hello cfgrib gurus,

Experienced python user here, but complete novice to cfgrib and the GRIB format itself. I am trying to look at an ERA5 GRIB file in which the coordinate grid contains a unique latitude/longitude entry (i.e. looking at the ERA5 data for a single point on Earth).

Calling xr.open_dataset() gets me the following error:

~/anaconda3/lib/python3.7/site-packages/cfgrib/dataset.py in build_geography_coordinates(index, encode_cf, errors, log)
    366             dimensions=('latitude',), data=latitudes, attributes=COORD_ATTRS['latitude'].copy()
    367         )
--> 368         if latitudes[0] > latitudes[-1]:
    369             geo_coord_vars['latitude'].attributes['stored_direction'] = 'decreasing'
    370         geo_coord_vars['longitude'] = Variable(

IndexError: too many indices for array

I went in cfgrib/dataset.py to see if I might bypass this using an easy-but-far-from-perfect fix, doing:

if latitudes.ndim > 0: # <= this just by-passes the check
    if latitudes[0] > latitudes[-1]:
        geo_coord_vars['latitude'].attributes['stored_direction'] = 'decreasing'

However, this only displaces the problem further down the line, as I am then faced with the following error:

[...]
~/anaconda3/lib/python3.7/site-packages/xarray/backends/cfgrib_.py in open_store_variable(self, name, var)
     53         encoding["original_shape"] = var.data.shape
     54 
---> 55         return Variable(var.dimensions, data, var.attributes, encoding)
     56 
     57     def get_variables(self):

~/anaconda3/lib/python3.7/site-packages/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath)
    302         """
    303         self._data = as_compatible_data(data, fastpath=fastpath)
--> 304         self._dims = self._parse_dimensions(dims)
    305         self._attrs = None
    306         self._encoding = None

~/anaconda3/lib/python3.7/site-packages/xarray/core/variable.py in _parse_dimensions(self, dims)
    498             raise ValueError(
    499                 "dimensions %s must have the same length as the "
--> 500                 "number of data dimensions, ndim=%s" % (dims, self.ndim)
    501             )
    502         return dims

ValueError: dimensions ('latitude',) must have the same length as the number of data dimensions, ndim=0

To me, it looks like the "real" issue is that for coordinate grids with single nodes, the latitudes and longitudes values are stored in 0-dimensional arrays ?

Running Mac OS 10.13.6 with Python 3.7.3, cfgrib 0.9.8.1, and xarray 0.15.1 and pyeccodes (in place of the ecCodes library). For completeness:

$ python -m cfgrib selfcheck
Found: ecCodes v2.17.0.
Your system is ready.

Here is a demo ERA5 GRIB file that allows to reproduce the problem. ERA5_single-grid-point_test.grib.zip

fpavogt commented 4 years ago

For the record, the demo ERA5 GRIB file was downloaded using the following script:

import cdsapi

# Setup the client using my private key (can be obtained here:https://cds.climate.copernicus.eu/api-how-to)
cdsclient = cdsapi.Client(url="https://cds.climate.copernicus.eu/api/v2", 
                                         key="something_private")

cdsclient.retrieve('reanalysis-era5-pressure-levels',
    {'product_type': 'reanalysis', 'format': 'grib',
      'variable': [ 'temperature'],
      'pressure_level': ['1', '2', '3', '5', '7', '10', '20', '30', '50', '70', '100', '125', '150', '175', '200',
                                  '225', '250', '300', '350', '400', '450', '500', '550', '600', '650', '700', '750',
                                  '775', '800', '825', '850', '875', '900', '925', '950', '975', '1000'],
      'year': '2019', 'month': '10',
      'day': ['17'],
      'time': ['00:00', '01:00', '02:00', '03:00', '04:00', '05:00', '06:00', '07:00', '08:00',
                  '09:00', '10:00', '11:00', '12:00', '13:00', '14:00', '15:00', '16:00', '17:00',
                  '18:00', '19:00', '20:00', '21:00', '22:00', '23:00'],
        # Ask for an interpolation over a single location
        'area': [-24.62684, -70.40453 , -24.62684, -70.40453],
        'grid': [1, 1,]
    },
    'ERA5_single-grid-point_test.grib')
alexamici commented 3 years ago

@fpavogt I think @aurghs fixed this a while ago, I confirm that your sample file doesn't crash on master.

Hopefully a release is imminent!

iainrussell commented 3 years ago

Looks like this is what I fixed for #199.

alexamici commented 3 years ago

Woops, your are right, sorry! Credit where credit is due 😄

fpavogt commented 3 years ago

@alexamici: great ! Thanks a lot for the fix @iainrussell !