CCI-Tools / cate

ESA CCI Toolbox (Cate)
MIT License

Data download failed because "variable latitude not equal across datasets" #836

Open forman opened 5 years ago

forman commented 5 years ago

Expected behavior

Dataset downloads should produce consistent time series.

Actual behavior

Stacking the downloaded dataset files for AOD ENVISAT MERIS orbit frequency (see related #835) along the time dimension fails. It should always be possible, because their spatial grids are expected to be aligned.

Steps to reproduce the problem

Please follow steps as described in #835.

If the download does not time out, it ends with the operation error "variable latitude not equal across datasets".

Specifications

Cate 2.0.0.dev25

Traceback

Traceback (most recent call last):
  File "D:\Projects\PycharmProjects\cate\cate\util\web\jsonrpchandler.py", line 209, in send_service_method_result
    result = future.result()
  File "D:\Miniconda3\envs\cate-env\lib\concurrent\futures\_base.py", line 425, in result
    return self.__get_result()
  File "D:\Miniconda3\envs\cate-env\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "D:\Miniconda3\envs\cate-env\lib\concurrent\futures\thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "D:\Projects\PycharmProjects\cate\cate\util\web\jsonrpchandler.py", line 306, in call_service_method
    result = method(*method_params, monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\webapi\websocket.py", line 292, in set_workspace_resource
    monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\core\wsmanag.py", line 320, in set_workspace_resource
    workspace.execute_workflow(res_name=res_name, monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\core\workspace.py", line 662, in execute_workflow
    self.workflow.invoke_steps(steps, context=self._new_context(), monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\core\workflow.py", line 627, in invoke_steps
    steps[0].invoke(context=context, monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\core\workflow.py", line 318, in invoke
    self._invoke_impl(_new_context(context, step=self), monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\core\workflow.py", line 980, in _invoke_impl
    return_value = self._op(monitor=monitor, **input_values)
  File "D:\Projects\PycharmProjects\cate\cate\core\op.py", line 216, in __call__
    return_value = self._wrapped_op(**input_values)
  File "D:\Projects\PycharmProjects\cate\cate\ops\io.py", line 83, in open_dataset
    monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\core\ds.py", line 609, in open_dataset
    return data_source.open_dataset(time_range, region, var_names, monitor=monitor.child(20))
  File "D:\Projects\PycharmProjects\cate\cate\ds\local.py", line 187, in open_dataset
    monitor=monitor)
  File "D:\Projects\PycharmProjects\cate\cate\core\ds.py", line 686, in open_xarray_dataset
    **kwargs)
  File "D:\Miniconda3\envs\cate-env\lib\site-packages\xarray\backends\api.py", line 642, in open_mfdataset
    data_vars=data_vars, coords=coords)
  File "D:\Miniconda3\envs\cate-env\lib\site-packages\xarray\core\combine.py", line 436, in auto_combine
    for ds in grouped]
  File "D:\Miniconda3\envs\cate-env\lib\site-packages\xarray\core\combine.py", line 436, in <listcomp>
    for ds in grouped]
  File "D:\Miniconda3\envs\cate-env\lib\site-packages\xarray\core\combine.py", line 365, in _auto_concat
    return concat(datasets, dim=dim, data_vars=data_vars, coords=coords)
  File "D:\Miniconda3\envs\cate-env\lib\site-packages\xarray\core\combine.py", line 120, in concat
    return f(objs, dim, data_vars, coords, compat, positions)
  File "D:\Miniconda3\envs\cate-env\lib\site-packages\xarray\core\combine.py", line 276, in _dataset_concat
    'variable %s not equal across datasets' % k)
ValueError: variable latitude not equal across datasets

forman commented 5 years ago

This issue is similar to #832, as the cause is a misinterpretation of the lat/lon coordinate variables. The variables of AOD ENVISAT MERIS orbit frequency products are all 1D, and so are the lon/lat coordinate variables:

>>> import xarray as xr
>>> ds = xr.open_dataset("~/.cate/data_stores/local/local.esacci.AEROSOL.satellite-orbit-frequency.L2P.AOD.MERIS.Envisat.MERIS_ENVISAT.2-2.r1.85c9886a-d2d5-32dc-9fb6-030dd198a7c8/20080101071710-ESACCI-L2P_AEROSOL-AOD-MERIS_ENVISAT-ALAMO-fv2.2.nc")
>>> ds.coords
Coordinates:
  * pixel_number  (pixel_number) float64 0.0 1.0 2.0 ... 1.426e+04 1.426e+04
    latitude      (pixel_number) float32 ...
    longitude     (pixel_number) float32 ...
>>> ds.data_vars
Data variables:
    iline                    (pixel_number) float64 ...
    icolumn                  (pixel_number) float64 ...
    pixel_corner_latitude1   (pixel_number) float32 ...
    pixel_corner_latitude2   (pixel_number) float32 ...
    pixel_corner_latitude3   (pixel_number) float32 ...
    pixel_corner_latitude4   (pixel_number) float32 ...
    pixel_corner_longitude1  (pixel_number) float32 ...
    pixel_corner_longitude2  (pixel_number) float32 ...
    pixel_corner_longitude3  (pixel_number) float32 ...
    pixel_corner_longitude4  (pixel_number) float32 ...
    AOD550                   (pixel_number) float32 ...
    AOD550_std               (pixel_number) float32 ...
    AOD865                   (pixel_number) float32 ...
    AOD865_std               (pixel_number) float32 ...
    fAOD550                  (pixel_number) float32 ...
    fAOD550_std              (pixel_number) float32 ...
    fAOD865                  (pixel_number) float32 ...
    fAOD865_std              (pixel_number) float32 ...
    R_eff                    (pixel_number) float32 ...
    R_eff_std                (pixel_number) float32 ...
    Aerosol_Altitude         (pixel_number) float32 ...
    Aerosol_Altitude_std     (pixel_number) float32 ...

The error occurs in Cate while opening multiple files with xr.open_mfdataset(). To stack multiple AOD orbit frequency products along the time dimension, we must also stack the 'lon' and 'lat' variables. This could be done by moving them from the coordinate variables into the data variables of an xarray dataset. The criterion for lon/lat being real coordinates, which are excluded from stacking, is that their dimension names are lon and lat too: lon(lon) and lat(lat). In the case of AOD orbit frequency products, they are instead longitude(pixel_number) and latitude(pixel_number).
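
A minimal sketch of that criterion, not the actual Cate implementation. The helper name `non_index_coord_names` is hypothetical, and the coordinate layouts are stand-ins built from the listing above, so no files are read. It only assumes that each coordinate exposes a `dims` tuple, as the entries of an xarray `Dataset.coords` mapping do. Note that this simple check would also flag scalar coordinates, which may or may not be wanted.

```python
from types import SimpleNamespace
from typing import List, Mapping


def non_index_coord_names(coords: Mapping) -> List[str]:
    """Return the names of coordinates that are not "real" coordinates.

    A coordinate counts as real if its only dimension carries its own
    name, e.g. lon(lon). `coords` maps a name to any object that has a
    `dims` tuple.
    """
    return [name for name, var in coords.items() if tuple(var.dims) != (name,)]


# Stand-in for the orbit product layout shown above (hypothetical data).
orbit_coords = {
    "pixel_number": SimpleNamespace(dims=("pixel_number",)),
    "latitude": SimpleNamespace(dims=("pixel_number",)),
    "longitude": SimpleNamespace(dims=("pixel_number",)),
}

# Stand-in for a regular gridded product with lon(lon) and lat(lat).
gridded_coords = {
    "lon": SimpleNamespace(dims=("lon",)),
    "lat": SimpleNamespace(dims=("lat",)),
}

print(non_index_coord_names(orbit_coords))    # ['latitude', 'longitude']
print(non_index_coord_names(gridded_coords))  # []
```

With xarray, the returned names could then be passed to `Dataset.reset_coords()`, for example from a `preprocess` callback given to `xr.open_mfdataset()`, so that they become data variables before the files are concatenated. Whether that alone is enough for these products is untested here.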

forman commented 5 years ago

For the time being, a solution could be to display a better error message, e.g. "Unable to concatenate time steps, because their coordinate variables are not equal."
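
A rough sketch of how such a message could be produced, assuming the call that concatenates the files is wrapped in a small helper. The names `call_with_clear_error` and `failing_concat` are hypothetical and not part of Cate or xarray. The check relies on the message text from the traceback above, which may differ between xarray versions.

```python
CLEAR_MESSAGE = (
    "Unable to concatenate time steps, "
    "because their coordinate variables are not equal."
)


def call_with_clear_error(concat_fn, *args, **kwargs):
    """Call concat_fn and translate the low-level concatenation error."""
    try:
        return concat_fn(*args, **kwargs)
    except ValueError as error:
        # Match on the message text seen in the traceback above.
        if "not equal across datasets" in str(error):
            # Keep the original error attached as the cause.
            raise ValueError(CLEAR_MESSAGE) from error
        raise


def failing_concat():
    # Hypothetical stand-in that fails like the call in the traceback.
    raise ValueError("variable latitude not equal across datasets")


try:
    call_with_clear_error(failing_concat)
except ValueError as error:
    print(error)
    print("caused by:", error.__cause__)
```

Chaining with `from error` keeps the original xarray message available for debugging, while the user only sees the clearer text.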