ESMValGroup / ESMValCore

ESMValCore: A community tool for pre-processing data from Earth system models in CMIP and running analysis scripts.
https://www.esmvaltool.org
Apache License 2.0

Concatenation fails with ancillary variables #1970

Closed Peter9192 closed 1 year ago

Peter9192 commented 1 year ago

I want to concatenate historical and future for a dataset with ancillary variables (because I need to calculate the area):

from esmvalcore.dataset import Dataset

dataset = Dataset(
    short_name='tos',
    mip='Omon',
    project='CMIP6',
    exp=['historical', 'ssp585'],
    dataset='CESM2',
    ensemble='r4i1p1f1',
    grid='gn',
)

dataset.add_supplementary(short_name='areacello', mip='Ofx')
dataset.augment_facets()
cube = dataset.load()

Unfortunately this fails, because esmvalcore apparently tries to concatenate the fx variable along the time dimension:

Stacktrace

```
ERROR:esmvalcore.preprocessor:Failed to run preprocessor function 'concatenate' on the data
[, ]
loaded from original input file(s)
[LocalFile('/home/peter/climate_data/CMIP6/CMIP/NCAR/CESM2/historical/r4i1p1f1/Ofx/areacello/gn/v20190308/areacello_Ofx_CESM2_historical_r4i1p1f1_gn.nc'),
 LocalFile('/home/peter/climate_data/CMIP6/ScenarioMIP/NCAR/CESM2/ssp585/r4i1p1f1/Ofx/areacello/gn/v20200528/areacello_Ofx_CESM2_ssp585_r4i1p1f1_gn.nc')]
with function argument(s)

---------------------------------------------------------------------------
CoordinateNotFoundError                   Traceback (most recent call last)
File ~/esmvalgroup/ESMValCore/esmvalcore/preprocessor/_io.py:232, in concatenate(cubes)
    231 try:
--> 232     cubes = sorted(cubes, key=lambda c: c.coord("time").cell(0).point)
    233 except iris.exceptions.CoordinateNotFoundError as exc:

File ~/esmvalgroup/ESMValCore/esmvalcore/preprocessor/_io.py:232, in concatenate..(c)
    231 try:
--> 232     cubes = sorted(cubes, key=lambda c: c.coord("time").cell(0).point)
    233 except iris.exceptions.CoordinateNotFoundError as exc:

File ~/mambaforge/envs/esmvalcore/lib/python3.10/site-packages/iris/cube.py:2014, in Cube.coord(self, name_or_coord, standard_name, long_name, var_name, attributes, axis, contains_dimension, dimensions, coord_system, dim_coords, mesh_coords)
   2010     emsg = (
   2011         f"Expected to find exactly 1 {bad_name!r} coordinate, "
   2012         "but found none."
   2013     )
-> 2014     raise iris.exceptions.CoordinateNotFoundError(emsg)
   2016 return coords[0]

CoordinateNotFoundError: "Expected to find exactly 1 'time' coordinate, but found none."

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[1], line 15
     13 dataset.add_supplementary(short_name='areacello', mip='Ofx')
     14 dataset.augment_facets()
---> 15 cube = dataset.load()

File ~/esmvalgroup/ESMValCore/esmvalcore/dataset.py:655, in Dataset.load(self)
    642 def load(self) -> Cube:
    643     """Load dataset.
    644
    645     Raises
   (...)
    653         An :mod:`iris` cube with the data corresponding the the dataset.
    654     """
--> 655     return self._load_with_callback(callback='default')

File ~/esmvalgroup/ESMValCore/esmvalcore/dataset.py:667, in Dataset._load_with_callback(self, callback)
    665 supplementary_cubes = []
    666 for supplementary_dataset in self.supplementaries:
--> 667     supplementary_cube = supplementary_dataset._load(callback)
    668     supplementary_cubes.append(supplementary_cube)
    670 output_file = _get_output_file(self.facets, self.session.preproc_dir)

File ~/esmvalgroup/ESMValCore/esmvalcore/dataset.py:736, in Dataset._load(self, callback)
    731 result = [
    732     file.local_file(self.session['download_dir']) if isinstance(
    733         file, esgf.ESGFFile) else file for file in self.files
    734 ]
    735 for step, kwargs in settings.items():
--> 736     result = preprocess(
    737         result,
    738         step,
    739         input_files=self.files,
    740         output_file=output_file,
    741         debug=self.session['save_intermediary_cubes'],
    742         **kwargs,
    743     )
    745 cube = result[0]
    746 return cube

File ~/esmvalgroup/ESMValCore/esmvalcore/preprocessor/__init__.py:375, in preprocess(items, step, input_files, output_file, debug, **settings)
    373 result = []
    374 if itype.endswith('s'):
--> 375     result.append(_run_preproc_function(function, items, settings,
    376                                         input_files=input_files))
    377 else:
    378     for item in items:

File ~/esmvalgroup/ESMValCore/esmvalcore/preprocessor/__init__.py:328, in _run_preproc_function(function, items, kwargs, input_files)
    323 logger.debug(
    324     "Running preprocessor function '%s' on the data\n%s%s\nwith function "
    325     "argument(s)\n%s", function.__name__, pformat(items), file_msg,
    326     kwargs_str)
    327 try:
--> 328     return function(items, **kwargs)
    329 except Exception:
    330     # To avoid very long error messages, we truncate the arguments and
    331     # input files here at a given threshold
    332     n_shown_args = 4

File ~/esmvalgroup/ESMValCore/esmvalcore/preprocessor/_io.py:236, in concatenate(cubes)
    233 except iris.exceptions.CoordinateNotFoundError as exc:
    234     msg = "One or more cubes {} are missing".format(cubes) + \
    235         " time coordinate: {}".format(str(exc))
--> 236     raise ValueError(msg)
    238 # iteratively concatenate starting with first cube
    239 result = cubes[0]

ValueError: One or more cubes [, ] are missing time coordinate: "Expected to find exactly 1 'time' coordinate, but found none."
```

bouweandela commented 1 year ago

As discussed offline, the supplementary dataset inherits any facets it does not specify from the main variable. Here that includes exp, so two areacello files are found (one per experiment) and the concatenation along time is attempted on them. In this case, you want to concatenate the historical and ssp585 experiments for the main dataset, but use only one of the two for the supplementary dataset, e.g.:

from esmvalcore.dataset import Dataset

dataset = Dataset(
    short_name='tos',
    mip='Omon',
    project='CMIP6',
    exp=['historical', 'ssp585'],
    dataset='CESM2',
    ensemble='r4i1p1f1',
    grid='gn',
)

dataset.add_supplementary(short_name='areacello', mip='Ofx', exp='historical')
cube = dataset.load()
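
To make the inheritance rule concrete, here is a minimal sketch in plain Python. It uses ordinary dictionaries as stand-ins for facets and a hypothetical helper, inherit_facets, that is not part of ESMValCore; it only illustrates the "inherit unless overridden" behaviour described above, not how the library implements it.

```python
# Illustration only: plain dicts stand in for dataset facets.
# inherit_facets is a hypothetical helper, not an ESMValCore function.

def inherit_facets(main_facets, **overrides):
    """Copy the main variable's facets, then apply explicit overrides."""
    facets = dict(main_facets)  # shallow copy; the main facets stay untouched
    facets.update(overrides)
    return facets


main = {
    'short_name': 'tos',
    'mip': 'Omon',
    'project': 'CMIP6',
    'exp': ['historical', 'ssp585'],
    'dataset': 'CESM2',
    'ensemble': 'r4i1p1f1',
    'grid': 'gn',
}

# No exp given: the supplementary variable inherits both experiments,
# so one areacello file per experiment would be found.
inherited = inherit_facets(main, short_name='areacello', mip='Ofx')

# exp given explicitly: only the historical experiment is used.
pinned = inherit_facets(main, short_name='areacello', mip='Ofx',
                        exp='historical')

print(inherited['exp'])
print(pinned['exp'])
```

With both experiments inherited, there are two time-independent areacello files to combine, which matches the two LocalFile entries in the error message above. Pinning exp leaves a single file, so nothing needs to be concatenated.
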

Peter9192 commented 1 year ago

It makes sense once you get it