ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

Test esmvalcore=2.3.0 with current batch of ESMValTool recipes #2198

Closed valeriupredoi closed 3 years ago

valeriupredoi commented 3 years ago

Hey good peeps @ESMValGroup/esmvaltool-developmentteam, we released esmvalcore=2.3.0 yesterday, but we didn't do a sanity check by running all the recipes from ESMValTool. Some key changes went into the release, so we have decided to perform this check anyway, just to be sure all's still up and running. Could we please ask you to grab one or more of your favourite recipes and run them with ESMValCore installed from main, PyPI, or conda (your choice); just be sure to have pulled the latest main and/or the latest package off PyPI/conda. A fresh installation of ESMValTool will bring in the newly released ESMValCore if you change the pin esmvalcore>=2.2.0,<2.3 to 2.3 in environment.yml and setup.py. Cheers ever so much! Oh and feel free to use the bot if @nielsdrost could install the latest esmvalcore there? There will be weissbiers as rewards for you at the next (actual, 3D, face-to-face) meeting at DLR :grin: :beer:

bouweandela commented 3 years ago

The usual procedure is to update the ESMValTool repository so it uses the latest version of ESMValCore immediately after the ESMValCore release. That way there are two weeks in which diagnostic developers can check whether their recipes work with the latest version of ESMValCore before we release ESMValTool.

> Oh and feel free to use the bot if @nielsdrost could install the latest esmvalcore there?

The bot uses the branch in the ESMValTool repository that you're requesting a test for, so if that branch uses the latest version of ESMValCore, the bot will also use it.

Should we move this issue to the ESMValTool repository, as it is about testing recipes in ESMValTool?

valeriupredoi commented 3 years ago

> Should we move this issue to the ESMValTool repository, as it is about testing recipes in ESMValTool?

yeah, I was thinking of moving it, good point, man!

schlunma commented 3 years ago

For recipe_schlund20jgr_gpp_change_1pct.yml, I got a

2021-06-17 11:04:44,917 UTC [12752] ERROR   Failed to run fix_metadata([<iris 'Cube' of gross_primary_productivity_of_carbon / (kg m-2 s-1) (time: 1680; latitude: 64; longitude: 128)>], {'project': 'CMIP5', 'short_name': 'gpp', 'mip': 'Lmon', 'exp': 'esmFixClim1', 'ensemble': 'r1i1p1', 'ref': True, 'preprocessor': 'preproc_total_mean_flux_var', 'variable_group': 'ref', 'diagnostic': 'diag_gpp_fraction_mean', 'dataset': 'MIROC-ESM', 'start_year': 11, 'end_year': 20, 'recipe_dataset_index': 4, 'institute': ['MIROC'], 'alias': 'MIROC-ESM', 'original_short_name': 'gpp', 'standard_name': 'gross_primary_productivity_of_carbon', 'long_name': 'Carbon Mass Flux out of Atmosphere due to Gross Primary Production on Land', 'units': 'kg m-2 s-1', 'modeling_realm': ['land'], 'frequency': 'mon', 'filename': '/scratch/b/b309141/work_v2/recipe_schlund20jgr_gpp_change_1pct_20210617_110434/preproc/diag_gpp_fraction_mean/ref/CMIP5_MIROC-ESM_Lmon_esmFixClim1_r1i1p1_gpp_11-20.nc', 'check_level': <CheckLevels.DEFAULT: 3>})
2021-06-17 11:04:44,962 UTC [12663] INFO    Progress: 8 tasks running, 18 tasks waiting for ancestors, 2/28 done
2021-06-17 11:04:46,176 UTC [12663] INFO    Maximum memory used (estimate): 1.5 GB
2021-06-17 11:04:46,177 UTC [12663] INFO    Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2021-06-17 11:04:46,178 UTC [12663] ERROR   Program terminated abnormally, see stack trace below for more information:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "cftime/_cftime.pyx", line 505, in cftime._cftime.num2date
OverflowError: date value out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/lustre02/work/bd0854/b309141/miniconda3/envs/esm2/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/_task.py", line 754, in _run_task
    output_files = task.run()
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/_task.py", line 252, in run
    self.output_files = self._run(input_files)
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/preprocessor/__init__.py", line 482, in _run
    product.apply(step, self.debug)
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/preprocessor/__init__.py", line 351, in apply
    self.cubes = preprocess(self.cubes, step, **self.settings[step])
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/preprocessor/__init__.py", line 295, in preprocess
    result.append(_run_preproc_function(function, items, settings))
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/preprocessor/__init__.py", line 281, in _run_preproc_function
    return function(items, **kwargs)
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/cmor/fix.py", line 124, in fix_metadata
    cube = checker(cube).check_metadata()
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/cmor/check.py", line 178, in check_metadata
    self._check_time_coord()
  File "/mnt/lustre01/pf/b/b309141/ESMValCore_2/esmvalcore/cmor/check.py", line 764, in _check_time_coord
    first = coord.cell(i).point
  File "/mnt/lustre02/work/bd0854/b309141/miniconda3/envs/esm2/lib/python3.9/site-packages/iris/coords.py", line 1925, in cell
    bound = self.units.num2date(bound)
  File "/mnt/lustre02/work/bd0854/b309141/miniconda3/envs/esm2/lib/python3.9/site-packages/cf_units/__init__.py", line 2036, in num2date
    return _num2date_to_nearest_second(
  File "/mnt/lustre02/work/bd0854/b309141/miniconda3/envs/esm2/lib/python3.9/site-packages/cf_units/__init__.py", line 609, in _num2date_to_nearest_second
    dates = cftime.num2date(
  File "cftime/_cftime.pyx", line 507, in cftime._cftime.num2date
ValueError: OverflowError in datetime, possibly because year < datetime.MINYEAR

This is the affected dataset: /pf/b/b309141/work/CMIP5_DKRZ/MIROC/MIROC-ESM/esmFixClim1/mon/land/Lmon/r1i1p1/v1/gpp/gpp_Lmon_MIROC-ESM_esmFixClim1_r1i1p1_000101-014012.nc

Here is the log file: OUTPUT_SCHLUNDETAL_GPP_1PCT.txt

zklaus commented 3 years ago

Thanks for looking into this, @schlunma! For the path you posted I only get Permission denied, so I tried to find it in the regular CMIP5 archive on mistral, but there, no esmFixClim1 experiment exists, only esmFixClim2 and that doesn't contain the same data. Are we sure that this is official, still valid CMIP5 data? ESGF also doesn't list esmFixClim1 for this model.

schlunma commented 3 years ago

I can't find the data anymore on the official archive - since I got it from there it must have been retracted.

Nevertheless, I found the issue (the first value of the time coordinate's bounds falls before datetime.MINYEAR, which is set to 1) and was able to solve this with an easy fix. I would really like to keep the recipe as is and open a PR with the fix. It doesn't need to be included in 2.3, but it would be nice to have it in the main branch.
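To illustrate the failure mode described here, the following is a minimal sketch using only the standard-library `datetime` module (not `cftime` or `cf_units`, which raise the error in the traceback above). The helper `lower_bound` and the example dates are hypothetical; the point is only that a bound lying before year `datetime.MINYEAR` cannot be represented and raises `OverflowError`.

```python
import datetime

# The smallest year a standard datetime object can hold.
print(datetime.MINYEAR)

# Illustrative values: a mid-month time point in January of year 1
# and half the width of a monthly cell.
first_point = datetime.datetime(1, 1, 16, 12)
half_month = datetime.timedelta(days=16)


def lower_bound(point, half_width):
    """Return point - half_width, or None if the result lies before MINYEAR."""
    try:
        return point - half_width
    except OverflowError:
        # Raised by datetime arithmetic when the result is out of range.
        return None


bound = lower_bound(first_point, half_month)
print(bound)

# One year later the same arithmetic works fine.
later_bound = lower_bound(datetime.datetime(2, 1, 16, 12), half_month)
print(later_bound)
```

The first call returns `None` because the bound would fall in year 0, while the second call yields a valid date at the end of year 1.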

zklaus commented 3 years ago

No objection in principle. Just wasn't able to determine what exactly was going on. Do you want to make a separate issue describing the problem?

schlunma commented 3 years ago

Yes, will do.

Found another issue, this time in recipe_schlund20esd.yml: Due to the newly added automatic addition of fx files, the tool tries to load areacello now for AWI-CM, which leads to the issue described by @remi-kazeroni at the very bottom in ESMValGroup/ESMValCore#751.

This can be solved by adapting the recipe (with fx_files: {areacella}), but this is something we really should fix in 2.4, since it affects all datasets with unstructured grids (e.g., also ICON data).

remi-kazeroni commented 3 years ago

I got an error with the examples/recipe_preprocessor_test.yml using a fresh new installation, see below. For comparison, I also ran the recipe with the DKRZ central module (v2.2.0 for the core and the tool) and that worked fine.

2021-06-17 12:47:22,964 UTC [1213] ERROR   esmvalcore.preprocessor:283 Failed to run multi_model_statistics({<esmvalcore.preprocessor.PreprocessorFile object at 0x2ae7134799a0>, <esmvalcore.preprocessor.PreprocessorFile object at 0x2ae71346c730>, <esmvalcore.preprocessor.PreprocessorFile object at 0x2ae7134419a0>, <esmvalcore.preprocessor.PreprocessorFile object at 0x2ae713410f70>}, {'span': 'overlap', 'statistics': ['mean', 'median'], 'output_products': {'mean': <esmvalcore.preprocessor.PreprocessorFile object at 0x2ae7134a3ca0>, 'median': <esmvalcore.preprocessor.PreprocessorFile object at 0x2ae713431ac0>}})
2021-06-17 12:47:23,715 UTC [1213] INFO    esmvalcore._task:127 Maximum memory used (estimate): 3.8 GB
2021-06-17 12:47:23,715 UTC [1213] INFO    esmvalcore._task:129 Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2021-06-17 12:47:23,717 UTC [1213] ERROR   esmvalcore._main:440 Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 433, in run
    fire.Fire(ESMValTool())
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 410, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 104, in process_recipe
    recipe.run()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_recipe.py", line 1434, in run
    self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 674, in run
    self._run_sequential()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 685, in _run_sequential
    task.run()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 252, in run
    self.output_files = self._run(input_files)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 475, in _run
    self.products = _apply_multimodel(self.products, step,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 417, in _apply_multimodel
    result = preprocess(products - exclude, step, **settings)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 295, in preprocess
    result.append(_run_preproc_function(function, items, settings))
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 281, in _run_preproc_function
    return function(items, **kwargs)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 429, in multi_model_statistics
    return _multiproduct_statistics(
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 340, in _multiproduct_statistics
    statistics_cubes = _multicube_statistics(cubes=cubes,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 320, in _multicube_statistics
    result_cube = _compute_eager(aligned_cubes,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 275, in _compute_eager
    combined_slice = _combine(single_model_slices)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 262, in _combine
    merged_cube = cubes.merge_cube()
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/iris/cube.py", line 404, in merge_cube
    proto_cube.register(cube, error_on_mismatch=True)
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/iris/_merge.py", line 1358, in register
    match = cube_signature.match(other, error_on_mismatch)
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/iris/_merge.py", line 465, in match
    raise iris.exceptions.MergeError(msgs)
iris.exceptions.MergeError: failed to merge into a single cube.
  cube.ancillary_variables differ
2021-06-17 12:47:23,813 UTC [1213] INFO    esmvalcore._main:444
If you have a question or need help, please start a new discussion on https://github.com/ESMValGroup/ESMValTool/discussions
If you suspect this is a bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues
To make it easier to find out what the problem is, please consider attaching the files run/recipe_*.yml and run/main_log_debug.txt from the output directory.

main_log_debug.txt

remi-kazeroni commented 3 years ago

I also encountered a problem with the multi_model_statistics preprocessor for recipe_ocean_bgc.yml. This is not related to the version number of the WOA dataset (see #1812) but rather to the core issue pointed out by @tomaslovato in this comment. For comparison, I could run this recipe successfully using the DKRZ central module. @zklaus could you please confirm that you ran the recipe successfully with the latest installation? Here is the error I get:

2021-06-17 12:50:23,099 UTC [1212] ERROR   esmvalcore.preprocessor:283 Failed to run multi_model_statistics({<esmvalcore.preprocessor.PreprocessorFile object at 0x2b6929e936d0>}, {'span': 'overlap', 'statistics': ['mean'], 'output_products': {'mean': <esmvalcore.preprocessor.PreprocessorFile object at 0x2b6929e93070>}})
2021-06-17 12:50:23,813 UTC [1212] INFO    esmvalcore._task:127 Maximum memory used (estimate): 2.1 GB
2021-06-17 12:50:23,814 UTC [1212] INFO    esmvalcore._task:129 Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2021-06-17 12:50:23,815 UTC [1212] ERROR   esmvalcore._main:440 Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 433, in run
    fire.Fire(ESMValTool())
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 410, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 104, in process_recipe
    recipe.run()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_recipe.py", line 1434, in run
    self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 674, in run
    self._run_sequential()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 685, in _run_sequential
    task.run()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 248, in run
    input_files.extend(task.run())
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 252, in run
    self.output_files = self._run(input_files)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 475, in _run
    self.products = _apply_multimodel(self.products, step,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 417, in _apply_multimodel
    result = preprocess(products - exclude, step, **settings)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 295, in preprocess
    result.append(_run_preproc_function(function, items, settings))
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 281, in _run_preproc_function
    return function(items, **kwargs)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 429, in multi_model_statistics
    return _multiproduct_statistics(
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 340, in _multiproduct_statistics
    statistics_cubes = _multicube_statistics(cubes=cubes,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 309, in _multicube_statistics
    raise ValueError('Cannot perform multicube statistics '
ValueError: Cannot perform multicube statistics for a single cube.
2021-06-17 12:50:23,958 UTC [1212] INFO    esmvalcore._main:444
If you have a question or need help, please start a new discussion on https://github.com/ESMValGroup/ESMValTool/discussions
If you suspect this is a bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues
To make it easier to find out what the problem is, please consider attaching the files run/recipe_*.yml and run/main_log_debug.txt from the output directory.

main_log_debug.txt

EDIT from @valeriupredoi (meself) - ping @Peter9192 :beer:

valeriupredoi commented 3 years ago

autoassess stratosphere is succumbing to a plotting error related to nc-time-axis:

2021-06-17 10:29:19,817 [23078] WARNING  esmvaltool.diag_scripts.autoassess.stratosphere.age_of_air,114 Run length < 12 years: Can't assess age of air
Traceback (most recent call last):
  File "/home/users/valeriu/esmvaltool/esmvaltool/diag_scripts/autoassess/autoassess_area_base.py", line 428, in <module>
    run_area(config)
  File "/home/users/valeriu/esmvaltool/esmvaltool/diag_scripts/autoassess/autoassess_area_base.py", line 420, in run_area
    multi_function(run_obj)
  File "/home/users/valeriu/esmvaltool/esmvaltool/diag_scripts/autoassess/stratosphere/strat_metrics_1.py", line 681, in multi_qbo_plot
    fig.savefig('qbo_30hpa.png')
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/figure.py", line 3005, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 2255, in print_figure
    result = print_method(
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 1669, in wrapper
    return func(*args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 508, in print_png
    FigureCanvasAgg.draw(self)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 406, in draw
    self.figure.draw(self.renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 74, in draw_wrapper
    result = draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/figure.py", line 2780, in draw
    mimage._draw_list_compositing_images(
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 431, in wrapper
    return func(*inner_args, **inner_kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 2921, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 431, in wrapper
    return func(*inner_args, **inner_kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 2921, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axis.py", line 1136, in draw
    ticks_to_draw = self._update_ticks()
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axis.py", line 1023, in _update_ticks
    major_locs = self.get_majorticklocs()
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axis.py", line 1255, in get_majorticklocs
    return self.major.locator()
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/nc_time_axis/__init__.py", line 151, in __call__
    return self.tick_values(vmin, vmax)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/nc_time_axis/__init__.py", line 176, in tick_values
    ticks = [
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/nc_time_axis/__init__.py", line 177, in <listcomp>
    cftime.datetime(
  File "cftime/_cftime.pyx", line 873, in cftime._cftime.datetime.__init__
TypeError: Expected str, got numpy.str_

Fairly sure this is due to the new Matplotlib 3.4.2 not liking the older nc-time-axis, but nc-time-axis=1.3.1 cannot be installed in the current configuration, see https://github.com/SciTools/nc-time-axis/issues/71

zklaus commented 3 years ago

@remi-kazeroni,

tomaslovato commented 3 years ago

@zklaus It might be the case to revise a little the recipe recipe_ocean_bgc.yml by adding a second dataset to pass through the single dataset multimodel lock? Maybe a CMIP6 dataset?

sloosvel commented 3 years ago

> The first error you mention seems to be connected to fx files. @sloosvel, could you please have a look?

From the log, it looks like some datasets (bcc-csm1-1, GFDL-ESM2G) lose the ancillary variables after the extract_levels step, whereas others (MPI-ESM-LR and ERA-Interim) do not. And I guess that becomes a problem at the multimodel step.
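One way to narrow down which datasets disagree before the merge is to group them by the metadata that has to match. The sketch below is plain Python and does not use the iris API. The dataset names are taken from the comment above, while the `cell_area` entry and the helper name `group_by_signature` are hypothetical placeholders.

```python
from collections import defaultdict


def group_by_signature(ancillaries):
    """Group dataset names by their set of ancillary variable names."""
    groups = defaultdict(list)
    for dataset, names in ancillaries.items():
        groups[frozenset(names)].append(dataset)
    return dict(groups)


# Hypothetical state after extract_levels: two datasets have lost
# their ancillary variable, two have kept it.
after_extract_levels = {
    "bcc-csm1-1": [],
    "GFDL-ESM2G": [],
    "MPI-ESM-LR": ["cell_area"],
    "ERA-Interim": ["cell_area"],
}

groups = group_by_signature(after_extract_levels)
for signature, datasets in groups.items():
    print(sorted(signature), datasets)

# More than one group means a strict merge would see differing metadata.
mergeable = len(groups) == 1
print(mergeable)
```

With this input the datasets fall into two groups, which mirrors the "cube.ancillary_variables differ" message in the log.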

remi-kazeroni commented 3 years ago

> The second error comes about because in that recipe almost all datasets are deactivated. When you activate at least one other dataset, the recipe should work; I had the same behavior. If we want to allow the application of multi-model statistics to a single dataset (a bit non-sensical, but why not?) I suggest opening a specific issue about that.

> @zklaus It might be the case to revise a little the recipe recipe_ocean_bgc.yml by adding a second dataset to pass through the single dataset multimodel lock? Maybe a CMIP6 dataset?

Sure, the error can be circumvented by adding another dataset but it would be good to include that in the recipe as @tomaslovato suggests. We may not need to have multi-model statistics working on a single dataset but the recipe becomes unusable in its current form for the new release.
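The behaviour discussed here can be sketched in a few lines of plain Python. This is a hypothetical stand-in, not the ESMValCore implementation: only the error message is taken from the traceback above, and the function body and the list-based "cubes" are assumptions made for illustration.

```python
def multicube_statistics(cubes, statistics=("mean",)):
    """Hypothetical stand-in: statistics across several equally long series."""
    if len(cubes) == 1:
        # Same message as in the traceback above.
        raise ValueError(
            "Cannot perform multicube statistics for a single cube.")
    result = {}
    if "mean" in statistics:
        # Element-wise mean across all input series.
        result["mean"] = [sum(values) / len(values) for values in zip(*cubes)]
    return result


one_dataset = [[1.0, 2.0, 3.0]]
two_datasets = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]

try:
    multicube_statistics(one_dataset)
    failed = False
except ValueError as err:
    failed = True
    print(err)

two_result = multicube_statistics(two_datasets)
print(two_result)
```

With a single active dataset the guard raises; once a second dataset is added, the statistics are computed as usual.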

It would also be good if we worked on "minimal configurations" for testing the recipes for the releases. At the moment, many of the recipes in the list can't be tested directly (at least on Mistral) because of data availability...

zklaus commented 3 years ago

@remi-kazeroni, you are completely right. The point of this issue is to determine whether a bug slipped by us in making the core release that necessitates a bugfix release for the core. It is of course always possible that a core release entails changes to the recipes; those are completely fine and should be dealt with in this period between the core release and the tool release.

remi-kazeroni commented 3 years ago

Sorry for posting these errors from failing recipes without investigating in detail... This time it is about recipe_kcs.yml. Note that the same recipe runs with version 2.2.0 of the core.

2021-06-18 06:46:32,482 UTC [30582] ERROR   esmvalcore._main:440 Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 433, in run
    fire.Fire(ESMValTool())
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 410, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_main.py", line 104, in process_recipe
    recipe.run()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_recipe.py", line 1434, in run
    self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 674, in run
    self._run_sequential()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 685, in _run_sequential
    task.run()
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 248, in run
    input_files.extend(task.run())
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/_task.py", line 252, in run
    self.output_files = self._run(input_files)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 475, in _run
    self.products = _apply_multimodel(self.products, step,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 417, in _apply_multimodel
    result = preprocess(products - exclude, step, **settings)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 295, in preprocess
    result.append(_run_preproc_function(function, items, settings))
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/__init__.py", line 281, in _run_preproc_function
    return function(items, **kwargs)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 429, in multi_model_statistics
    return _multiproduct_statistics(
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 340, in _multiproduct_statistics
    statistics_cubes = _multicube_statistics(cubes=cubes,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 320, in _multicube_statistics
    result_cube = _compute_eager(aligned_cubes,
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 275, in _compute_eager
    combined_slice = _combine(single_model_slices)
  File "/mnt/lustre01/pf/b/b309192/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 262, in _combine
    merged_cube = cubes.merge_cube()
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/iris/cube.py", line 404, in merge_cube
    proto_cube.register(cube, error_on_mismatch=True)
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/iris/_merge.py", line 1358, in register
    match = cube_signature.match(other, error_on_mismatch)
  File "/work/bd0854/b309192/soft/miniconda3/envs/release_core/lib/python3.9/site-packages/iris/_merge.py", line 465, in match
    raise iris.exceptions.MergeError(msgs)
iris.exceptions.MergeError: failed to merge into a single cube.
  cube data dtype differs: float32 != float64

main_log_debug.txt

katjaweigel commented 3 years ago

There is also a problem in recipe_deangelis15nat.yml with the derived variable lvp. Although the error message says it could be missing data, I don't think that this is the reason, but I'll try to understand what the issue is. [error recipe_deangelis15nat_lvp.txt](https://github.com/ESMValGroup/ESMValTool/files/6676643/error.recipe_deangelis15nat_lvp.txt)

Units shouldn't be the issue, the equation should take care of this. The error says: ValueError: Coordinate 'latitude' has different points for the LHS cube 'precipitation_flux' and RHS cube 'water_evaporation_flux'. What are LHS and RHS cubes?

recipe_li17natcc.yml ran without problems.

katjaweigel commented 3 years ago

> There is also a problem in recipe_deangelis15nat.yml with the derived variable lvp. Although the error message says it could be missing data, I don't think that this is the reason, but I'll try to understand what the issue is. [error recipe_deangelis15nat_lvp.txt](https://github.com/ESMValGroup/ESMValTool/files/6676643/error.recipe_deangelis15nat_lvp.txt)
>
> Units shouldn't be the issue, the equation should take care of this. The error says: ValueError: Coordinate 'latitude' has different points for the LHS cube 'precipitation_flux' and RHS cube 'water_evaporation_flux'. What are LHS and RHS cubes?
>
> recipe_li17natcc.yml ran without problems.

One model seems to be missing a fix that gives the latitude the same number of digits for each variable. It is MIROC5, and this is probably related to this pull request: https://github.com/ESMValGroup/ESMValCore/pull/1110 There, round_coordinates(cubes) is now also applied to pr. For recipe_deangelis15nat it would be necessary to apply it to evspsbl and hfls as well. Other variables might need it for other combinations elsewhere, but I don't know how to test this.
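On the question above: LHS and RHS stand for left-hand side and right-hand side, i.e. the two operands of the binary operation between the cubes. The mismatch can be sketched in plain Python as follows. The latitude values are made up for illustration, and `round_points` is a hypothetical helper, not the ESMValCore `round_coordinates` function.

```python
def round_points(points, decimals=4):
    """Round every coordinate point to a fixed number of decimals."""
    return [round(point, decimals) for point in points]


# Made-up latitudes: the same grid, stored with slightly different digits
# for the left-hand side (pr) and right-hand side (evspsbl) variables.
lat_pr = [-88.9277, -87.5387, 0.7003]
lat_evspsbl = [-88.92770004, -87.53869998, 0.70030001]

# An exact comparison sees two different grids.
equal_before = lat_pr == lat_evspsbl
print(equal_before)

# After rounding both sides the same way, the points agree.
equal_after = round_points(lat_pr) == round_points(lat_evspsbl)
print(equal_after)
```

If rounding is applied to only one of the two variables, the exact comparison can still fail, which is consistent with the need to treat evspsbl and hfls the same way as pr.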

katjaweigel commented 3 years ago

recipe_martin18grl.yml runs, but I found an issue (probably related to longitude) in some of the plots (e.g. SPI_mapObservations_Dur_of_Events_Mean.png). This is not related to the new core release, though: I checked that it also happened with the old core last year. I'll look at it after the issues related to the release.

katjaweigel commented 3 years ago

For recipe_deangelis15nat.yml: should I remove MIROC5 or is it possible to do the rounding from ESMValGroup/ESMValCore#1110 also for evspsbl and hfls now, after the Core has been released?

katjaweigel commented 3 years ago

For recipe_deangelis15nat.yml: should I remove MIROC5 or is it possible to do the rounding from ESMValGroup/ESMValCore#1110 also for evspsbl and hfls now, after the Core has been released?

Without MIROC5 recipe_deangelis15nat.yml runs.

katjaweigel commented 3 years ago

For recipe_deangelis15nat.yml: should I remove MIROC5 or is it possible to do the rounding from ESMValGroup/ESMValCore#1110 also for evspsbl and hfls now, after the Core has been released?

I made an issue https://github.com/ESMValGroup/ESMValCore/issues/1191 and a pull request https://github.com/ESMValGroup/ESMValCore/pull/1192 for the Core.

bettina-gier commented 3 years ago

recipe_perfmetrics_land_CMIP5.yml fails for the fgco2 diagnostics:

Traceback (most recent call last):
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_main.py", line 433, in run
    fire.Fire(ESMValTool())
  File "/work/bd0854/b309137/anaconda_20200811/envs/main_v3/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/work/bd0854/b309137/anaconda_20200811/envs/main_v3/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/work/bd0854/b309137/anaconda_20200811/envs/main_v3/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_main.py", line 410, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_main.py", line 104, in process_recipe
    recipe.run()
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_recipe.py", line 1434, in run
    self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_task.py", line 674, in run
    self._run_sequential()
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_task.py", line 685, in _run_sequential
    task.run()
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_task.py", line 248, in run
    input_files.extend(task.run())
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_task.py", line 248, in run
    input_files.extend(task.run())
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/_task.py", line 252, in run
    self.output_files = self._run(input_files)
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/__init__.py", line 475, in _run
    self.products = _apply_multimodel(self.products, step,
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/__init__.py", line 417, in _apply_multimodel
    result = preprocess(products - exclude, step, **settings)
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/__init__.py", line 295, in preprocess
    result.append(_run_preproc_function(function, items, settings))
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/__init__.py", line 281, in _run_preproc_function
    return function(items, **kwargs)
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 431, in multi_model_statistics
    return _multiproduct_statistics(
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 342, in _multiproduct_statistics
    statistics_cubes = _multicube_statistics(cubes=cubes,
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 322, in _multicube_statistics
    result_cube = _compute_eager(aligned_cubes,
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 277, in _compute_eager
    combined_slice = _combine(single_model_slices)
  File "/mnt/lustre02/work/bd0854/b309137/v2_split/ESMValCore/esmvalcore/preprocessor/_multimodel.py", line 264, in _combine
    merged_cube = cubes.merge_cube()
  File "/work/bd0854/b309137/anaconda_20200811/envs/main_v3/lib/python3.9/site-packages/iris/cube.py", line 404, in merge_cube
    proto_cube.register(cube, error_on_mismatch=True)
  File "/work/bd0854/b309137/anaconda_20200811/envs/main_v3/lib/python3.9/site-packages/iris/_merge.py", line 1361, in register
    match = coord_payload.match_signature(
  File "/work/bd0854/b309137/anaconda_20200811/envs/main_v3/lib/python3.9/site-packages/iris/_merge.py", line 284, in match_signature
    raise iris.exceptions.MergeError(msgs)
iris.exceptions.MergeError: failed to merge into a single cube.
  Coordinates in cube.dim_coords differ: longitude.

I was able to track it back to missing bounds in the longitude coordinate for some models - these models go through regridding first. When I added a guess bounds

        cube.coord("longitude").bounds = None
        cube.coord("longitude").guess_bounds()

to longitude in the core _multimodel.py after l.259 it ran through.
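For readers unfamiliar with what guessing bounds does, the following is a simplified picture in plain Python. It is not the iris implementation of `guess_bounds()`; `guess_contiguous_bounds` is a hypothetical helper that assumes at least two monotonically increasing points, and the longitude values are invented.

```python
def guess_contiguous_bounds(points):
    """Guess cell bounds halfway between neighbouring points.

    Illustration only: assumes at least two monotonically increasing points.
    The outer edges are extrapolated using the spacing of the nearest interval.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to guess bounds")
    # Interior cell edges are the midpoints between neighbouring points.
    edges = [(a + b) / 2 for a, b in zip(points, points[1:])]
    # Extrapolate the first and last edge from the nearest spacing.
    first = points[0] - (points[1] - points[0]) / 2
    last = points[-1] + (points[-1] - points[-2]) / 2
    edges = [first] + edges + [last]
    # Pair up consecutive edges into (lower, upper) bounds per cell.
    return [(lo, hi) for lo, hi in zip(edges, edges[1:])]


# Hypothetical regular 2-degree longitude grid (values invented).
lon_points = [0.0, 2.0, 4.0, 6.0]
lon_bounds = guess_contiguous_bounds(lon_points)
```

Because the bounds are derived from the points alone, cubes that share the same longitude points end up with identical bounds, which is presumably why the merge succeeds after the workaround.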

Not sure that's the best idea to implement though - but having a problem with the mmm (the recipe ran through before the refactoring of mmm to use native iris https://github.com/ESMValGroup/ESMValCore/pull/1150) seems like a huge issue. Not sure if it'd be better to add (guess) the bounds in general before the other preprocessor steps, or make sure that it's just not a problem in the multi model statistics.

zklaus commented 3 years ago

To keep track of what is still open, I'll mark all comments that have been resolved with a :tada:

bouweandela commented 3 years ago

To help with this task and to prepare for automating it better in the future, I tried to run all recipes on Mistral. The output can be seen here.

Note that if you see a web page when viewing the results of a successful recipe run and some plots or other output is missing from it, this is because provenance has not been implemented in the diagnostic script for these output files. Example of the output of a recipe without provenance: recipe_albedolandcover.yml. Example of a recipe with provenance correctly implemented: examples/recipe_python.yml.

Below is a quick summary of why certain recipes fail, probably duplicating some of the information above.

The runs in June were done with a 1 hour time limit, the runs in July with a 2 hour time limit on the Mistral compute partition. The code for doing the runs is available in #2219.

@ESMValGroup/esmvaltool-developmentteam

Potential bugs

| recipe run | reason of failure | remedy |
| --- | --- | --- |
| recipe_autoassess_landsurface_soilmoisture_20210625_151027 | recipe uses undocumented auxiliary data, possibly only available on Jasmin | :+1: works on Jasmin |
| recipe_bock20jgr_20210625_151027 | missing CMIP5 data, recipe needs to change OBS6 to native6 for ERA5, ESMValCore bug in finding CMIP3 data (dates not recognized) | (:+1:), works with the following bug fixed, CMIP3 bug tracked in ESMValGroup/ESMValCore#1245 |
| recipe_climwip_test_basic_20210625_151101 | crash in diagnostic script | :+1: works with recent fixes |
| recipe_clouds_ipcc_20210625_151145 | failed because of crash trying to include too large provenance in PNG file https://github.com/ESMValGroup/ESMValCore/issues/1148 | (:+1:) works without provenance, will be fixed with better provenance handling in the future |
| recipe_consecdrydays_20210625_151145 | diagnostic script crashed with TypeError: quickplot() got multiple values for argument 'plot_type' | :+1: fixed in #2244 |
| recipe_deangelis15nat_20210625_151145 | failed to derive lvp because Coordinate 'latitude' has different points for the LHS cube 'precipitation_flux' and RHS cube 'water_evaporation_flux' | :+1: resolved in ESMValGroup/ESMValCore#1192 |
| recipe_impact_20210625_151714 | seems to fail on CMOR check of AWI-CM-1-1-MR areacella/o | pending |
| recipe_kcs_20210625_151714 | multimodel statistics crash: cube data dtype differs: float32 != float64 | :+1: resolved in ESMValGroup/ESMValCore#1237 |
| recipe_meehl20sciadv_20210625_151748 | seems to fail on CMOR check of AWI-CM-1-1-MR areacella/o | :+1: resolved in #2253 |
| recipe_ocean_bgc_20210625_152235 | tries to compute multimodel statistics on single cube | :+1: wontfix |
| recipe_ocean_Landschuetzer2016_20210625_152110 | multimodel statistics crash: cube data dtype differs: float32 != float64 | :+1: resolved in ESMValGroup/ESMValCore#1239 |
| recipe_ocean_multimap_20210625_152327 | cube arithmetic bug or data issue in diagnostic script | tracked in #2243 |
| recipe_preprocessor_test_20210625_150851 | cube.ancillary_variables differ in multimodel statistics | original issue resolved in ESMValGroup/ESMValCore#1220, but new issue tracked in #2241 |
| recipe_seaice_drift_20210625_152519 | diagnostic script crash (data or shapefile issue?) | pending |

Computational issues?

| recipe run | reason of failure |
| --- | --- |
| recipe_extreme_events_20210625_151332 | processing took more than 1 hr? |
| recipe_eyring06jgr_20210625_151418 | ran out of memory? |
| recipe_eyring13jgr_12_20210625_151418 | ran out of memory |
| recipe_lisflood_20210625_151027 | hangs? ran out of time/memory? |
| recipe_martin18grl_20210625_151714 | processing took more than 1 hr? |
| recipe_smpi_20210625_152604/ | ran out of memory during linear regrid of CMIP5 GFDL-ESM2G ta to ERA5 grid |
| recipe_wenzel16jclim_20210625_152840 | processing took more than 1 hr? |

Missing data

| recipe run | reason of failure |
| --- | --- |
| recipe_anav13jclim_20210625_151027 | missing CMIP5 data |
| recipe_check_obs_20210625_150707 | missing OBS data v1.1-CRU+GPCC and v1.1-CRU |
| recipe_climwip_brunner20esd_20210625_151101 | missing CMIP6 data |
| recipe_climwip_test_performance_sigma_20210625_151101 | missing CMIP6 data |
| recipe_collins13ipcc_20210625_151145 | missing CMIP5 data |
| recipe_ecs_constraints_20210625_151321 | missing ERA5 data |
| recipe_era5_20210625_150707 | missing ERA5 data |
| recipe_era5-land_20210625_150707 | missing ERA5 data |
| recipe_flato13ipcc_20210625_151434 | missing CMIP3, CMIP5 data |
| recipe_gier2020bg_20210625_151434 | missing OBS, CMIP5, CMIP6 data |
| recipe_globwat_20210625_151027 | missing ERA5 data |
| recipe_hydro_forcing_20210625_151027 | missing ERA5 data |
| recipe_hype_20210625_151027 | missing ERA5 data |
| recipe_landcover_20210625_151714 | missing CMIP5 data |
| recipe_marrmot_20210625_151027 | missing ERA5 data |
| recipe_pcrglobwb_20210625_151027 | missing ERA5 data |
| recipe_perfmetrics_CMIP5_20210625_152344 | missing CMIP5 data |
| recipe_perfmetrics_CMIP5_4cds_20210625_152327 | missing CMIP5 data |
| recipe_perfmetrics_land_CMIP5_20210625_152344 | missing CMIP5 data |
| recipe_preprocessor_derive_test_20210625_150851 | missing CMIP5 data |
| recipe_schlund20esd_20210625_152519 | missing CMIP5 data |
| recipe_schlund20jgr_gpp_abs_rcp85_20210625_152858 | missing CMIP5 data |
| recipe_schlund20jgr_gpp_change_1pct_20210625_152936 | missing CMIP5 data |
| recipe_schlund20jgr_gpp_change_rcp85_20210625_152936 | missing CMIP5 data |
| recipe_seaice_20210625_152519 | missing CMIP5 sic and areacello data |
| recipe_seaice_feedback_20210625_152519 | missing CMIP5 data |
| recipe_smpi_4cds_20210625_152604 | missing CMIP5 data |
| recipe_snowalbedo_20210625_152605 | missing CMIP5 data |
| recipe_wenzel14jgr_20210625_152752 | missing CMIP5 data for variable nbp |
| recipe_wenzel16nat_20210625_152840 | missing CMIP5 data MIROC-ESM variable gpp |
| recipe_wflow_20210625_151027 | missing ERA5 data |
| recipe_williams09climdyn_CREM_20210625_152840 | missing CMIP5 and CMIP6 data |

ruthlorenz commented 3 years ago

Has anyone successfully run a recipe with ncl diagnostics? I try to run recipe_collins13ipcc.yml but receive an error from $diag_scripts/../interface_scripts/auxiliary.ncl

fatal:Undefined identifier: (ncdf_write) is undefined, can't continue

Not sure if this is related to the new core or I am doing something very wrong....

bettina-gier commented 3 years ago

I ran perfmetrics, smpi and gier2020bg which had no problems with ncl. Manuel also ran anav judging from the list. And Remi ran the ncl test recipe. Can you see if there's an error further up in the log for why ncdf_write is undefined for you? I did change that function a bit to allow for writing several variables into one netcdf file, but hadn't found any issues testing it without using that option.

ruthlorenz commented 3 years ago

Thanks for that, I attach the full log, the first error is: fatal:syntax error: line 156 in file $diag_scripts/../interface_scripts/auxiliary.ncl before or near \n elseif (meta .eq. "diag_script")

log.txt

bettina-gier commented 3 years ago

You need to update your ncl version: elseif was introduced in version 6.5 and you're using 6.4. Which is weird, because the environment.yml file specifies ncl>=6.5.
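A quick way to reason about this kind of version requirement is sketched below. The helper names `version_tuple` and `supports_elseif` are hypothetical, and the version string has to come from whichever NCL installation is actually picked up at run time; the only fact taken from this thread is that elseif needs NCL 6.5 or newer.

```python
def version_tuple(version):
    """Turn a dotted version string such as '6.4.0' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_elseif(ncl_version):
    """Return True if this NCL version is at least 6.5."""
    # Tuples compare element by element, so (6, 4, 0) < (6, 5) < (6, 5, 0).
    return version_tuple(ncl_version) >= (6, 5)


# Hypothetical version strings for illustration.
old_ok = supports_elseif("6.4.0")
new_ok = supports_elseif("6.5.0")
```

Comparing tuples of integers rather than raw strings avoids surprises such as "6.10" sorting before "6.5".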

ruthlorenz commented 3 years ago

Aha, it picks up 6.4 somehow (maybe because that's the default on the machine), or it's something stupid I did. Thanks!!! I'll try with >=6.5.

valeriupredoi commented 3 years ago

autoassess stratosphere is succumbing to a plotting error related to nc-time-axis:

2021-06-17 10:29:19,817 [23078] WARNING  esmvaltool.diag_scripts.autoassess.stratosphere.age_of_air,114 Run length < 12 years: Can't assess age of air
Traceback (most recent call last):
  File "/home/users/valeriu/esmvaltool/esmvaltool/diag_scripts/autoassess/autoassess_area_base.py", line 428, in <module>
    run_area(config)
  File "/home/users/valeriu/esmvaltool/esmvaltool/diag_scripts/autoassess/autoassess_area_base.py", line 420, in run_area
    multi_function(run_obj)
  File "/home/users/valeriu/esmvaltool/esmvaltool/diag_scripts/autoassess/stratosphere/strat_metrics_1.py", line 681, in multi_qbo_plot
    fig.savefig('qbo_30hpa.png')
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/figure.py", line 3005, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 2255, in print_figure
    result = print_method(
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 1669, in wrapper
    return func(*args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 508, in print_png
    return func(*args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 508, in print_png
    FigureCanvasAgg.draw(self)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 406, in draw
    self.figure.draw(self.renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 74, in draw_wrapper
    result = draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/figure.py", line 2780, in draw
    mimage._draw_list_compositing_images(
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 431, in wrapper
    return func(*inner_args, **inner_kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 2921, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 431, in wrapper
    return func(*inner_args, **inner_kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 2921, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axis.py", line 1136, in draw
    ticks_to_draw = self._update_ticks()
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axis.py", line 1023, in _update_ticks
    major_locs = self.get_majorticklocs()
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/matplotlib/axis.py", line 1255, in get_majorticklocs
    return self.major.locator()
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/nc_time_axis/__init__.py", line 151, in __call__
    return self.tick_values(vmin, vmax)
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/nc_time_axis/__init__.py", line 176, in tick_values
    ticks = [
  File "/home/users/valeriu/miniconda3-June2021/envs/release230/lib/python3.9/site-packages/nc_time_axis/__init__.py", line 177, in <listcomp>
    cftime.datetime(
  File "cftime/_cftime.pyx", line 873, in cftime._cftime.datetime.__init__
TypeError: Expected str, got numpy.str_

I'm fairly sure this is due to the new Matplotlib 3.4.2 not liking the older nc-time-axis, but nc-time-axis=1.3.1 cannot be installed in the current configuration, see SciTools/nc-time-axis#71

OK, don't worry about this for now: we have matplotlib pinned to <3.4 and that does the job for that recipe. There are still issues with iris and the latest nc-time-axis, see https://github.com/SciTools/iris/issues/3959 - we'll try to unpin matplotlib once we can see that's all sorted :+1:

bouweandela commented 3 years ago

It looks like that was just fixed with the release of iris 3.0.4

zklaus commented 3 years ago

In principle yes, but we noticed that the update to iris 3.0.4 comes with a bunch of (very useful) implications:

so we decided to pin to <3.0.4 for the bugfix release now. Let's discuss how to proceed in a separate issue.

remi-kazeroni commented 3 years ago

Thanks a lot @bouweandela for testing all the recipes and posting the reasons for the crashes here! That is nice to have. Following up on this comment, I had a look at all recipes that crashed because of "Missing ERA5 data". Most are hydrological recipes which sometimes require up to 30 years of daily ERA5 data and up to 5 variables. It would require quite some computational effort to cmorize all that (storage would be manageable). Is it really needed to test those recipes over the whole time period to check that these run fine from a technical point of view? The question also holds for recipes using a large number of datasets. Do we really need to run the recipes on all original datasets (including for example missing CMIP5 data) to check that these run fine? I guess most recipe failures that were spotted recently could have been witnessed using a subset of data and a limited time period. It would be great if we could have at some point some kind of a "test mode" for the recipes to allow to check them faster and avoid all the missing data problems listed here.

remi-kazeroni commented 3 years ago

Regarding failing recipes because of "missing ERA5 data":

Could the authors of these recipes (maybe @SarahAlidoost, @stefsmeets, @Peter9192?) help me get the missing aux files and dataset in order to be able to test these recipes on Mistral? I'm not sure if the auxiliary files are just stored in another folder that I'm not aware of.

The other recipe failures related to "Missing ERA5 data" seem to be only due to missing ERA5 data.

bouweandela commented 3 years ago

Thanks for looking into this @remi-kazeroni!

Is it really needed to test those recipes over the whole time period to check that these run fine from a technical point of view?

Yes. For example, if it turns out that they need 10 TB of RAM, they do not work.

The question also holds for recipes using a large number of datasets. Do we really need to run the recipes on all original datasets (including for example missing CMIP5 data) to check that these run fine?

Yes, because the input data is so diverse that this is the only way to make sure everything works. I used 29 compute hours on Mistral for the tests I did in this issue, so the required amount of compute hours seems manageable so far.

I guess most recipe failures that were spotted recently could have been witnessed using a subset of data and a limited time period. It would be great if we could have at some point some kind of a "test mode" for the recipes to allow to check them faster and avoid all the missing data problems listed here.

You're probably right (cf https://github.com/ESMValGroup/ESMValTool/issues/2240#issuecomment-893361048), but if a recipe already fails on missing data, the amount of compute used is quite small because we do check that all required data is available before running any computations.
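The idea of failing early on missing data can be sketched as follows. This is not how ESMValCore implements its check; `find_missing_inputs` and `run_recipe` are hypothetical names, and the sketch assumes the required inputs are known as a plain list of file paths.

```python
from pathlib import Path


def find_missing_inputs(required_files):
    """Return the required input files that do not exist on disk."""
    return [str(path) for path in map(Path, required_files) if not path.is_file()]


def run_recipe(required_files, compute):
    """Run `compute` only if every required input file is present."""
    missing = find_missing_inputs(required_files)
    if missing:
        # Fail early, before any expensive computation is started.
        raise FileNotFoundError(f"missing input data: {missing}")
    return compute()
```

With a check like this, a run that lacks input data costs little more than a handful of file system lookups, which matches the small amount of compute reported for the "missing data" failures.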

Could the authors of these recipes (maybe @SarahAlidoost, @stefsmeets, @Peter9192?) help me get the missing aux files and dataset in order to be able to test these recipes on Mistral? I'm not sure if the auxiliary files are just stored in another folder that I'm not aware of.

The documentation of the recipes contains links to where the shapefiles can be downloaded: https://docs.esmvaltool.org/en/latest/recipes/recipe_hydro_forcing.html https://docs.esmvaltool.org/en/latest/recipes/recipe_hydrology.html I copied most of these auxiliary data files to /work/bd0854/b381141/auxiliary_data/ on Mistral if you don't feel like downloading it yourself. Is there any file that cannot be found using that documentation?

remi-kazeroni commented 3 years ago

The documentation of the recipes contains links to where the shapefiles can be downloaded: https://docs.esmvaltool.org/en/latest/recipes/recipe_hydro_forcing.html https://docs.esmvaltool.org/en/latest/recipes/recipe_hydrology.html I copied most of these auxiliary data files to /work/bd0854/b381141/auxiliary_data/ on Mistral if you don't feel like downloading it yourself. Is there any file that cannot be found using that documentation?

On Mistral we have a shared directory for auxiliary data files (/mnt/lustre02/work/bd0854/DATA/ESMValTool2/AUX), similarly to OBS and RAWOBS. It could be an option to use this common directory to avoid duplications and missing data. Note that /mnt/lustre02/work/bd0854/DATA/ESMValTool2/AUX is mirrored to Jasmin. Following the docs I could download everything but:

bouweandela commented 3 years ago

On Mistral we have a shared directory for auxiliary data files (/mnt/lustre02/work/bd0854/DATA/ESMValTool2/AUX),

Great, could you add that to the config-user.yml that is shipped with ESMValCore so people can find it? I think we may run into similar issues as with the Tier3 datasets, for this recipe: https://docs.esmvaltool.org/en/latest/recipes/recipe_carvalhais14nat.html#observations

Could you please open new issues for the missing auxiliary data files? Then we can tag the authors of the respective recipes there and hopefully get it solved.

Peter9192 commented 3 years ago
* `dem_file: 'wflow_parameterset/meuse/staticmaps/wflow_dem.map'` for `hydrology/recipe_wflow.yml` (no info in the doc or I missed it)

There is an example here: https://github.com/openstreams/wflow/blob/master/examples/wflow_rhine_sbm/staticmaps/wflow_dem.map although I'm not sure if that's the same region as currently specified in the extract_region preprocessor in the recipe.

Agree it would be good to document this, so as soon as there is an issue I can transfer this comment.

zklaus commented 3 years ago

I will close this issue, since we are well past the 2.3.0 release of ESMValTool.

I encourage everyone who has been involved in this issue to check whether any problems discovered here have neither been resolved nor are being tracked in their own ticket yet, and to open separate issues for those.

For the next round, i.e. 2.4.0, I suggest we try this exercise as a discussion instead of an issue to be able to keep comments together that belong together.

Thanks to everyone for the effort!