ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

recipe_wenzel14jgr cannot be run in parallel #2690

Open sloosvel opened 2 years ago

sloosvel commented 2 years ago

Describe the bug

Diagnostic diag_gammaHist_Fig3and4 is failing due to a missing file:

INFO    Reading in file = /scratch/b/b381943/esmvaltool_output/recipe_wenzel14jgr_20220615_073550/preproc/diag_gammaLT_5/nbp_esmFix/CMIP5_NorESM1-ME_Lmon_esmFixClim1_r1i1p1_nbp_0030-0110.nc
DEBUG   >>> Leaving read_data (interface_scripts/data_handling.ncl)
DEBUG   <<< Entering time_operations (diag_scripts/shared/statistics.ncl)
DEBUG   >>> Leaving time_operations (diag_scripts/shared/statistics.ncl)
DEBUG   <<< Entering time_operations (diag_scripts/shared/statistics.ncl)
DEBUG   >>> Leaving time_operations (diag_scripts/shared/statistics.ncl)
DEBUG   <<< Entering time_operations (diag_scripts/shared/statistics.ncl)
DEBUG   >>> Leaving time_operations (diag_scripts/shared/statistics.ncl)
DEBUG   <<< Entering convert_units (diag_scripts/shared/scaling.ncl)
DEBUG   >>> Leaving convert_units (diag_scripts/shared/scaling.ncl)
DEBUG   <<< Entering convert_units (diag_scripts/shared/scaling.ncl)
DEBUG   >>> Leaving convert_units (diag_scripts/shared/scaling.ncl)
DEBUG   <<< Entering ncdf_read (interface_scripts/auxiliary.ncl)
INFO    fatal: in ncdf_read (interface_scripts/auxiliary.ncl), /scratch/b/b381943/esmvaltool_output/recipe_wenzel14jgr_20220615_073550/run/diag_gammaLT_5/gammaLT_5/../../diag_gammaHist_Fig3and4/gammaHist_3and4/gIAV_1960-2005.nc does not exist

valeriupredoi commented 2 years ago

@sloosvel please assign recipe maintainers to issues related to recipes failing for reasons that we can't fix through Core fixes or other maintenance that we can do. In fact, assign them anyway so they are aware of issues that may branch out in diagnostics too :+1:

valeriupredoi commented 2 years ago

having said that - thud! Who's the maintainer of this and who is Sabrina Wenzel? @ESMValGroup/esmvaltool-coreteam have we decided anything wrt recipes that don't have maintainers? I can't recall :cry:

bouweandela commented 2 years ago

From the error message, it looks like diagnostic diag_gammaLT_5/gammaLT_5 depends on diagnostic diag_gammaHist_Fig3and4/gammaHist_3and4. The author of this recipe has not used ancestors to declare that dependency and to get the paths to the files produced by the ancestors from the interface. Instead, a relative path is hardcoded in the recipe.

This will work if the recipe is run with max_parallel_tasks: 1, because the diagnostics are then run from top to bottom. When running in parallel it may break, because the tool does not know that it must finish diag_gammaHist_Fig3and4/gammaHist_3and4 before it can start diag_gammaLT_5/gammaLT_5. Whether the run fails therefore depends on which task happens to be started when.
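
To illustrate the ordering problem, here is a minimal Python sketch. It is not ESMValTool's actual task runner; it only uses the standard-library graphlib module (Python 3.9+) to show which tasks a generic parallel scheduler would consider ready to start. The two task names come from the log above, while the dependency maps and the first_batch helper are made up for the illustration.

```python
from graphlib import TopologicalSorter

# Task names are taken from the log above; the dependency maps below are
# illustrative assumptions and are not read from the actual recipe.
CONSUMER = "diag_gammaLT_5/gammaLT_5"
PRODUCER = "diag_gammaHist_Fig3and4/gammaHist_3and4"


def first_batch(ancestors):
    """Return the set of tasks a parallel scheduler could start immediately.

    `ancestors` maps each task to the set of tasks that must finish first.
    """
    sorter = TopologicalSorter(ancestors)
    sorter.prepare()
    return set(sorter.get_ready())


# Dependency not declared: both tasks look independent, so either may run first.
undeclared = {CONSUMER: set(), PRODUCER: set()}

# Dependency declared: the consumer is held back until the producer is done.
declared = {CONSUMER: {PRODUCER}, PRODUCER: set()}

print(sorted(first_batch(undeclared)))
print(sorted(first_batch(declared)))
```

With the undeclared map both tasks are eligible at once, which is the situation where the consumer can look for gIAV_1960-2005.nc before it has been written. With the declared map only the producer is eligible in the first batch.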

sloosvel commented 2 years ago

OK, I will add a note in the docs for this recipe about having to run with max_parallel_tasks=1.