Closed remi-kazeroni closed 1 year ago
The first round of recipe testing produced:
`index.html` file was generated)

| Recipe | Problem | Related issue / PR |
|---|---|---|
| recipe_autoassess_landsurface_soilmoisture | known missing climatology files (non-public) | marked as broken in https://github.com/ESMValGroup/ESMValTool/issues/3103 |
| recipe_check_obs | known derivation issue for ERA5 | https://github.com/ESMValGroup/ESMValCore/issues/1388 |
For comparison, we released ESMValTool 2.7.0 with 4 non-working recipes (this could have been 5 if we had used a stricter policy on missing data, as done for this round of testing).
`preproc` dirs for failed runs): /work/bd0854/b309192/recipe_testing/recipe_testing_v2p8/v08rc2/scripts/esmvaltool_output
Note: I will soon make a new post with a markdown list so that contributors can tick boxes after checking the output of their favourite recipes. After that, I'll tag the community.
And thanks very much to everyone who helped testing, fixing, maintaining recipes in the previous round of testing! It is very enjoyable to get results like this with `v2.8.0rc2`.
Hi @ESMValGroup/esmvaltool-developmentteam and @ESMValGroup/esmvaltool-recipe-maintainers, the results from the second and last round of recipe testing for the release of ESMValTool and ESMValCore v2.8 are now available. I would be very grateful if you could take a look at the output of your favourite recipes (see list below) and tick the boxes if the output looks good to you. If that is not the case, please report the issue by editing the list below or posting in this issue.
Deadline: Tuesday, March 28, noon (GMT). Release of ESMValTool v2.8 is scheduled for that day.
- `v2.7.0`: https://esmvaltool.dkrz.de/shared/esmvaltool/v2.7.0/debug.html
- `v2.8.0rc2`: https://esmvaltool.dkrz.de/shared/esmvaltool/v2.8.0rc2/

Some guidelines on how to inspect runs:

`v2.8.0rc2` and the previous stable released version `v2.7.0`
Below is the list of 150 recipes currently available in the main branch. The comparison tool returns:
Action required: 120 out of 147 recipe runs need to be inspected by a human.
See complete output in: compare_v280_output.txt
List of recipes to be checked:
Note that the tool can now find many more files providing supplementary variables (ancillary variables and cell measures), provided that `fx_variables` is not used in the recipe. This means that calculations done by the preprocessor functions `area_statistics`, `mask_landsea`, `mask_landseaice`, `volume_statistics`, and `weighting_landsea_fraction` are more accurate. Numerical differences with previous versions are therefore expected. See Supplementary variables (ancillary variables and cell measures) in the preprocessor documentation for more information.
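To illustrate why such numerical differences are expected, here is a minimal sketch in plain Python (not ESMValCore code). It compares an area-weighted mean computed with estimated weights against one computed with cell areas taken from a supplementary file. The function name `weighted_mean` and all numbers are made up for illustration.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of `values` using `weights`."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: three grid cells with one value each.
values = [280.0, 290.0, 300.0]

# Assumption: without a cell-area file, the weights are only an estimate.
estimated_areas = [1.0, 1.0, 1.0]

# Assumption: a supplementary file provides the actual cell areas.
provided_areas = [1.0, 2.0, 3.0]

mean_estimated = weighted_mean(values, estimated_areas)
mean_provided = weighted_mean(values, provided_areas)

print(mean_estimated)
print(mean_provided)
```

The two means differ whenever the provided areas are not proportional to the estimated ones, which is the kind of change a reviewer should expect when comparing against `v2.7.0`.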
very many thanks @katjaweigel :beer: Anything you'd reckon can't be fixed with a short (in time) PR?
@valeriupredoi I think the frame for recipe_cmug_h2o and, I hope, the canvas for recipe_deangelis15nat, but for the second I first have to find out how to change it. (Both should be fixed in the ESMValTool diagnostics.)
godspeed with that @katjaweigel :racehorse:
Unfortunately I cannot reproduce the issue with the figures from recipe_deangelis15nat: Figure from test run: https://esmvaltool.dkrz.de/shared/esmvaltool/v2.8.0rc2/ Figure from my own test with the new Core, reduced version of the recipe (/work/bd1083/b380216/output/recipe_deangelis15nat_20230323_172923/):
@katjaweigel have you recreated the environment to pull in all the dependencies the testing environment used?
@valeriupredoi Thanks, you are right: I installed the new environment, but I forgot to turn it on, sorry!
I have now opened an issue (#3132) and a PR (#3133) to fix the plot issues in recipe_deangelis15nat and recipe_cmug_h2o (both are really small changes).
@katjaweigel that's brilliant, very many thanks, I'll have a look in a jiffy 🍺
> I have now opened an issue (#3132) and a PR (#3133) to fix the plot issues in recipe_deangelis15nat and recipe_cmug_h2o (both are really small changes).
Thanks for that @katjaweigel. The new runs (and new plots) are available on the same website: https://esmvaltool.dkrz.de/shared/esmvaltool/v2.8.0rc2/
Thanks a lot @remi-kazeroni and @valeriupredoi!
Thanks everyone for checking the recipe results, that was very helpful for the release management team 👍 I see that about 2/3 of the recipes were checked and approved which is good enough to proceed with the release of ESMValTool v2.8.0. I'm closing this issue now. Nevertheless, feel free to continue checking recipe output later on and mark those that were checked. If needed, a new issue can be opened to document potential problems noticed later on.
Hi @remi-kazeroni, thanks for the nice overview. I noticed that several recipes that are not checked in the list above are listed as OK in the comparison tool output that you posted. Is this on purpose? For example:
..
recipe_combined_indices.yml: OK
..
recipe_consecdrydays.yml: OK
..
Hi @bouweandela, I overlooked that and did not put any [x] for the 27 recipes that were reported as unchanged by the comparison tool. I can still do that if you like. My experience is that it would still be better that someone quickly checks the output manually. We have seen problems that went unnoticed from release to release (like masking of 0s) and the comparison tool would report that results have not changed since the past release...
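A cross-check between the comparison tool output and the checklist could be sketched as follows. This assumes each result line has the form `recipe_name.yml: OK`, as in the excerpt quoted above. The function name, the sample text, and the non-OK status string are hypothetical, and the real output file may contain other kinds of lines.

```python
def recipes_reported_ok(report_text):
    """Return the recipe names whose status is exactly 'OK'."""
    ok = []
    for line in report_text.splitlines():
        # Split on the last ': ' so the recipe name stays intact.
        name, sep, status = line.strip().rpartition(": ")
        if sep and name.endswith(".yml") and status == "OK":
            ok.append(name)
    return ok


# Hypothetical excerpt in the same shape as the quoted output.
sample = """\
..
recipe_combined_indices.yml: OK
..
recipe_consecdrydays.yml: OK
recipe_example.yml: something else
"""

ok_recipes = recipes_reported_ok(sample)
print(ok_recipes)
```

The resulting list could then be compared with the recipes ticked in the checklist to find those reported as unchanged but not yet marked.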
> We have seen problems that went unnoticed from release to release (like masking of 0s)
That sounds like a serious issue with the comparison tool. Is it reported somewhere? The whole point of having a comparison tool is that you can rely on things being OK if it says they are OK.
> > We have seen problems that went unnoticed from release to release (like masking of 0s)
>
> That sounds like a serious issue with the comparison tool. Is it reported somewhere? The whole point of having a comparison tool is that you can rely on things being OK if it says they are OK.
This was fixed in https://github.com/ESMValGroup/ESMValCore/pull/1823 and is in the v2.8.0 release. I think the point I'm trying to make is: we do not have a robust mechanism in place to record "known good output" for recipes merged into `main`. If we compare recipe output affected by unnoticed bugs (like masking of 0s) or if outputs change because of some improvements (e.g. 1609), we would somehow need to record the "known good output" again. As long as this is not in place (maybe one day as part of a recipe test workflow), I would not fully rely on the OK from the comparison tool because there could be some uncertainty in the "known good output". That is why I personally feel it is safer to take a look at the final recipe results for a release. Nevertheless, the comparison tool has been very useful for me in various cases: comparing output between rcs, review of some PRs, ...
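The limitation described here can be shown with a toy example. This is not how the comparison tool is implemented: `same_output` is a hypothetical stand-in that only checks whether two lists of values are identical, with `None` playing the role of a masked value, and all data are invented.

```python
def same_output(a, b):
    """Hypothetical comparison: report True when both outputs are identical."""
    return a == b


# Invented "known good output" containing genuine zeros.
known_good = [0.0, 1.5, 0.0, 2.0]

# Assumption: both releases share the same bug, so zeros come out masked.
previous_release = [None, 1.5, None, 2.0]
new_release = [None, 1.5, None, 2.0]

unchanged_since_last_release = same_output(new_release, previous_release)
matches_known_good = same_output(new_release, known_good)

print(unchanged_since_last_release)  # True
print(matches_known_good)  # False
```

A release-to-release comparison reports no change here even though both outputs are wrong, which is why a recorded reference would be needed before an OK could be trusted on its own.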
Would you say that the recipes with a checkmark above are known good output then? It would be good to take this to the tech lead meeting.
> Would you say that the recipes with a checkmark above are known good output then? It would be good to take this to the tech lead meeting.
After a release with quite a few important enhancements and bugfixes, I think yes. Known good output would be those with a checkmark. Maybe it is not necessary that all recipe outputs are checked after each release, but just once in a while (once per year?) or if the Tech Lead Team says that there are good reasons (major Core changes) to justify that.
This issue documents the round of recipe testing performed using the Core release candidate `v2.8.0rc2`.

Release process
System and settings
`conda`/`mamba`
Git branches and state
Tue 21 Mar 13:02:47 CET 2023
Installation and environment
Config user file
Main options: all default except `search_esgf: when_missing`
ESMValTool version
Environment file
tool_280rc2.txt
Compute resources used
I used the newly added `generate.py` script. I made some modifications to it to enable the release manager to run all 150 recipes in one go, by doing `python generate.py`, and adjusted SLURM settings for all "complicated" recipes. I will open a PR shortly to provide more details on that.

On DKRZ-Levante
Note: this is the second and final round of testing for `v2.8.0`. I will publish the overview website and output of the comparison tool in this issue very soon. And then I will tag the community to check the output. Stay tuned!