ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

Recipe testing and comparison for release 2.7.0 #2881

Closed valeriupredoi closed 1 year ago

valeriupredoi commented 2 years ago

Sister issue and logical evolution of #2852 - I am starting the testing and comparison of recipes and recipe results in order to release 2.7.0 at the end of this week (hopefully). System parameters are below; the work is done on DKRZ/Levante, with submit files in /home/b/b382109/submit and output in /scratch/b/b382109/esmvaltool_output

System and settings

conda/mamba

(base) mamba --version
mamba 0.27.0
conda 22.9.0

Git branch and state

Date: 25 October 2022 14:22 BST

(base) git status
On branch release_270stable
Your branch is up to date with 'origin/release_270stable'.

nothing to commit, working tree clean

Environment

On Levante:

mamba env create -n tool270Test -f environment.yml
conda activate tool270Test

Environment file

ToolEnv270Test.yml

Extraneous file movements

I moved the autoassess-specific files to /home/b/b382109/autoassess_files - the run was then successful for the AA recipes :+1:

Ad-hoc hacks (code changes)

Mods to config user file

Added DKRZ downloaded data pool as:

  CMIP6:
    - /work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ
    - /work/bd0854/DATA/ESMValTool2/download/CMIP6
  CMIP5:
    - /work/bd0854/DATA/ESMValTool2/CMIP5_DKRZ
    - /work/bd0854/b309141/additional_CMIP5
    - /work/bd0854/DATA/ESMValTool2/download/cmip5/output1
    - /work/bd0854/DATA/ESMValTool2/download/cmip5

as @schlunma and @remi-kazeroni have suggested :beer:

Recipe runs

Recipe run results (final as of 27 October 2022) are listed in https://github.com/ESMValGroup/ESMValTool/issues/2881#issuecomment-1291878142 (with very many thanks to @remi-kazeroni for running the impossible-to-run ones!) and are as follows:

(*) means not counting/counting the one that had a DiagnosticError but was fixed but not PR-ed

Running the comparison

Login and access to the DKRZ esmvaltool VM

Results from recipe runs are stored on the VM; login with:

ssh youraccount@esmvaltool.dkrz.de

Get and install miniconda on VM

E.g. scp Miniconda3-py39_4.12.0-Linux-x86_64.sh b382109@esmvaltool.dkrz.de:~ from a file already on Levante.

Setting up the input files

If you wrote the recipe run output to the Levante /scratch partition, be aware that the data will be removed after two weeks, so you will have to move the output data to the /work partition, e.g. via a nohup job:

nohup cp -r /scratch/b/b382109/esmvaltool_output/* /work/bd0854/b382109/v270

/work is visible from the VM, so you can run the compare tool straight on the VM.

NOTE: do not store final release results that include /preproc/ dirs on the VM; the total size of the output for all the recipes, including /preproc/ dirs, is in the 4.5TB ballpark, much too high for the VM storage capacity
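A minimal sketch of a copy step that leaves the preproc dirs behind, for the case where the destination is size-limited (as the VM is). This is not the command used above (that was a plain cp -r to /work); the function name copy_without_preproc is hypothetical, and it assumes each recipe run is a top-level directory whose preprocessor output sits in a directory literally named preproc:

```python
import shutil
from pathlib import Path


def copy_without_preproc(src: Path, dst: Path) -> None:
    """Copy every run directory found in src to dst, skipping 'preproc' dirs."""
    dst.mkdir(parents=True, exist_ok=True)
    for run_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        shutil.copytree(
            run_dir,
            dst / run_dir.name,
            # Assumption: the bulky preprocessor output lives in dirs named "preproc".
            ignore=shutil.ignore_patterns("preproc"),
            # Allows re-running after an interrupted copy (Python 3.8+).
            dirs_exist_ok=True,
        )
```

It would be called with the source and destination as Path objects, e.g. copy_without_preproc(Path("/scratch/b/b382109/esmvaltool_output"), Path("/work/bd0854/b382109/v270")). Whether the comparison tool needs anything from preproc is not covered in this thread, so check that before dropping those dirs.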

Running compare tool at VM

Input/output/run

Sanity check, as output by compare.py

Comparing recipe run(s) in:
/work/bd0854/b382109/v270
to reference in /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4

First pass result

Running compare.py flags a few recipes as not-OK (NOK) because their plots differ from the previous release v2.6.0; summary in https://github.com/ESMValGroup/ESMValTool/issues/2881#issuecomment-1294735465

Detailed plots inspection

Inspection of the differing plots for the 34 recipes that have them is happening in https://github.com/ESMValGroup/ESMValTool/issues/2881#issuecomment-1295001054

valeriupredoi commented 2 years ago

Because I don't have the output for other versions on Levante. Other releases were run on Mistral. The outputs are in the virtual machine. And it's indicated in the documentation anyway: https://docs.esmvaltool.org/en/latest/utils.html#comparing-recipe-runs

yes but you ran them on Levante and not on the VM, like I did myself - not saying you did wrong, am saying this is a mess including the instructions in the docs - moving data is bad, removing data is even worse

sloosvel commented 2 years ago

and on top of this all I don't have write permissions to /shared/esmvaltool to move the data

You should be able to sudo rsync or any other command

valeriupredoi commented 2 years ago

sudo rsync will not work from Levante since I am not in the list of sudoers. This fails:

rsync -avzh /scratch/b/b382109/esmvaltool_output/ b382109@esmvaltool.dkrz.de:/shared/esmvaltool/

I need write permissions to the /shared/esmvaltool/ partition. While I'm waiting for it, I'll rsync all that in my $HOME (silly, silly stuff)

schlunma commented 2 years ago

@valeriupredoi Rémi is on vacation. I don't have access to the machine altogether (Connection closed by 136.172.60.95 port 22), so I cannot help you here. If I remember correctly, @bouweandela also had access to it, maybe he can give you the rights?

valeriupredoi commented 2 years ago

Cheers Manu, not a worry just yet, fingers crossed I have enough disk quota so I can rsync all the results in my home on the VM, then I can run the comparison. Heads up though, to myself, should not rm the data from Levante so next RM can run/rerun comparisons, and only when all's good and set move it to the VM, keeping a copy on Levante 👍

valeriupredoi commented 2 years ago

OK it proves I really need rwx permissions to /shared/esmvaltool - the data volume for the results is in the ballpark of 30G - my rsync stopped and I was not aware of it; I was happy to see it's only 6-7G but the transfer is totally incomplete. Even so, the limit on a user's home on the VM looks to be 10G - a bit of a tiny one given I need the miniconda guff too. Unless someone is a saint and moves my output from Levante to the VM shared dir, I am stuck at this step, w/o being able to run the comparison - unless, of course, I rsync the 2.6.0 results back to Levante - which, as silly as it may sound, I might do if write access proves to be a lengthy process

remi-kazeroni commented 2 years ago

I would really recommend going through this process (copying the recipe output to the VM, generating the debug.html page and running the comparison tool) with the previous RM or someone available to help. I reckon this is not well documented (partly for security reasons, since the VM shouldn't be publicly accessed) and it would be helpful to schedule a quick call with a previous RM who has worked with the VM. @sloosvel, would you be able to help V with these tasks? I'm quite unsure how the recipe output should be copied and the comparison tool run. (And I'm out of office these days 😬 )

I agree there is room for discussion and improvement in the way we test recipes: is it good practice to use data downloaded long ago? How do we handle the preproc files which don't fit on the disk of the VM (1TB limit...)? Where to store "official" (release) output? I believe this will be partly addressed by the RTW and this discussion could take place for the next release. Given that 122/127 recipes run successfully and no new bug is left unfixed, I think this release is on a very good track! Well done @valeriupredoi! 🍻

valeriupredoi commented 2 years ago

@remi-kazeroni you're a legend writing work emails while on holidays! Thanks a lot, dude, I now have all manners of access - write access to /shared/esmvaltool, am transferring all output from /scratch to /work (under nohup, so I guess it'll do it overnight), so then tomorrow I can compare as much as I please :grin: And my sincere apologies for barking at poor @sloosvel for what I thought was her removing the output data - turns out DKRZ is sneakily rm-ing it from /scratch after 2 weeks. Here's the thing though - this badly needs documenting - I mean the whole process, w/o actual login/specific machine details for privacy reasons, but the workflow is complicated enough to be written down; I'll try to do it as much as possible - @sloosvel you should have done this after 2.6.0 with help from DKRZ tenants like @remi-kazeroni or @schlunma (apols if you did and I missed the instructions, am pretty good at not reading instructions). Provided all goes well with the copy to /work I will run compare tomorrow, may have to bug you for more info on debug html stuffs - but many thanks you all @remi-kazeroni @sloosvel @schlunma @bouweandela for the help so far :beer:

sloosvel commented 1 year ago

Hi @valeriupredoi please let me know if you want to schedule a call. I have to say that I am quite confused by all your issues; I did not run into any of that.

valeriupredoi commented 1 year ago

Hi @sloosvel - many thanks, am back on track now, no need for a call just yet, maybe if you could keep an eye on this issue if I ask for some help, that'd be awesome :beer: :+1:

valeriupredoi commented 1 year ago

OK comparison tool is now plodding along nicely - I have also added the instructions in the issue description - we can use that description to hatch us a nice doc entry - the next RM should not go through the Gates of VM Purgatory like I did yesterday :+1:

valeriupredoi commented 1 year ago

Comparison results

Run command and output stored

Per recipe result

Legend:

122 out of 127 final

Result

We need to look at plots for 34 recipes; we're good to go for 85 recipes; 3 have no reference in 2.6.0

bouweandela commented 1 year ago

And as mentioned in my previous comment, anav13 is a special case since data from output2 cannot be read with the default DRS (our fault, not CMIP's!). You could also try:

CMIP5: /work/bd0854/DATA/ESMValTool2/download/cmip5/output2

@schlunma Why are the project and product facets hardcoded in the path? This should work fine if the DRS starts with {project.lower}/{product}/.. (i.e. the one called ESGF in config-developer.yml) and the rootpath is set to /work/bd0854/DATA/ESMValTool2/download.

bouweandela commented 1 year ago

Oh wait, is that because you're trying to combine data downloaded with the ESGF DRS with the DKRZ DRS and we don't support per path DRS settings? ESMValGroup/ESMValCore#129

schlunma commented 1 year ago

Exactly. We found that using the full paths with project and output for the downloaded data is currently the cleanest way to include the DKRZ and ESGF rootpaths.

bouweandela commented 1 year ago

Actually, the cleanest way to make it work is just to set download_dir: /work/bd0854/DATA/ESMValTool2/download/ (provided you have write access there) and run with offline: false and only use the 'official' rootpaths for the CMIP projects. This works because the tool will always check if it already has a file before downloading it. In this case you're lucky that the ESGF DRS is similar to the DKRZ one, so your approach works too. Anyway, I'll see if I can do something about https://github.com/ESMValGroup/ESMValCore/issues/129 for the next release.
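To make the rootpath discussion concrete, here is a minimal sketch showing that the hardcoded download rootpaths listed in the issue description are just the download root joined with the start of the ESGF DRS quoted above. The helper names expand and rootpath_with_prefix are hypothetical, and this is not how ESMValCore resolves paths internally; only the {project.lower}/{product} prefix is used because the rest of the DRS is elided in the thread:

```python
from pathlib import PurePosixPath

DOWNLOAD_ROOT = "/work/bd0854/DATA/ESMValTool2/download"
ESGF_PREFIX = "{project.lower}/{product}"  # only the part quoted in this thread


def expand(template: str, facets: dict) -> str:
    """Fill {facet} placeholders; 'project.lower' is the lower-cased project."""
    values = {**facets, "project.lower": facets["project"].lower()}
    path = template
    for key, value in values.items():
        # Plain string replacement: str.format would misread "project.lower"
        # as attribute access on a "project" argument.
        path = path.replace("{" + key + "}", value)
    return path


def rootpath_with_prefix(root: str, template: str, facets: dict) -> str:
    """Join the root with the expanded DRS prefix."""
    return str(PurePosixPath(root) / expand(template, facets))


print(rootpath_with_prefix(DOWNLOAD_ROOT, ESGF_PREFIX,
                           {"project": "CMIP5", "product": "output1"}))
```

With product set to output1 this reproduces the CMIP5 download entry from the config user file, and with output2 it gives the path suggested for anav13, which is why putting project and product into the rootpath works as a stand-in while per-path DRS settings are not supported.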

schlunma commented 1 year ago

We tried that too, but if I remember correctly there was a problem with this. Could be that it was an issue because at the beginning downloading was very slow (so there was no chance that slow recipes would run), which should be fixed now. I will give it another try sometime in the future.

It would be really great if you could do something about https://github.com/ESMValGroup/ESMValCore/issues/129 :rocket:

valeriupredoi commented 1 year ago

alright folks, as we have seen in https://github.com/ESMValGroup/ESMValTool/issues/2881#issuecomment-1294735465 we need to look at some recipes that have not had the same plots as in v2.6.0, these are 34 party poopers:

To quickly identify differing plots please have a look at this log https://esmvaltool.dkrz.de/shared/esmvaltool/compare270output_trimmed.txt
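A minimal sketch for pulling the list of differing recipes out of such a log. The line shape ("recipe_<name>.yml: results differ from reference run") is an assumption based on the single excerpt quoted elsewhere in this thread, and differing_recipes is a hypothetical helper, not part of compare.py:

```python
import re

# Assumed line shape, taken from the one log excerpt quoted in this thread.
DIFF_LINE = re.compile(r"^(recipe_\S+\.yml): results differ from reference run")


def differing_recipes(log_text: str) -> list[str]:
    """Return recipe file names flagged as differing, in order of appearance."""
    found = []
    for line in log_text.splitlines():
        match = DIFF_LINE.match(line.strip())
        if match:
            found.append(match.group(1))
    return found
```

Feeding it the text of the trimmed log should give one entry per recipe that needs a look; lines with any other wording are ignored, so the count is only as good as the assumed pattern.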

We can have a look at them in the run list for v2.7.0 https://esmvaltool.dkrz.de/shared/esmvaltool/v2.7.0/debug.html vs the v2.6.0 one https://esmvaltool.dkrz.de/shared/esmvaltool/v2.6.0/debug.html - I will start having a look myself but by all means, @ESMValGroup/esmvaltool-developmentteam I could really use a hand here, especially since you (as recipe maintainers/developers) know these things well; they're all beetles and bugs on coloured paper to me :grin:

bettina-gier commented 1 year ago

Is there a log file where we can see the differences? My recipe has a lot of plots and if just one of them differs as the list says it'd be easier to just look at that one =D

valeriupredoi commented 1 year ago

logfile coming right away - I'll post it in the comment above :beer:

valeriupredoi commented 1 year ago

before I post the log (currently curating it) to not lose you, Tina, here's the only bitty plot that differs for your recipe:

recipe_gier2020bg.yml: results differ from reference run
Reference run: /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4/recipe_gier2020bg_20220712_100159
Current run: /work/bd0854/b382109/v270/recipe_gier2020bg_20221025_142445
Differing files:

bettina-gier commented 1 year ago

You can pass that recipe, that's just a different sorting for the diff ensemble members in the histogram and looks diff cause they're not labeled. Cheers for the special extract for me ;)

valeriupredoi commented 1 year ago

@bettina-gier - legend, many thanks! :beer:

sloosvel commented 1 year ago

@valeriupredoi do you mind if I run recipe_climate_change_hotspot in jasmin, so that I can at least upload the results for this version?

valeriupredoi commented 1 year ago

@sloosvel go for it! Cheers :beer: - but am planning on releasing tonight, it looks promising. If you run it and then upload the results to the v2.7.0 dir that'd be awesome, and the release does not affect that :+1: It'd be great if you were around to approve the last PR (the one changing the version number) in an hour or so; no probs if not, I will ask @bouweandela

TomasTorsvik commented 1 year ago

@valeriupredoi for recipe_ocean_example, the only obvious difference seems to be the transect plots, Diag_Transect_1, Diag_Transect_2 and Diag_Transect_3. These plots are empty in v2.6.0 and non-empty in v2.7.0 (see bugfix #2858).

Diag_Transect_1 picks up the mask data 1.e20, but this is probably a separate issue.

valeriupredoi commented 1 year ago

@TomasTorsvik brilliant, many thanks for looking! And a positive difference too, thanks to your PR :beer:

Diag_Transect_1 picks up the mask data 1.e20, but this is probably a separate issue.

Would you be OK to open an issue about this, please? And tag @ledm so we can fix that in 2.8. Many thanks! :beer:

TomasTorsvik commented 1 year ago

@valeriupredoi the same applies for recipe_ocean_bgc: v2.7.0 has plots for Diag_Transect_No_Data and Diag_Transect_vs_Woa that are empty in v2.6.0. The other plots look OK to me.

valeriupredoi commented 1 year ago

Fantastic, cheers @TomasTorsvik :beer:

TomasTorsvik commented 1 year ago

@valeriupredoi I'm not sure, but it seems the difference in recipe_ocean_ice_extent can be connected with land/coastline borders. See e.g.

v2.6.0 : https://esmvaltool.dkrz.de/shared/esmvaltool/v2.6.0/recipe_ocean_ice_extent_20220712_100640/plots/diag_ice_SHS/Global_seaice_timeseries/diag_CMIP5_HadGEM2-CC_OImon_historical_r1i1p1_sic_timeseries_SHS_ice_extent_diag_ice_SHS_1989_2004_ortho_map_Fractionalcover_2003DJF_0.png

v2.7.0 : https://esmvaltool.dkrz.de/shared/esmvaltool/v2.7.0/recipe_ocean_ice_extent_20221025_143917/plots/diag_ice_SHS/Global_seaice_timeseries/diag_CMIP5_HadGEM2-CC_OImon_historical_r1i1p1_sic_timeseries_SHS_ice_extent_diag_ice_SHS_1989_2004_ortho_map_Fractionalcover_2003DJF_0.png

valeriupredoi commented 1 year ago

Eagle-eyed, man :eagle: - coastline contours are more pronounced in 2.7 - cartopy change most probably, but they look the same to me!

valeriupredoi commented 1 year ago

OK this concludes the release testing marathon! Good news is there are not many bad apples among the recipes, bad news is there are a couple - see https://github.com/ESMValGroup/ESMValTool/issues/2881#issuecomment-1295001054 - we found a couple of MAGICs project R recipes that look dubious and opened at least one issue about them, https://github.com/ESMValGroup/ESMValTool/issues/2890 - but since these recipes are unmaintained and the developers who wrote them have in the meantime left the institutes they're listed under etc, I am not going to hold the release for some Da Vinci Code-style tracking down; we need to think about what we do with such recipes.

Oh and the ocean recipes by @ledm need some TLC but he's told me this for a while now, we should get together one time and fix them, no major bugs, but old crap that needs updating.

I declare this Tool ready for release! Many thanks to all who helped during this testing process @sloosvel @remi-kazeroni @schlunma @bettina-gier @TomasTorsvik and @bouweandela of course :grin: :beers:

valeriupredoi commented 1 year ago

it's out and about! :beer: https://pypi.org/project/ESMValTool/2.7.0/

bouweandela commented 1 year ago

You can pass that recipe, that's just a different sorting for the diff ensemble members in the histogram and looks diff cause they're not labeled. Cheers for the special extract for me ;)

@bettina-gier Would it be possible to sort the ensemble members in a way that is stable between runs? With the upcoming more regular recipe testing that @ehogan et al are working on, as described in #2723, issues like this will keep popping up.
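A minimal sketch of one way to get an ordering that is stable between runs: sort the ensemble labels with a natural-sort key before plotting, so the result does not depend on the order in which files happen to be found. The key function ensemble_sort_key is hypothetical, the labels are assumed to look like r1i1p1, and the diagnostic in question may well be written in another language, so this only illustrates the idea:

```python
import re


def ensemble_sort_key(member: str) -> tuple:
    """Natural-sort key: numeric chunks compare as integers, so r2 sorts before r10."""
    key = []
    for chunk in re.findall(r"\d+|\D+", member):
        if chunk.isdigit():
            key.append((0, int(chunk), ""))
        else:
            key.append((1, 0, chunk))
    return tuple(key)


members = ["r10i1p1", "r2i1p1", "r1i1p1"]
print(sorted(members, key=ensemble_sort_key))
# → ['r1i1p1', 'r2i1p1', 'r10i1p1']
```

A plain alphabetical sort would also be stable between runs, but it puts r10i1p1 before r2i1p1; the natural-sort key keeps the order readable as well as reproducible.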