ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

CMIP5 Omon CESM1 data on DKRZ has gone walkies 😕 #3693

Open ehogan opened 5 days ago

ehogan commented 5 days ago

The following recipes all passed during earlier testing of v2.11.0, see https://github.com/ESMValGroup/ESMValCore/issues/2421. However, they are now failing with missing data:

schlunma commented 5 days ago

Have you run the recipes with search_esgf=when_missing? For example, the data missing for recipe_wenzel14jgr.yml is in the shared download directory on Levante, which is currently only read when search_esgf=when_missing.

/work/bd0854/DATA/ESMValTool2/download/cmip5/output1/NSF-DOE-NCAR/CESM1-BGC/1pctCO2/mon/ocnBgchem/Omon/r1i1p1/v20121029/fgco2_Omon_CESM1-BGC_1pctCO2_r1i1p1_000101-014012.nc

ehogan commented 5 days ago

> Have you run the recipes with search_esgf=when_missing? For example, the data missing for recipe_wenzel14jgr.yml is in the shared download directory on Levante, which is currently only read when search_esgf=when_missing.

Yes 🙁

I can ls that path right now. Is that path mounted in some way? Could it have temporarily disappeared when the recipes were running? Perhaps I will try re-running the recipes ... 🤔

bouweandela commented 5 days ago

Maybe try running with search_esgf=always instead. The when_missing option will only search ESGF if no files are available at all for the dataset.
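
As a rough illustration of the difference between the options, the behaviour described in this comment can be sketched as a small decision function. This is only a simplified reading of the comment, not ESMValCore's actual code: `should_search_esgf` and `n_local_files` are hypothetical names, and the real tool may apply additional checks.

```python
def should_search_esgf(search_esgf: str, n_local_files: int) -> bool:
    """Decide whether ESGF would be queried for a dataset.

    Hypothetical helper (not part of ESMValCore) that mirrors the
    behaviour described in the comment above.
    """
    if search_esgf == "never":
        # ESGF is never queried, whatever is available locally.
        return False
    if search_esgf == "when_missing":
        # ESGF is queried only if no local files at all were found.
        return n_local_files == 0
    if search_esgf == "always":
        # ESGF is queried even when local files exist.
        return True
    raise ValueError(f"Unknown search_esgf value: {search_esgf!r}")


# One local file is enough to skip the ESGF search with when_missing,
# even if other files for the dataset are absent.
print(should_search_esgf("when_missing", n_local_files=1))
print(should_search_esgf("always", n_local_files=1))
```

Under this reading, a dataset with only some of its files available locally would not trigger an ESGF search with when_missing, which is why always is suggested here.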

ehogan commented 5 days ago

> Maybe try running with search_esgf=always instead. The when_missing option will only search ESGF if no files are available at all for the dataset.

Unfortunately this didn't work; I still see the same missing data failures 🙁

schlunma commented 5 days ago

I get the exact same error with the current main branches.

Funnily enough, when using

rootpath:
  CMIP5:
    /work/bd0854/DATA/ESMValTool2/download: ESGF

in the config file (this only works with the latest main, not the release branch), ESMValCore finds the data, but returns this error:

2024-06-27 11:11:33,746 UTC [2721773] ERROR   Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 533, in run
    fire.Fire(ESMValTool())
  File "/work/bd0854/b309141/miniforge3/envs/esm/lib/python3.11/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/work/bd0854/b309141/miniforge3/envs/esm/lib/python3.11/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/work/bd0854/b309141/miniforge3/envs/esm/lib/python3.11/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 413, in run
    self._run(recipe, session)
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 455, in _run
    process_recipe(recipe_file=recipe, session=session)
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_main.py", line 130, in process_recipe
    recipe.run()
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/recipe.py", line 1090, in run
    filled_recipe = self.write_filled_recipe()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/recipe.py", line 1128, in write_filled_recipe
    recipe = datasets_to_recipe(USED_DATASETS, self._raw_recipe)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 341, in datasets_to_recipe
    dataset_recipe = _datasets_to_recipe(datasets)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 83, in _datasets_to_recipe
    recipe = _move_datasets_up(recipe)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 92, in _move_datasets_up
    _move_one_level_up(diagnostic, 'variables', 'additional_datasets')
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 119, in _move_one_level_up
    dataset_mapping[name] = {
                            ^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 120, in <dictcomp>
    _to_frozen(ds): ds
    ^^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 105, in _to_frozen
    return tuple(sorted((k, _to_frozen(v)) for k, v in item.items()))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 105, in <genexpr>
    return tuple(sorted((k, _to_frozen(v)) for k, v in item.items()))
                            ^^^^^^^^^^^^^
  File "/home/b/b309141/repos/ESMValCore/esmvalcore/_recipe/from_datasets.py", line 103, in _to_frozen
    return tuple(sorted(_to_frozen(elem) for elem in item))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '<' not supported between instances of 'str' and 'tuple'
2024-06-27 11:11:33,763 UTC [2721773] INFO
If you have a question or need help, please start a new discussion on https://github.com/ESMValGroup/ESMValTool/discussions
If you suspect this is a bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues
To make it easier to find out what the problem is, please consider attaching the files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
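
For readers unfamiliar with this kind of TypeError, here is a minimal sketch of how it can arise. The two `return` lines are taken from the traceback above; the surrounding `isinstance` branching and the input values are assumptions made for illustration, so `to_frozen` is a simplified stand-in and not the actual ESMValCore helper.

```python
def to_frozen(item):
    """Convert nested lists/dicts into sorted, hashable tuples.

    Simplified stand-in for the private _to_frozen helper seen in the
    traceback; the isinstance checks are an assumption.
    """
    if isinstance(item, list):
        return tuple(sorted(to_frozen(elem) for elem in item))
    if isinstance(item, dict):
        return tuple(sorted((k, to_frozen(v)) for k, v in item.items()))
    return item


# Made-up input: a list that mixes a dict and a plain string.
# The dict is frozen into a tuple, the string stays a string, and
# sorted() then has to compare a str with a tuple.
mixed = [{"short_name": "tas"}, "fx"]

error_message = None
try:
    to_frozen(mixed)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Python 3 refuses to order values of unrelated types, so any list whose frozen elements are a mix of strings and tuples fails in `sorted()` this way.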

bouweandela commented 5 days ago

> Unfortunately this didn't work; I still see the same missing data failures 🙁

It looks like the data has just been deleted from ESGF, see e.g. this search: https://aims2.llnl.gov/search?project=CMIP5&activeFacets=%7B%22project%22%3A%22CMIP5%22%2C%22model%22%3A%22CESM1%28BGC%29%22%2C%22variable%22%3A%22msftmyz%22%2C%22ensemble%22%3A%22r1i1p1%22%2C%22cmor_table%22%3A%22Omon%22%7D Note there is no data for the historical experiment any more. If data is not on ESGF, esmvaltool cannot find it either, even if the files are in its download cache (i.e. the directory specified by download_dir in config-user.yml).

You could subscribe to the ESGF user mailing list and ask there: https://esgf.github.io/mailing-list.html

bouweandela commented 5 days ago

@schlunma The error you are getting looks like it is unrelated, would it make sense to open a new issue for that?

valeriupredoi commented 5 days ago

> @schlunma The error you are getting looks like it is unrelated, would it make sense to open a new issue for that?

Blast! Which reminds me, I should go check pyesgf for compatibility with NumPy 2.0 😨

valeriupredoi commented 5 days ago

OK, just checked JASMIN: the data is there in /badc/cmip5/data/cmip5/output1/NSF-DOE-NCAR/CESM1-BGC/1pctCO2/mon/ocean/Omon/r1i1p1/latest/ but there is no fgco2

schlunma commented 5 days ago

> @schlunma The error you are getting looks like it is unrelated, would it make sense to open a new issue for that?

See https://github.com/ESMValGroup/ESMValCore/issues/2466

ehogan commented 5 days ago

> Unfortunately this didn't work; I still see the same missing data failures 🙁

> It looks like the data has just been deleted from ESGF, see e.g. this search: https://aims2.llnl.gov/search?project=CMIP5&activeFacets=%7B%22project%22%3A%22CMIP5%22%2C%22model%22%3A%22CESM1%28BGC%29%22%2C%22variable%22%3A%22msftmyz%22%2C%22ensemble%22%3A%22r1i1p1%22%2C%22cmor_table%22%3A%22Omon%22%7D Note there is no data for the historical experiment any more. If data is not on ESGF, esmvaltool cannot find it either, even if the files are in its download cache (i.e. the directory specified by download_dir in config-user.yml).

> You could subscribe to the ESGF user mailing list and ask there: https://esgf.github.io/mailing-list.html

Do you think it has been removed from ESGF by accident? Thinking about the release, if it was accidental, how quickly would an issue like this be resolved? Would it be better to just add these recipes to the list of broken recipes for now? I'm keen not to hold up the release any further! 😒

valeriupredoi commented 5 days ago

Re-adding them on ESGF will take forever, and it won't happen - nobody wants to deal with CMIP5 data anymore. I say add this to the broken list, and carry on @ehogan 🍺

ehogan commented 5 days ago

> re-adding them on ESGF will take forever, and it won't happen - nobody wants to deal with CMIP5 data anymore. I say add this to the broken list, and carry on @ehogan 🍺

Thanks @valeriupredoi; added to #3662 👍