ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

ESGF data not found #3737

Closed: rswamina closed this issue 1 month ago

rswamina commented 1 month ago

I am having some difficulty downloading daily pr data from ESGF nodes. I checked online and the data seems to be there. I am using the esmvaltool module on JASMIN. The recipe I used is below:

# ESMValTool
# recipe_test_ssp245_daily_pr.yml
---
documentation:
  description: |
    This is a recipe to download data sets from ESGF nodes and extract IPCC regions.

  authors:
    - swaminathan_ranjini

  title: |

    Recipe to download data from ESGF nodes and extract regions.

  maintainer:
    - swaminathan_ranjini

datasets: 

  - {dataset: UKESM1-0-LL, project: CMIP6, exp: ssp245, ensemble: r(1:4)1i1p1f2, start_year: 2081, end_year: 2100, grid: gn}

  - {dataset: UKESM1-0-LL, project: CMIP6, exp: ssp245, ensemble: r(8:10)i1p1f2, start_year: 2081, end_year: 2100, grid: gn}

preprocessors:
  preproc_extract_region_land_NCA:
    extract_shape:
      shapefile : IPCC-AR6-shapefiles/IPCC-WGI-reference-regions-v4.shp
      decomposed : False
      method : contains
      crop: True
      ids: 
        - 'N.Central-America'
    mask_landsea:
      mask_out : sea

diagnostics:
  day_pr_CIM:
    description: calculate annual means for region
    variables:
      pr:
        preprocessor: preproc_extract_region_land_NCA
        project: CMIP6
        mip: day
    scripts: null

The error message I get is :

2024-08-14 13:33:06,885 UTC [10927] INFO    Found input files for Dataset: pr, day, CMIP6, UKESM1-0-LL, ScenarioMIP, ssp245, r10i1p1f2, gn, v20210507, supplementaries: sftlf, fx, CMIP, piControl, r1i1p1f2, v20190705
2024-08-14 13:33:06,885 UTC [10927] ERROR   Could not create all tasks
2024-08-14 13:33:06,885 UTC [10927] ERROR   Missing data for preprocessor day_pr_CIM/pr:
- Missing data for Dataset: pr, day, CMIP6, UKESM1-0-LL, ScenarioMIP, ssp245, r21i1p1f2, gn, supplementaries: sftlf, fx, CMIP, piControl, r1i1p1f2
- Missing data for Dataset: pr, day, CMIP6, UKESM1-0-LL, ScenarioMIP, ssp245, r31i1p1f2, gn, supplementaries: sftlf, fx, CMIP, piControl, r1i1p1f2
- Missing data for Dataset: pr, day, CMIP6, UKESM1-0-LL, ScenarioMIP, ssp245, r41i1p1f2, gn, supplementaries: sftlf, fx, CMIP, piControl, r1i1p1f2
2024-08-14 13:33:07,323 UTC [10927] INFO    Maximum memory used (estimate): 0.3 GB
2024-08-14 13:33:07,324 UTC [10927] INFO    Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2024-08-14 13:33:07,325 UTC [10927] ERROR   Could not create all tasks
2024-08-14 13:33:07,326 UTC [10927] DEBUG   Stack trace for debugging:
Traceback (most recent call last):
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_main.py", line 518, in run
    fire.Fire(ESMValTool())
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_main.py", line 405, in run
    self._run(recipe, session)
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_main.py", line 447, in _run
    process_recipe(recipe_file=recipe, session=session)
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_main.py", line 127, in process_recipe
    recipe = read_recipe_file(recipe_file, session)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_recipe/recipe.py", line 73, in read_recipe_file
    return Recipe(raw_recipe, session, recipe_file=filename)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_recipe/recipe.py", line 720, in __init__
    self.tasks = self.initialize_tasks()
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_recipe/recipe.py", line 1046, in initialize_tasks
    tasks = self._create_tasks()
            ^^^^^^^^^^^^^^^^^^^^
  File "/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/site-packages/esmvalcore/_recipe/recipe.py", line 1034, in _create_tasks
    raise recipe_error
esmvalcore.exceptions.RecipeError: Could not create all tasks

I have included the following nodes in my esgfpy-client.yml file:

search_connection:
  urls:
    - 'https://esgf-data.dkrz.de/esg-search'
    - 'https://esgf-node.llnl.gov/esg-search'
    - 'https:/esgf-data1.llnl.gov/esg-search'
    - 'https://aims3.llnl.gov/esg-search'
    - 'https://esgf.ceda.ac.uk/esg-search'
    - 'https://esgf-node.ipsl.upmc.fr/esg-search'
    - 'https://esg-dn1.nsc.liu.se/esg-search'
    - 'https://esg-dn2.nsc.liu.se/esg-search'
    - 'https://esgf.nci.org.au/esg-search'
    - 'https://esgf.nccs.nasa.gov/esg-search'
    - 'https://esgdata.gfdl.noaa.gov/esg-search'
    - 'https://crd-esgf-drc.ec.gc.ca/esg-search'
    - 'https://esgf.bcs.es/esg-search'
distrib: true
timeout: 600  # seconds
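
As a side note on the list above, one entry is written `https:/esgf-data1.llnl.gov/esg-search`, with a single slash after the scheme. A minimal sketch of a sanity check for such entries, using only the Python standard library, could look like the following. The helper name `find_malformed_urls` is hypothetical and is not part of ESMValTool or esgf-pyclient, and only three of the listed URLs are reproduced here.

```python
from urllib.parse import urlparse


def find_malformed_urls(urls):
    """Return the URLs that lack an http(s) scheme or a host name.

    Hypothetical helper for illustration; not an ESMValTool function.
    """
    bad = []
    for url in urls:
        parts = urlparse(url)
        # A well-formed search URL needs both a scheme and a network location.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(url)
    return bad


# Three entries copied from the configuration above.
URLS = [
    "https://esgf-data.dkrz.de/esg-search",
    "https:/esgf-data1.llnl.gov/esg-search",
    "https://esgf.ceda.ac.uk/esg-search",
]

print(find_malformed_urls(URLS))
# ['https:/esgf-data1.llnl.gov/esg-search']
```

This only checks the shape of each URL; it does not tell you whether a node is reachable or holds the requested data.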

My understanding was that if we can see the files on the ESGF node website, we should be able to download them. Is that not right?

Thanks!

rswamina commented 1 month ago

Tagging @valeriupredoi and @jprb-walton for some clarity on this.

valeriupredoi commented 1 month ago

hmmm @bouweandela what do you reckon? @rswamina you're not using any sort of credentials to try to log in to any of the ESGF nodes, are you?

valeriupredoi commented 1 month ago

wait a second: are you really looking for r41? Dataset: pr, day, CMIP6, UKESM1-0-LL, ScenarioMIP, ssp245, r41i1p1f2, gn, supplementaries: sftlf, fx, CMIP, piControl, r1i1p1f2 - I don't think UKESM did 40+ r's, plus your syntax is asking for r(1:4)1i1p1f2, which to me means r14 is the maximum
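
To illustrate why the stray '1' matters, here is a minimal sketch of how a `(start:end)` range in an ensemble tag could be expanded. The function `expand_ensemble` is a hypothetical stand-in written for this thread, not ESMValCore's actual implementation, and it assumes at most one range per tag. With the recipe's `r(1:4)1i1p1f2` it yields the members named in the error log (r21, r31, r41), while `r(1:4)i1p1f2` yields r1 to r4.

```python
import re


def expand_ensemble(tag: str) -> list[str]:
    """Expand a single '(start:end)' range in an ensemble tag.

    Hypothetical helper for illustration; tags without a range are
    returned unchanged as a one-element list.
    """
    match = re.search(r"\((\d+):(\d+)\)", tag)
    if match is None:
        return [tag]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = tag[:match.start()], tag[match.end():]
    # Substitute each integer in the inclusive range for the '(a:b)' part.
    return [f"{prefix}{i}{suffix}" for i in range(start, end + 1)]


print(expand_ensemble("r(1:4)1i1p1f2"))
# ['r11i1p1f2', 'r21i1p1f2', 'r31i1p1f2', 'r41i1p1f2']

print(expand_ensemble("r(1:4)i1p1f2"))
# ['r1i1p1f2', 'r2i1p1f2', 'r3i1p1f2', 'r4i1p1f2']
```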

rswamina commented 1 month ago

Sorry, that is a typo, and I think the error says r21i1p1f2 is missing, which is not surprising. Let me rerun it. I have been looking so closely at this that I missed the '1' :(

rswamina commented 1 month ago

That worked, and no, I didn't use any credentials, or at least I don't think I set anything up. I have one further question before we close this issue. If different recipes request the same data, say exp: historical, start_year: 1990, end_year: 2014 from the same model, should the second and subsequent recipes reuse the data that was already downloaded to a previously stated location? In other words, will ESMValTool re-download or overwrite the data, or use what is already there?

valeriupredoi commented 1 month ago

no, if you set search_esgf: when_missing in your user config file, then it'll download data from ESGF only if the data is missing from the local pool that you point it to :+1:
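
The behaviour described above can be sketched as a small decision function. This is a simplified illustration, not ESMValCore code: `should_search_esgf` is a hypothetical name, the modes `never` and `always` are assumed here alongside the `when_missing` value mentioned in this thread, and "missing" is reduced to "no local files found", which may be coarser than what the tool actually checks.

```python
def should_search_esgf(mode: str, local_files: list[str]) -> bool:
    """Decide whether ESGF would be consulted for a dataset.

    Hypothetical helper for illustration. `mode` mimics the
    `search_esgf` setting; `local_files` is whatever was found
    in the local data pool for the requested dataset.
    """
    if mode == "never":
        return False
    if mode == "when_missing":
        # Only go to ESGF when nothing was found locally.
        return len(local_files) == 0
    if mode == "always":
        return True
    raise ValueError(f"Unknown search_esgf mode: {mode!r}")


# Data already in the local pool: no ESGF search, the local copy is reused.
print(should_search_esgf("when_missing", ["pr_day_UKESM1-0-LL.nc"]))  # False

# Nothing found locally: ESGF is searched and the data downloaded.
print(should_search_esgf("when_missing", []))  # True
```

So a second recipe asking for the same historical data would take the first branch of `when_missing` and reuse the files that the first recipe downloaded.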

rswamina commented 1 month ago

Cheers! Thanks, @valeriupredoi . I will close this issue now.

valeriupredoi commented 1 month ago

cheers, Ranjini :beer: