ESMValGroup / ESMValCore

ESMValCore: A community tool for pre-processing data from Earth system models in CMIP and running analysis scripts.
https://www.esmvaltool.org
Apache License 2.0

Invalid ESGF query for CERES-EBAF datasets #2142

Closed rbeucher closed 1 year ago

rbeucher commented 1 year ago

Hi All,

So I came across an issue affecting quite a few recipes. When CERES-EBAF data are missing, the code sends an ESGF query with an empty time_frequency argument:

Starting new HTTPS connection (1): esgf.ceda.ac.uk:443
2023-06-08 07:19:48,809 UTC [176196] DEBUG https://esgf.ceda.ac.uk:443 "GET /esg-search/search?format=application%2Fsolr%2Bjson&limit=500&distrib=true&offset=0&type=File&project=obs4MIPs&source_id=CERES-EBAF&time_frequency=&variable=swcre HTTP/1.1" 400 832

This results in an invalid query.

    files = esgf_search_files(esgf_facets)
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/lib/python3.10/site-packages/esmvalcore/esgf/_search.py", line 150, in esgf_search_files
    results = _search_index_nodes(facets)
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/lib/python3.10/site-packages/esmvalcore/esgf/_search.py", line 119, in _search_index_nodes
    results = context.search(
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/lib/python3.10/site-packages/pyesgf/search/context.py", line 141, in search
    return ResultSet(sc, batch_size=batch_size)
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/lib/python3.10/site-packages/pyesgf/search/results.py", line 42, in __init__
    self.__get_batch(0)
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/lib/python3.10/site-packages/pyesgf/search/results.py", line 82, in __get_batch
    response = (self.context.connection
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/lib/python3.10/site-packages/pyesgf/search/connection.py", line 159, in send_search
    response = self._send_query('search', full_query)
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/lib/python3.10/site-packages/pyesgf/search/connection.py", line 210, in _send_query
    raise Exception("Invalid query parameter(s): %s" % content)
Exception: Invalid query parameter(s):
2023-06-08 07:19:49,563 UTC [176196] INFO
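
For anyone hitting the same traceback, a quick way to see which facet ends up empty is to check the facet dictionary before it is turned into a query string. The snippet below is only a rough sketch using the standard library: `find_empty_facets` is a hypothetical helper (not part of ESMValCore or esgf-pyclient), and the facet values are copied from the failing request in the log above.

```python
from urllib.parse import urlencode


def find_empty_facets(facets):
    """Return the names of facets whose value is None or a blank string.

    Hypothetical helper for illustration; not an ESMValCore function.
    """
    return sorted(
        name
        for name, value in facets.items()
        if value is None or (isinstance(value, str) and not value.strip())
    )


# Facets as they appear in the failing request from the log above.
facets = {
    "project": "obs4MIPs",
    "source_id": "CERES-EBAF",
    "time_frequency": "",
    "variable": "swcre",
}

# An empty value is encoded as "time_frequency=", which matches the
# query string seen in the rejected request.
query = urlencode(facets)
empty = find_empty_facets(facets)

print(query)
print("Empty facets:", empty)
```

If the list of empty facets is not empty, the facet mapping for that dataset is probably incomplete on the local side, so the configuration is worth checking before suspecting the ESGF index node.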

I know the CERES-EBAF observation dataset is affected by issue ESMValGroup/ESMValTool#2974 but this seems unrelated.

Recipes affected are:

recipe_autoassess_landsurface_surfrad recipe_clouds_ipcc recipe_cmug_h2o recipe_deangelis15nat recipe_flato13ipcc_figures_92_95 recipe_lauer13jclim recipe_perfmetrics_CMIP5 recipe_validation recipe_wenzel16jclim

valeriupredoi commented 1 year ago

many thanks for raising this @rbeucher - am gonna move this to ESMValCore since the bug needs a fix plopped in there :+1:

valeriupredoi commented 1 year ago

Hi again @rbeucher - the faceted search is populated correctly when I run a test of recipe_validation.yml with the latest development installation of both ESMValCore and ESMValTool (see below); are you using any special credentials to connect to ESGF? Also, could you please post the output of esmvaltool version here and also the output for conda list esgf? Cheers :beer:

2023-07-17 13:13:44,180 UTC [29443] DEBUG   Searching https://esgf.ceda.ac.uk/esg-search for datasets using facets={'project': 'obs4MIPs', 'source_id': 'CERES-EBAF', 'time_frequency': 'mon', 'variable': 'rsut'}
2023-07-17 13:13:44,180 UTC [29443] DEBUG   Initializing backend: None /home/valeriu/.esmvaltool/cache/pyesgf-search-results
2023-07-17 13:13:44,180 UTC [29443] DEBUG   Initialized SQLiteDict with serializer: SerializerPipeline(name=pickle, n_stages=2)
2023-07-17 13:13:44,180 UTC [29443] DEBUG   Opening connection to /home/valeriu/.esmvaltool/cache/pyesgf-search-results.sqlite:responses
2023-07-17 13:13:44,181 UTC [29443] DEBUG   Initialized SQLiteDict with serializer: None
2023-07-17 13:13:44,181 UTC [29443] DEBUG   Opening connection to /home/valeriu/.esmvaltool/cache/pyesgf-search-results.sqlite:redirects
2023-07-17 13:13:44,185 UTC [29443] DEBUG   Cache directives from request headers: CacheDirectives()
2023-07-17 13:13:44,185 UTC [29443] DEBUG   Pre-read cache checks: Passed
2023-07-17 13:13:44,186 UTC [29443] DEBUG   Post-read cache actions: CacheActions(expire_after=86400)
2023-07-17 13:13:44,186 UTC [29443] DEBUG   Closing backend connections
2023-07-17 13:13:44,187 UTC [29443] DEBUG   Correcting facet 'modeling_realm' from 'None' to 'atmos' for obs4MIPs.CERES-EBAF.v20160610.rsut_CERES-EBAF_L3B_Ed2-8_200003-201404.nc
2023-07-17 13:13:44,187 UTC [29443] DEBUG   Correcting facet 'version' from 'None' to 'v20160610' for obs4MIPs.CERES-EBAF.v20160610.rsut_CERES-EBAF_L3B_Ed2-8_200003-201404.nc
2023-07-17 13:13:44,187 UTC [29443] DEBUG   Found the following files matching facets {'project': 'obs4MIPs', 'source_id': 'CERES-EBAF', 'time_frequency': 'mon', 'variable': 'rsut'}: 
ESGFFile:obs4MIPs/CERES-EBAF/v20160610/rsut_CERES-EBAF_L3B_Ed2-8_200003-201404.nc on hosts ['dpesgf03.nccs.nasa.gov']
2023-07-17 13:13:44,187 UTC [29443] DEBUG   Selected files:
ESGFFile:obs4MIPs/CERES-EBAF/v20160610/rsut_CERES-EBAF_L3B_Ed2-8_200003-201404.nc on hosts ['dpesgf03.nccs.nasa.gov']
2023-07-17 13:13:44,188 UTC [29443] DEBUG   Using input files for variable rsut of dataset obs4MIPs:
/home/valeriu/climate_data/obs4MIPs/CERES-EBAF/v20160610/rsut_CERES-EBAF_L3B_Ed2-8_200003-201404.nc (will be downloaded)

rbeucher commented 1 year ago

Hi @valeriupredoi ,

Here are the details. Curious that esmvaltool version returns a dev version. I am working in a conda environment, installed via conda-forge.

Singularity> esmvaltool version
ESMValCore: 2.9.0
ESMValTool: 2.9.0.dev48+g7e0c5a5b6
Singularity> which esmvaltool
/g/data/xp65/public/apps/med_conda/envs/access-med-0.1/bin/esmvaltool
Singularity> conda list esmvaltool
# packages in environment at /g/data/xp65/public/apps/med_conda/envs/access-med-0.1:
#
# Name                    Version                   Build  Channel
esmvaltool                2.9.0              pyhd8ed1ab_0    conda-forge
esmvaltool-ncl            2.9.0                hd8ed1ab_0    conda-forge
esmvaltool-python         2.9.0              pyhd8ed1ab_0    conda-forge
esmvaltool-r              2.9.0                hd8ed1ab_0    conda-forge
Singularity> conda list esgf
# packages in environment at /g/data/xp65/public/apps/med_conda/envs/access-med-0.1:
#
# Name                    Version                   Build  Channel
esgf-pyclient             0.3.1              pyh1a96a4e_2    conda-forge
Singularity>

rbeucher commented 1 year ago

I am not using any specific credentials.

valeriupredoi commented 1 year ago

I am rather baffled - I have reproduced the install process you have (just in case there may be an issue with our deployed version - was dreading it haha) and all still works for me though - could you please find out why you have a dev version for esmvaltool? It shouldn't matter since all the esgf stuff is happening inside esmvaltool, but that's no bueno for a conda install to have any other version than stable 2.9.0. Can you maybe try and install esmvaltool yourself, w/o relying on a central install via a container, and rerun exactly this - I am using the recipe_validation.yml as my test bed :beer: Installing is easy (from conda), see instructions here https://docs.esmvaltool.org/en/latest/quickstart/installation.html#install-on-linux

valeriupredoi commented 1 year ago

installing the tool by yourself ensures that no code tampering happened (by whoever installed it on the system), and may expose what @remi-kazeroni mentions here https://github.com/ESMValGroup/ESMValTool/issues/3293#issuecomment-1642305543

rbeucher commented 1 year ago

Yes. I haven't been able to figure out why it is showing a Dev version. I'm using a conda environment inside a container. All installed by me... I'll keep investigating.

rbeucher commented 1 year ago

So I can confirm that I am using 2.9.0 not a Dev version. The version command was picking up a remnant of a Dev installation in my '.local' folder.
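
A leftover install in the user site directory (typically under '.local') can shadow the conda environment in this way. A minimal sketch for checking where a package would be imported from is shown below. `locate_package` and `is_under_user_site` are hypothetical helper names, and the package names checked are simply the ones discussed in this thread.

```python
import importlib.util
import os
import site


def locate_package(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return spec.origin


def is_under_user_site(path):
    """Return True if path lies inside the per-user site-packages directory."""
    user_site = os.path.realpath(site.getusersitepackages())
    return os.path.realpath(path).startswith(user_site + os.sep)


# Package names taken from this thread; adjust as needed.
for package in ("esmvalcore", "esmvaltool"):
    origin = locate_package(package)
    if origin is None:
        print(f"{package}: not importable from this interpreter")
    elif is_under_user_site(origin):
        print(f"{package}: loaded from the user site directory: {origin}")
    else:
        print(f"{package}: {origin}")
```

If a package resolves to the user site directory, running Python with the `-s` option or setting the `PYTHONNOUSERSITE` environment variable keeps that directory off `sys.path`, which may help confirm whether the stale install is the cause.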

valeriupredoi commented 1 year ago

that's not good ie having old installs/other dependencies still in path - makes me wonder what else is there in the path that may lead to a bit of a strange behaviour. Do you have access to an HPC perhaps? It'd be good to test there - or even a different machine where you plop a fresh installation from scratch, if you have such access - am afraid that, without being able to replicate the issue, I can't do much more, since this could be caused by things that I can't even think of - maybe @schlunma @remi-kazeroni or @bouweandela (when he's back from holidays) can help a bit more? :beer:

rbeucher commented 1 year ago

Yes I have reinstalled. It is a fresh conda installation and it's on HPC. We need to install a development version for debugging and development anyway so I'll get back to you when I find the root of the issue.

This is part of our effort to have a data pool and automated runs of the ESMValTool recipes at NCI / ACCESS-NRI. Hopefully we will have something similar to what you have at DKRZ soon and will be able to contribute.

rbeucher commented 1 year ago

@valeriupredoi It was indeed an issue with the DRS in the config-developer file. Thanks for pointing me toward this. I am closing this now.

valeriupredoi commented 1 year ago

oh man, big PHEW - was fast running out of ideas :grin: Great you figured it out and cheers muchly @remi-kazeroni for thinking of the possible hiccup! Welcome to ESMValTool (through brimstone and fire) :grin: Cheers for closing this!