ESGF / esgf-download

ESGF data transfer and replication tool
https://esgf.github.io/esgf-download/
BSD 3-Clause "New" or "Revised" License

Error when switching index nodes #42

Open meteorologist15 opened 6 months ago

meteorologist15 commented 6 months ago

Hello,

I recently attempted the following query:

esgpull search project:CMIP6 source_id:TaiESM1 variable:mc member_id:r1i1p1f1 experiment_id:amip

against both the "esgf-node.llnl.gov" and the "esgf-index1.ceda.ac.uk" ESGF nodes. With the API index node set to the latter, the search failed and the following "list index out of range" error appeared in the log:

Traceback (most recent call last):
  File "/net2/ker/anaconda3/envs/esgdownload/lib/python3.12/site-packages/esgpull/tui.py", line 164, in logging
    yield
  File "/net2/ker/anaconda3/envs/esgdownload/lib/python3.12/site-packages/esgpull/cli/search.py", line 177, in search
    results = esg.context.search(
              ^^^^^^^^^^^^^^^^^^^
  File "/net2/ker/anaconda3/envs/esgdownload/lib/python3.12/site-packages/esgpull/context.py", line 715, in search
    return fun(
           ^^^^
  File "/net2/ker/anaconda3/envs/esgdownload/lib/python3.12/site-packages/esgpull/context.py", line 628, in datasets
    results = self.prepare_search(
              ^^^^^^^^^^^^^^^^^^^^
  File "/net2/ker/anaconda3/envs/esgdownload/lib/python3.12/site-packages/esgpull/context.py", line 365, in prepare_search
    slices = _distribute_hits(
             ^^^^^^^^^^^^^^^^^
  File "/net2/ker/anaconda3/envs/esgdownload/lib/python3.12/site-packages/esgpull/context.py", line 235, in _distribute_hits
    offsets = _distribute_hits_impl(hits, offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net2/ker/anaconda3/envs/esgdownload/lib/python3.12/site-packages/esgpull/context.py", line 217, in _distribute_hits_impl
    accs[i] += steps[i]
    ~~~~^^^
IndexError: list index out of range

Any thoughts? Thanks.

svenrdz commented 5 months ago

Hi, sorry for the slow response; I am still working out how to balance my time with other projects.

I get the same error as you with the search command on both of the index node hosts you mentioned. The LLNL index URL now appears to redirect to the new Metagrid frontend. I don't know whether the search API still exists on their side, but it doesn't seem to be available on either "esgf-node.llnl.gov" or the new "aims2.llnl.gov": both return an HTTP error on the conventional /esg-search/search route described in the API documentation.
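The check described above can be sketched in a few lines of Python. This is only an illustration, not part of esgpull: the helper names (search_url, default_fetch, probe) are made up here, and the sketch assumes that an HTTP 200 on /esg-search/search with the standard limit and format query parameters is a good enough sign that the search API is served.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(host: str) -> str:
    """Build a minimal query against the conventional ESGF search route."""
    # limit=0 asks for no documents, which keeps the response small.
    params = {"limit": 0, "format": "application/solr+json"}
    return f"https://{host}/esg-search/search?{urlencode(params)}"


def default_fetch(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if the host is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code
    except (URLError, OSError):
        return 0


def probe(hosts, fetch=default_fetch):
    """Map each host to True when its search route answers with HTTP 200.

    Assumption: a 200 status is treated as "working"; a host that answers
    200 with a non-JSON page would need an extra check on the body.
    """
    return {host: fetch(search_url(host)) == 200 for host in hosts}
```

Calling probe(["esgf.ceda.ac.uk", "esgf-node.llnl.gov", "aims2.llnl.gov"]) performs the real requests; the fetch argument exists so the logic can be exercised without network access.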

The CEDA index works when the api.index_node value is set to "esgf.ceda.ac.uk".

I think we could improve the situation by providing a list of the index node URLs that currently work, perhaps somewhere in the documentation.

As an aside, there is already a way to fetch all index node URLs. However, since "esgf-node.llnl.gov" is still part of the response, I don't think it is a reliable way to know which indexes currently work:

$ esgpull search --hints index_node --distrib true
[
  {
    "index_node": {
      "esg-dn1.nsc.liu.se": 989671,
      "esgdata.gfdl.noaa.gov": 26838,
      "esgf-data.dkrz.de": 1474336,
      "esgf-node.ipsl.upmc.fr": 1722123,
      "esgf-node.llnl.gov": 8412524,
      "esgf.ceda.ac.uk": 1220556,
      "esgf.nci.org.au": 353542
    }
  }
]
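To turn that output into a list of candidate hosts to check by hand, the JSON can be parsed directly. The snippet below is a small sketch that assumes the output has exactly the shape shown above; index_nodes is a hypothetical helper name, not an esgpull function.

```python
import json

# Output of `esgpull search --hints index_node --distrib true`, as shown above.
HINTS = """
[
  {
    "index_node": {
      "esg-dn1.nsc.liu.se": 989671,
      "esgdata.gfdl.noaa.gov": 26838,
      "esgf-data.dkrz.de": 1474336,
      "esgf-node.ipsl.upmc.fr": 1722123,
      "esgf-node.llnl.gov": 8412524,
      "esgf.ceda.ac.uk": 1220556,
      "esgf.nci.org.au": 353542
    }
  }
]
"""


def index_nodes(hints_json: str) -> list[str]:
    """Return the sorted host names listed under "index_node" in the hints."""
    hosts = set()
    for entry in json.loads(hints_json):
        # Each entry maps a facet name to {value: count}; only the keys matter here.
        hosts.update(entry.get("index_node", {}))
    return sorted(hosts)


candidates = index_nodes(HINTS)
```

The resulting list still contains "esgf-node.llnl.gov", so each host has to be tested separately before it is used as api.index_node.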