esgf2-us / metagrid

ESGF Search UI
https://metagrid.readthedocs.io/en/latest/
MIT License

Support python3 in wget scripts #545

Open rafa-guedes opened 1 year ago

rafa-guedes commented 1 year ago

Is your feature request related to a problem? Please describe

The wget scripts downloaded from the platform are written for python2 and fail on python3. It would be helpful to have an option to download python3 versions of the wget scripts.

Describe the solution you'd like

Change the following part of the script:

import sys
import re
import xml.etree.ElementTree as ET
import urllib2
openid = "$1"
username = "$2" or re.sub(".*/", "", openid)
e = ET.parse(urllib2.urlopen(openid))
servs = [el for el in e.getiterator() if el.tag.endswith("Service")]
for serv in servs:
    servinfo = dict([(re.sub(".*}", "", c.tag), c.text)
                     for c in serv.getchildren()])
    try:
        if servinfo["Type"].endswith("myproxy-service"):
            m = re.match("socket://(.*):(.*)", servinfo["URI"])
            if m:
                host = m.group(1)
                port = m.group(2)
                print "-s %s -p %s -l %s" % (host, port, username)
                break
    except KeyError:
        continue
else:
    sys.stderr.write("myproxy service could not be found\n")
    sys.exit(1)

Into:

import sys
import re
import xml.etree.ElementTree as ET
from urllib.request import urlopen
openid = "$1"
username = "$2" or re.sub(".*/", "", openid)
e = ET.parse(urlopen(openid))
servs = [el for el in e.iter() if el.tag.endswith("Service")]
for serv in servs:
    servinfo = {re.sub(r".*}", "", c.tag): c.text for c in serv}
    try:
        if servinfo["Type"].endswith("myproxy-service"):
            m = re.match(r"socket://(.*):(.*)", servinfo["URI"])
            if m:
                host = m.group(1)
                port = m.group(2)
                print("-s %s -p %s -l %s" % (host, port, username))
                break
    except KeyError:
        continue
else:
    sys.stderr.write("myproxy service could not be found\n")
    sys.exit(1)

Describe alternatives you've considered

The above changes work for me.
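Another option, if the scripts still need to run on hosts that only have python2, would be a version that works under both interpreters. The following is only a minimal sketch under that assumption, not the code the wget API ships: the function names `fetch_xrds` and `myproxy_args`, the sample XRDS document and the host `myproxy.example.org` are hypothetical, and the parsing is pulled into a function so it can be checked without a network call.

```python
import re
import xml.etree.ElementTree as ET

try:  # python3
    from urllib.request import urlopen
except ImportError:  # python2 fallback
    from urllib2 import urlopen


def fetch_xrds(openid):
    """Download the XRDS document published at the OpenID URL (hypothetical helper)."""
    return urlopen(openid).read()


def myproxy_args(xrds_text, openid, username=""):
    """Return '-s HOST -p PORT -l USER', or None if no myproxy service is listed."""
    # Same default as the original script: last path segment of the OpenID.
    username = username or re.sub(".*/", "", openid)
    root = ET.fromstring(xrds_text)
    for serv in root.iter():  # iter() replaces the removed getiterator()
        if not serv.tag.endswith("Service"):
            continue
        # Iterating the element replaces the removed getchildren().
        info = {re.sub(r".*}", "", c.tag): c.text for c in serv}
        if not (info.get("Type") or "").endswith("myproxy-service"):
            continue
        m = re.match(r"socket://(.*):(.*)", info.get("URI") or "")
        if m:
            return "-s %s -p %s -l %s" % (m.group(1), m.group(2), username)
    return None


# Hypothetical XRDS document, only for illustration.
SAMPLE = """<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <Service priority="10">
      <Type>urn:esg:security:myproxy-service</Type>
      <URI>socket://myproxy.example.org:7512</URI>
    </Service>
  </XRD>
</xrds:XRDS>"""

args = myproxy_args(SAMPLE, "https://example.org/esgf-idp/openid/alice")
```

In the embedded script, the caller would pass `fetch_xrds(openid)` as the first argument, print the result, and exit with status 1 when `None` comes back, as the original `else` branch does.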

sashakames commented 1 year ago

Hi @rafa-guedes, the wget API should produce a .sh script. Can you describe what you did in the Metagrid UI to obtain a Python script instead?

rafa-guedes commented 1 year ago

Hi @sashakames, I did get the bash script. However, inside the script there is python code embedded in the openid_to_myproxy_args bash function, and that code is written in python2 syntax. The bash scripts only work for me when I replace that code with the python3 equivalent shown above.

sashakames commented 1 year ago

@rafa-guedes thank you for clarifying that. Is this to download CORDEX data? For CMIP6 you should use -s to bypass the myproxy, or for CORDEX use -H. We are updating the wget API as a separate effort, but the update is not yet deployed.

sashakames commented 1 year ago

@rafa-guedes The site at https://esgf-dev1.llnl.gov/ uses the new wget script. You could try that instead.

sashakames commented 10 months ago

CORDEX data is now unrestricted. We can migrate to the updated wget API once it is more stably hosted.