ESGF / esgf-pyclient

Search client for the ESGF Search API
https://esgf-pyclient.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

lm.logon timeout #85

Closed thomascrocker closed 2 years ago

thomascrocker commented 2 years ago

I'm attempting to use esgf-pyclient to help download some data, but am stuck logging on.

I have an OpenID account with CEDA: https://ceda.ac.uk/openid/Thomas.Crocker. My username for logging in at CEDA is tcrocker.

All my attempts to connect lead to: TimeoutError: [Errno 110] Connection timed out

I have tried:

>>> from pyesgf.logon import LogonManager
>>> lm = LogonManager()
>>> OPENID = 'https://ceda.ac.uk/openid/Thomas.Crocker'
>>> lm.logon_with_openid(openid=OPENID, password=None, bootstrap=True)
Enter myproxy username: tcrocker
Enter password for tcrocker: 

and

>>> proxyhost = 'esgf-index1.ceda.ac.uk'
>>> lm.logon(hostname=proxyhost, interactive=True, bootstrap=True)
Enter myproxy username: tcrocker
Enter password for tcrocker: 

and the same as above, but with proxyhost set to esgf.ceda.ac.uk.

Can anyone advise how to get this to work? I am based at the UK Met Office, so I wonder whether the problem could be related to our network firewall in some way.

bouweandela commented 2 years ago

Can you try with proxyhost = 'slcs.ceda.ac.uk'?

thomascrocker commented 2 years ago

> Can you try with proxyhost = 'slcs.ceda.ac.uk'?

Same problem, I'm afraid.

thomascrocker commented 2 years ago

Not sure if this is significant, but it might be worth pointing out what happens when I log in to the ESGF website at https://esgf-data.dkrz.de/login/ and select my CEDA OpenID. I am redirected to https://ceda.ac.uk/OpenID/Provider/server, where I enter my username and password. After submitting, I am redirected to a page (screenshot omitted) asking me to confirm approval for passing my credentials back, at which point I am finally redirected back to ESGF in a logged-in state. Could it be that the CEDA servers expect some sort of extra confirmation after receiving the username and password, before finally passing back the accepted authorisation?

bouweandela commented 2 years ago

As far as I understand it, your openid is provided by CEDA, so that's why you're taken to CEDA for logging in and then back to the website where you came from.

Maybe you could try a utility like tracepath or traceroute --resolve-hostnames to see if your network traffic gets stuck somewhere behind your firewall?
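
Alongside traceroute, a quick check is whether a plain TCP connection to the MyProxy port can be opened at all from the affected machine. Below is a minimal sketch of such a check. It assumes the logon goes to MyProxy's default port 7512 (the port is an assumption, since nothing in this thread states it), and `can_connect` is a hypothetical helper name, not part of esgf-pyclient.

```python
import socket


def can_connect(host, port=7512, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Port 7512 is assumed here as the MyProxy default; adjust if your
    provider documents a different one.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers timeouts, refused connections and DNS lookup failures.
        return False
```

Calling `can_connect('slcs.ceda.ac.uk')` from the machine that times out, and again from a machine where the logon works, should show whether the port is blocked in between. A `False` result only says the connection could not be opened, not which device dropped it.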

thomascrocker commented 2 years ago

> As far as I understand it, your openid is provided by CEDA, so that's why you're taken to CEDA for logging in and then back to the website where you came from.
>
> Maybe you could try a utility like tracepath or traceroute --resolve-hostnames to see if your network traffic gets stuck somewhere behind your firewall?

Thanks,

It looks like the traffic gets stuck very early on (see below).

I get a similar result if I run, for example, traceroute google.co.uk.

What should I be asking my IT admins to do? (Assuming they will agree to it...)

$ traceroute slcs.ceda.ac.uk
traceroute to slcs.ceda.ac.uk (130.246.130.75), 30 hops max, 60 byte packets
 1  10.152.51.252 (10.152.51.252)  0.735 ms 10.152.51.251 (10.152.51.251)  0.610 ms 10.152.51.252 (10.152.51.252)  0.769 ms
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

bouweandela commented 2 years ago

You could ask them to allow you to connect to the hostname you're trying to reach. If you use the --resolve-hostnames flag with traceroute you can see at which host your traffic stops (or at least which host still responds with the required diagnostic information).

thomascrocker commented 2 years ago

It looks like the version of traceroute I have installed doesn't have a --resolve-hostnames option. In any case I suspect the IPs being printed here are internal to our network.

I'll point them to this issue and see what response I get.

thomascrocker commented 2 years ago

Just a thought: assuming HTTPS is used to connect to the authentication server, I have an internal HTTPS proxy server that needs to be configured. It is set up in my browser, which would explain why logging in via the website works. Is there a way to set up esgf-pyclient to use it?

bouweandela commented 2 years ago

I don't know, but maybe @agstephens or @cehbrecht can tell you more?

soay commented 2 years ago

Maybe you could try setting the Linux system environment variables (http_proxy and HTTP_PROXY) to point to your proxy server. I'm not sure this works, but it might be worth a try.

thomascrocker commented 2 years ago

> Maybe you could try to set the linux system environment variables (http_proxy and HTTP_PROXY) to point to your proxy server. I'm not sure this works but might be worth a try.

Thanks for the suggestion. On my system http_proxy was set, but HTTP_PROXY was not. Unfortunately, setting HTTP_PROXY (which I confirmed in an interactive terminal with os.environ) didn't make a difference.
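
For anyone repeating this check, here is a minimal sketch that lists which proxy-related variables the Python process actually sees. The helper name `proxy_settings` is hypothetical, and the list of variable names is the usual convention (including https_proxy for HTTPS traffic), not something taken from esgf-pyclient. Whether the MyProxy logon honours any of these variables is not established in this thread; the result above suggests it may not.

```python
import os

# Conventional proxy variable names; both cases are checked because
# tools differ in which spelling they read.
PROXY_VARS = (
    "http_proxy", "HTTP_PROXY",
    "https_proxy", "HTTPS_PROXY",
    "no_proxy", "NO_PROXY",
)


def proxy_settings(environ=None):
    """Return the proxy-related variables that are set and non-empty.

    `environ` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if environ is None else environ
    return {name: env[name] for name in PROXY_VARS if env.get(name)}
```

Running `print(proxy_settings())` in the same session that calls `lm.logon(...)` confirms what is set for that process, which can differ from what a login shell reports.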

thomascrocker commented 2 years ago

Small update: I've got the package working as expected on JASMIN (and will probably download what I need there and then rsync it over to where I need it), so the issue definitely lies with my network setup. I'll close this for the time being.