pydap / pydap

A Python library implementing the Data Access Protocol (DAP, aka OPeNDAP).
https://pydap.github.io/pydap/
MIT License

Can't get an authenticated connection to work to a THREDDS server #411

Open JimFluke opened 3 weeks ago

JimFluke commented 3 weeks ago

I am trying to use authentication credentials to connect to our TDS. I have tried embedding the credentials into the url, but I get this error:

url: https://fluke:d1ef3ce7e7c41de74192a362524ad0a460692a222d9dd796ee383b56e446d749%241%24d03ce0f88475505a68bd0eb37fa570df8120e59ccf62a4f580a55ad612f695c0e385893fe7205f7c181b221ab49bc817d4a33a2b2bb727fdc0ee3420e7e5b99e@gcin01.cira.colostate.edu/thredds/dodsC/cloudsat-data/2B-GEOPROF.P1_R05/2008/366/2008366031107_14239_CS_2B-GEOPROF_GRANULE_P1_R05_E02_F00.hdf

Traceback (most recent call last):
  File "/app/opendap_pydap_two.py", line 64, in <module>
    dataset = open_url(url)
              ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/client.py", line 68, in open_url
    handler = pydap.handlers.dap.DAPHandler(url, application, session, output_grid,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 71, in __init__
    self.make_dataset()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 96, in make_dataset
    self.dataset_from_dap2()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 109, in dataset_from_dap2
    pydap.net.raise_for_status(r)
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 38, in raise_for_status
    raise HTTPError(
webob.exc.HTTPError: 401 Unauthorized
<!doctype html><html lang="en"><head><title>HTTP Status 401 – Unauthorized</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 401 – Unauthorized</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Description</b> The request has not been applied to the target resource because it lacks valid authentication credentials for that resource.</p><hr class="line" /><h3>Apache Tomcat</h3></body></html>

But I understand this authentication method is from old documentation and will not work. So I have recently tried setting up a connection session:

url = 'https://gcin01.cira.colostate.edu/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf'

session = setup_session(username, password, check_url=url)
dataset = open_url(url, session=session, protocol='dap4')

With this result:

Traceback (most recent call last):
  File "/app/opendap_pydap.py", line 49, in <module>
    session = setup_session(username, password, check_url=url)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/cas/urs.py", line 25, in setup_session
    session = get_cookies.setup_session(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/cas/get_cookies.py", line 81, in setup_session
    response = soup_login(
               ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/cas/get_cookies.py", line 144, in soup_login
    soup = BeautifulSoup(resp.content, "lxml")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/bs4/__init__.py", line 250, in __init__
    raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

This is an HDF4-EOS file being accessed from a THREDDS server, so the problem described in issue #401 will probably show up, but only after the code gets past this authentication problem.

Thanks!

Mikejmnez commented 3 weeks ago

@JimFluke thanks for reporting this issue.

It looks like you are missing lxml. Can you run pip install lxml and try again? Hopefully it is just a dependency issue.

EDIT: I recently moved beautifulsoup4 and lxml to be installed as extra dependencies (and not as required dependencies) to make pydap more lightweight. This may have caused some trouble with authentication. I will investigate and report back.
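
Since beautifulsoup4 and lxml are described here as optional extras, a quick check like the one below can confirm whether both are importable before calling setup_session. This is a minimal sketch using only the standard library; the helper name missing_parsers is made up for illustration, and bs4 and lxml are the import names that appear in the traceback above.

```python
from importlib.util import find_spec

# Import names of the optional packages (bs4 is the import name of beautifulsoup4).
OPTIONAL_MODULES = ("bs4", "lxml")


def missing_parsers(modules=OPTIONAL_MODULES):
    """Return the names from `modules` that cannot be imported (hypothetical helper)."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_parsers()
    if missing:
        print("Missing optional packages:", ", ".join(missing))
    else:
        print("bs4 and lxml are both importable")
```

If the list is non-empty, installing the named packages with pip should clear the FeatureNotFound error, although, as the rest of this thread shows, it does not by itself fix authentication.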

Mikejmnez commented 3 weeks ago

Alternatively, you can try installing the complete server dependencies (as opposed to the minimal dependencies) via conda:

conda install pydap-server

Let me know if that works

JimFluke commented 3 weeks ago

@Mikejmnez This is what I get when I pip install the lxml package:

2024-11-07 21:23:36,466 INFO    __main__: url: https://gcin01.cira.colostate.edu/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf

/usr/local/lib/python3.11/site-packages/pydap/cas/get_cookies.py:129: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor.
  soup = BeautifulSoup(resp.content, "lxml")
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 199, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
OSError: [Errno 113] No route to host

I can get to the host with a browser from the same host I'm running the python script on, so I don't know why it's giving this error.

I'll try the conda install pydap-server method next.

JimFluke commented 3 weeks ago

But if I try the same thing with the dap2 protocol, it gives me this:

2024-11-07 22:08:49,782 INFO    __main__: url: https://gcin01.cira.colostate.edu/thredds/dodsC/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf

Traceback (most recent call last):
  File "/app/opendap_pydap.py", line 50, in <module>
    dataset = open_url(url, session=session, protocol=od_protocol)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/client.py", line 78, in open_url
    handler = pydap.handlers.dap.DAPHandler(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 98, in __init__
    self.make_dataset()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 134, in make_dataset
    self.dataset_from_dap2()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 178, in dataset_from_dap2
    raise_for_status(r)
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 37, in raise_for_status
    raise HTTPError(
webob.exc.HTTPError: 401 Unauthorized
<!doctype html><html lang="en"><head><title>HTTP Status 401 – Unauthorized</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 401 – Unauthorized</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Description</b> The request has not been applied to the target resource because it lacks valid authentication credentials for that resource.</p><hr class="line" /><h3>Apache Tomcat</h3></body></html>

Again, the authentication works through the browser, so I'm still confused.

ndp-opendap commented 3 weeks ago

By definition, an HTTP 401 Unauthorized response invites the client to resubmit the request with credentials, if the client has them. I wonder: if the server that pyDAP is accessing uses a single sign-on service for authentication, then the URL that returns the 401 may not be the URL of the DAP service:

https://gcin01.cira.colostate.edu/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf

But rather the URL of the authentication service.

I see that pretty frequently as an issue, but I don't know how pyDAP does it.
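
To make the 401 "invitation" concrete, here is a rough sketch of what a client does with it: read the scheme the server names in the WWW-Authenticate header, then resend the request with a matching Authorization header. It assumes the server challenges with HTTP Basic, which this thread does not confirm; the helper names and the realm string in the usage note are invented for illustration.

```python
import base64


def challenge_scheme(status, headers):
    """Return the auth scheme named by a 401 response, or None.

    `headers` is a plain dict here, so the key lookup is case-sensitive.
    """
    if status != 401:
        return None
    challenge = headers.get("WWW-Authenticate", "").strip()
    return challenge.split(" ", 1)[0] if challenge else None


def basic_auth_header(username, password):
    """Build the Authorization header a client would send for HTTP Basic."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

For example, challenge_scheme(401, {"WWW-Authenticate": 'Basic realm="example"'}) gives "Basic". With single sign-on the challenge or redirect may come from a different URL than the data URL, which is the situation described above.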

It might be that the auth service URL should be passed into this call:

session = setup_session(username, password, check_url=url)

@Mikejmnez ?.

JimFluke commented 3 weeks ago

@Mikejmnez When I try this with conda install pydap-server I get the same results - with both dap2 and dap4 - as with adding lxml to the pip install. I'll look into the "auth service URL" and see what I find. Thanks!

Mikejmnez commented 3 weeks ago

Thanks @JimFluke that was useful - lxml needs to be included, but overall that does not fix your issue.

As @ndp-opendap mentioned, we need to look at the auth process. I am not very familiar with this aspect, so I will need some time to look at it and test.

JimFluke commented 2 weeks ago

@Mikejmnez @ndp-opendap That worked! I was eventually able to figure out what the check_url should be set to:

https://gcin01.cira.colostate.edu/thredds/restrictedAccess/DPCData

in my case. I got this from looking at the tomcat localhost_access_log.* file for the URL it was accessing when I was logging in with the browser. I was expecting setup_session() to need my digested password since I have the server configured to use those, but it requires my undigested password instead.
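
For anyone landing here later, the recipe that worked can be sketched roughly as below. The import paths are the ones visible in the tracebacks above (pydap/cas/urs.py and pydap/client.py); the helper names dap2_url and open_restricted are hypothetical, and protocol="dap2" is assumed to be the DAP2 counterpart of the 'dap4' value used earlier.

```python
# Values taken from this thread; adjust for your own THREDDS server.
TDS_HOST = "https://gcin01.cira.colostate.edu"
CHECK_URL = f"{TDS_HOST}/thredds/restrictedAccess/DPCData"
DATA_PATH = (
    "cloudsat-data/2B-GEOPROF.P1_R05/2013/180/"
    "2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf"
)


def dap2_url(host, path):
    """Build the DAP2 (dodsC) URL for a dataset path on a THREDDS server."""
    return f"{host}/thredds/dodsC/{path}"


def open_restricted(username, password):
    """Authenticate against CHECK_URL, then open the dataset over DAP2.

    `password` is the plain (undigested) password. The pydap imports are
    kept inside the function so this module loads without pydap installed.
    """
    from pydap.cas.urs import setup_session
    from pydap.client import open_url

    session = setup_session(username, password, check_url=CHECK_URL)
    # protocol="dap2" is an assumption mirroring the 'dap4' value used above.
    return open_url(dap2_url(TDS_HOST, DATA_PATH), session=session, protocol="dap2")
```

The key point is that check_url is the restricted-access URL rather than the data URL.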

Thanks for all your help!

ndp-opendap commented 2 weeks ago

Nice work @JimFluke - It's a lot easier when the SSO is made a more visible part of the recipe. NASA's Earth Data Login requires similar invocation, but NASA makes a big deal about documenting EDL and how to use it.

Mikejmnez commented 2 weeks ago

@JimFluke Great news!

JimFluke commented 2 weeks ago

But, it only works with dap2. With dap4 I get the same No route to host error I got before.

ndp-opendap commented 2 weeks ago

But, it only works with dap2. With dap4 I get the same No route to host error I got before.

And this happens when you use:

url = 'https://gcin01.cira.colostate.edu/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf'

or

url = 'dap4://gcin01.cira.colostate.edu/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf'

???

Mikejmnez commented 2 weeks ago

@ndp-opendap Using dap4:// instead of https:// as the scheme is effectively the same as passing protocol='dap4' as an argument. Based on the original comment, I am certain that this is how it is being used:

url = 'https://gcin01.cira.colostate.edu/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf'

session = setup_session(username, password, check_url=url)
dataset = open_url(url, session=session, protocol='dap4')

It is very odd that you get two different behaviors depending on whether you use dap2 or dap4, because check_url = "https://gcin01.cira.colostate.edu/thredds/restrictedAccess/DPCData" contains no indication of dodsC or dap4. From pydap's perspective, the auth goes through the same function, independent of dap2 or dap4. I wonder if it may be a TDS thing, since the URLs for DAP2 and DAP4 differ (as opposed to Hyrax, where the URL to the data is exactly the same).

JimFluke commented 2 weeks ago

@ndp-opendap Substituting in dap4:// for the web protocol did not make any difference. Here is the full exception traceback:

2024-11-11 16:15:39,495 INFO    __main__: url: dap4://gcin01.cira.colostate.edu/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf

2024-11-11 16:15:39,495 INFO    __main__: check_url: https://gcin01.cira.colostate.edu/thredds/restrictedAccess/DPCData

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 199, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
OSError: [Errno 113] No route to host

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 495, in _make_request
    conn.request(
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 441, in request
    self.endheaders()
  File "/usr/local/lib/python3.11/http/client.py", line 1298, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.11/http/client.py", line 1058, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.11/http/client.py", line 996, in send
    self.connect()
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 279, in connect
    self.sock = self._new_conn()
                ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 214, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fab500e2310>: Failed to establish a new connection: [Errno 113] No route to host

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='gcin01.cira.colostate.edu', port=80): Max retries exceeded with url: /thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf.dmr (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fab500e2310>: Failed to establish a new connection: [Errno 113] No route to host'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/opendap_pydap.py", line 57, in <module>
    dataset = open_url(url, session=session, protocol=od_protocol)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/client.py", line 78, in open_url
    handler = pydap.handlers.dap.DAPHandler(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 98, in __init__
    self.make_dataset()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 132, in make_dataset
    self.dataset_from_dap4()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 148, in dataset_from_dap4
    r = GET(
        ^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 26, in GET
    response = follow_redirect(
               ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 78, in follow_redirect
    req = create_request(url, session=session, timeout=timeout, verify=verify)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 125, in create_request
    return create_request_from_session(url, session, timeout=timeout, verify=verify)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 139, in create_request_from_session
    session.head(url, allow_redirects=True, timeout=timeout, verify=verify)
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 624, in head
    return self.request("HEAD", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='gcin01.cira.colostate.edu', port=80): Max retries exceeded with url: /thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf.dmr (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fab500e2310>: Failed to establish a new connection: [Errno 113] No route to host'))

I know it's long, but I think I need to include it to show you something else that is confusing me. The top traceback - from the original exception - does not start with the open_url() call in my script. The last one does but not the first one. So maybe that's a clue?
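
On the traceback question: this looks like ordinary Python exception chaining rather than a clue about the failure. Each exception in a chain only records the frames between where it was raised and where it was caught, so the original OSError never reaches the script's open_url() frame; only the final re-raised exception does. A small self-contained sketch (the function names low_level and middle are made up to stand in for the urllib3 and requests layers):

```python
import traceback


def low_level():
    # Stands in for the socket connect that fails.
    raise OSError(113, "No route to host")


def middle():
    # Stands in for a library layer that wraps the low-level error.
    try:
        low_level()
    except OSError as err:
        raise ConnectionError("Failed to establish a new connection") from err


def chain(exc):
    """List exception type names from the outermost error to the original cause."""
    names = []
    while exc is not None:
        names.append(type(exc).__name__)
        exc = exc.__cause__ or exc.__context__
    return names


try:
    middle()
except ConnectionError as exc:
    caught = exc

# Frames stored on the original error stop at the handler that caught it.
inner_frames = [f.name for f in traceback.extract_tb(caught.__cause__.__traceback__)]
# Frames stored on the final error continue up to the calling code.
outer_frames = [f.name for f in traceback.extract_tb(caught.__traceback__)]
```

Here inner_frames contains only middle and low_level, while outer_frames starts at the calling code, which matches the shape of the traceback above.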

@Mikejmnez I don't know why the TDS would be different, but it sure seems like it is.

JimFluke commented 2 weeks ago

I have now managed to notice that it is trying to connect through port 80! For both 'https://' and 'dap4://'. When I specify port 443 it still doesn't work, but I get a different error:

2024-11-11 16:50:06,283 INFO    __main__: url: dap4://gcin01.cira.colostate.edu:443/thredds/dap4/cloudsat-data/2B-GEOPROF.P1_R05/2013/180/2013180111833_38146_CS_2B-GEOPROF_GRANULE_P1_R05_E06_F00.hdf

2024-11-11 16:50:06,283 INFO    __main__: check_url: https://gcin01.cira.colostate.edu:443/thredds/restrictedAccess/DPCData
2024-11-11 16:50:06,283 INFO    __main__: od_protocol: dap4

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 507, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/http/client.py", line 1395, in getresponse
    response.begin()
  File "/usr/local/lib/python3.11/http/client.py", line 325, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/http/client.py", line 286, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/socket.py", line 718, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/retry.py", line 474, in increment
    raise reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 507, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/http/client.py", line 1395, in getresponse
    response.begin()
  File "/usr/local/lib/python3.11/http/client.py", line 325, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/http/client.py", line 286, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/socket.py", line 718, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/opendap_pydap.py", line 59, in <module>
    dataset = open_url(url, session=session, protocol=od_protocol)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/client.py", line 78, in open_url
    handler = pydap.handlers.dap.DAPHandler(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 98, in __init__
    self.make_dataset()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 132, in make_dataset
    self.dataset_from_dap4()
  File "/usr/local/lib/python3.11/site-packages/pydap/handlers/dap.py", line 148, in dataset_from_dap4
    r = GET(
        ^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 26, in GET
    response = follow_redirect(
               ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 78, in follow_redirect
    req = create_request(url, session=session, timeout=timeout, verify=verify)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 125, in create_request
    return create_request_from_session(url, session, timeout=timeout, verify=verify)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pydap/net.py", line 139, in create_request_from_session
    session.head(url, allow_redirects=True, timeout=timeout, verify=verify)
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 624, in head
    return self.request("HEAD", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 682, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

And now I see that port 443 is being specified for the check_url. I'll see what happens without that.

JimFluke commented 2 weeks ago

Leaving out the port :443 string for the check_url does not make any difference.
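
The port behaviour can be checked without touching the network. The sketch below works out which port a plain HTTP or HTTPS client would pick for a URL. It assumes, without confirmation in this thread, that a dap4:// URL ends up being requested as http://, which would match the HTTPConnectionPool(..., port=80) in the traceback; the helper names are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes an HTTP client library knows about.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the explicit port in `url`, else the scheme default, else None."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


def with_scheme(url, scheme):
    """Return `url` with its scheme swapped (e.g. dap4 -> http)."""
    return urlsplit(url)._replace(scheme=scheme).geturl()


data_url = "dap4://gcin01.cira.colostate.edu/thredds/dap4/some/file.hdf"
as_http = with_scheme(data_url, "http")
print(effective_port(data_url), effective_port(as_http))
```

If that assumption holds, appending :443 would send unencrypted HTTP to a TLS port, which would be consistent with the "Connection reset by peer" error, though other causes are possible.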

JimFluke commented 6 days ago

@Mikejmnez @ndp-opendap When I went to upgrade from the thredds-docker:5.4 image to the 5.5 image I saw that we never had the dap4 service enabled, so it's no surprise that it did not work for me. Sorry for the red herring.

Note that adding dap4 does not work well for us. It ignores our authentication configuration, at least when using the website. And I still can't get it to work from Python.
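
In hindsight, a quick probe of both service endpoints might have exposed the missing dap4 service earlier. The sketch below builds the metadata URLs for each protocol (.dmr is what the traceback above shows being requested for DAP4, .dds is the DAP2 counterpart) and returns the raw HTTP status. The helper names are made up, and how to read the status codes is an assumption: a 401 suggests the path exists but is protected, whereas a 404 may indicate that the service is not mapped at all.

```python
import urllib.error
import urllib.request


def metadata_urls(host, path):
    """Return the DAP2 and DAP4 metadata URLs for a dataset path on a TDS."""
    return {
        "dap2": f"{host}/thredds/dodsC/{path}.dds",
        "dap4": f"{host}/thredds/dap4/{path}.dmr",
    }


def probe(url, timeout=10):
    """Return the HTTP status code for `url`, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a useful status code.
        return err.code
    except OSError:
        # Covers URLError, timeouts, and refused or unroutable connections.
        return None
```

Calling probe() on each value of metadata_urls(host, path) gives one status per protocol to compare.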

Mikejmnez commented 5 days ago

@JimFluke - thanks for the heads up. I have not had much time to look at the authentication issue with pydap and THREDDS. I think it makes sense to stick with DAP2 for now, as I have come to understand that THREDDS has focused more on DAP2 than DAP4 in the past. Full disclosure: neither @ndp-opendap nor I am well versed in THREDDS, so the THREDDS, DAP4 and authentication issue is taking us a bit of time. We are developers of Hyrax, the OPeNDAP server developed and maintained by OPeNDAP, Inc., and through many years of working with NASA, the Hyrax data server has focused more on DAP4.

That said, we are working closely with the Unidata folks to offer better pydap support for and access to THREDDS with DAP4.