Unidata / netcdf-java

The Unidata netcdf-java library
https://docs.unidata.ucar.edu/netcdf-java/current/userguide/index.html
BSD 3-Clause "New" or "Revised" License

Clean up DAP4 code plus minor mods to other -- non-dap4 -- code #1133

Closed DennisHeimbigner closed 1 year ago

DennisHeimbigner commented 1 year ago

Description of Changes

This PR makes major changes to the DAP4 code. It also makes some small but necessary changes to non-DAP4 code, which will be described below.

This PR depends upon PR https://github.com/Unidata/netcdf-java/pull/1091, and should not be merged before https://github.com/Unidata/netcdf-java/pull/1091.

DAP4 Changes

Note: This PR assumes that PR https://github.com/Unidata/netcdf-java/pull/1091 has been merged.

Non-DAP4 related Changes

TODO

PR Checklist

rschmunk commented 1 year ago

Got my fingers crossed that this deals with #985 enough that I can make first contact with a DAP4 server.

eigenbeam commented 1 year ago

I'm working on a NASA project that uses the netcdf-java library to retrieve data from a dataset served by the Earthdata OPeNDAP/Hyrax service. In order to do so, we need to pass along the Earthdata Login bearer token in the HTTP Authorization header. From what I can tell so far, this isn't possible at the moment. I was looking to add support for this and submit a PR when I noticed this PR.

So I'm wondering if you could provide me some direction on adding support for this (if indeed it's not possible to add the bearer token HTTP header in the current release). Since this PR does a major re-org of the DAP4 code, it seems like I should base my PR on the changes here, and wait until this PR is merged before submitting mine.

More importantly, I'm looking for some help in determining how to add support for this use case. I'm not very familiar with the codebase yet, so I'm still working to figure that out. Specifically, from an API point of view: how should a user of the API configure their token so that it is available when making HTTP requests for DAP4 datasets?

If this makes sense as a separate issue, discussion, and PR, I will open an issue and we can have the conversation there. If it makes sense to integrate this into this PR, we can keep it here. Just let me know what you would prefer.

Thank you!

DennisHeimbigner commented 1 year ago

Can you describe what HTTP headers are sent and what is their form for including the bearer token? Once I know that, I can advise where to insert it. It is most likely to be in netcdf-c/libdispatch/dhttp.c and netcdf-c/libdispatch/drc.c.

eigenbeam commented 1 year ago

Can you describe what HTTP headers are sent and what is their form for including the bearer token?

Yes, the Earthdata Login (EDL) token (which is generated via a call to an EDL endpoint) is a JWT and, once obtained, can be passed in the HTTP 'Authorization' header in the following format (where 'ABCD1234WXYZ0987' is the token value obtained from EDL):

Authorization: Bearer ABCD1234WXYZ0987

E.g., in a curl command it would be passed as:

--header 'authorization: Bearer ABCD1234WXYZ0987'
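
For comparison, the same header can be attached in Java. The following is a minimal sketch using the JDK's built-in `java.net.http` client (Java 11+), not the Apache HttpClient that netcdf-java's httpservices package uses. The class name `BearerHeaderSketch`, the helper `buildRequest`, and the `example.invalid` URL are assumptions made up for illustration. The request is only built, never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: shows the header format, not netcdf-java's own HTTP layer.
class BearerHeaderSketch {

    // Builds (but does not send) a GET request that carries the bearer token
    // in the Authorization header, in the form "Bearer <token>".
    static HttpRequest buildRequest(String url, String token) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL and the placeholder token value from the comment above.
        HttpRequest request =
                buildRequest("https://example.invalid/dataset", "ABCD1234WXYZ0987");
        // Prints the header value exactly as it would go on the wire.
        System.out.println(request.headers().firstValue("Authorization").orElse("<missing>"));
    }
}
```

HTTP header names are case-insensitive, so `Authorization` here and `authorization` in the curl command refer to the same header.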

eigenbeam commented 1 year ago

It is most likely to be in netcdf-c/libdispatch/dhttp.c and netcdf-c/libdispatch/drc.c.

Do you have an idea how the API user would provide this using the Java API? I'm new to the lib & codebase, so still trying to tie things together.

For example, if in my client code I'm getting some data using the URL:

https://opendap.uat.earthdata.nasa.gov/collections/C1241426907-NSIDC_CUAT/granules/ATL08_20200103074826_01160605_005_01.h5?dap4.ce=/gt3l_land_segments_latitude

My understanding is I would do:

ucar.nc2.NetcdfFile.open("dap4:https://opendap.uat.earthdata.nasa.gov/collections/C1241426907-NSIDC_CUAT/granules/ATL08_20200103074826_01160605_005_01.h5?dap4.ce=/gt3l_land_segments_latitude")

So the question is where would I provide the token to be used when the underlying HttpDSP class issues the request to the server? Would I precede this NetcdfFile.open() call by providing something in the httpservices package with the token, so that the HttpDSP could use it if available?

DennisHeimbigner commented 1 year ago

Sorry, I misread your message -- thought it was about the c library. As for Java, the corresponding code is in netcdf-java/httpservices. It used the java.net socket and authentication mechanisms as I recall.

eigenbeam commented 1 year ago

Sorry, I misread your message -- thought it was about the c library. As for Java, the corresponding code is in netcdf-java/httpservices. It used the java.net socket and authentication mechanisms as I recall.

No problem. The httpservices pkg primarily seems to use the Apache Http Client lib (v4.5.x). So going back to my previous post, I'd like to nail down what the API for the user should look like. If a user wants to make a request for a URL served by Earthdata OPeNDAP and they have an Earthdata Login (EDL) token that needs to be passed in the header when the OPeNDAP request is made, would this change be confined to the httpservices package then? I.e., would the user do something like:

HttpSession.setGlobalCredentialsProvider(myAuthHeaderProvider)

followed by

ucar.nc2.NetcdfFile.open("dap4:https://opendap.uat.earthdata.nasa.gov/collections/C1241426907-NSIDC_CUAT/granules/ATL08_20200103074826_01160605_005_01.h5?dap4.ce=/gt3l_land_segments_latitude")

If so, then I believe I can confine this change to the httpservices package and make it independent of this DAP4 refactor PR. On the other hand, if you think this should be done by changing something in the DAP4 packages that you are refactoring, then I'd want to coordinate with you on that so that my change is compatible with your refactoring (this PR).
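
To make the shape of such a hook concrete, here is a rough sketch of the "register once, apply to every request" pattern in plain Java. Everything in it is hypothetical: `GlobalAuthHeaderSketch`, `setGlobalAuthHeaderProvider`, and `headersForRequest` are invented names and are not part of netcdf-java or its httpservices package. A `Supplier` is used so that the token can be refreshed between requests rather than fixed at registration time.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names exist in netcdf-java.
class GlobalAuthHeaderSketch {

    // Process-wide provider of the Authorization header value; null means "none".
    private static volatile Supplier<String> authHeaderProvider;

    // What client code would call once, before opening any remote dataset.
    static void setGlobalAuthHeaderProvider(Supplier<String> provider) {
        authHeaderProvider = provider;
    }

    // What an HTTP layer could call just before sending each request.
    static Map<String, String> headersForRequest() {
        Map<String, String> headers = new LinkedHashMap<>();
        Supplier<String> provider = authHeaderProvider;
        if (provider != null) {
            String value = provider.get();
            if (value != null && !value.isEmpty()) {
                headers.put("Authorization", value);
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        // The supplier is re-evaluated per request, so a refreshed token is picked up.
        setGlobalAuthHeaderProvider(() -> "Bearer ABCD1234WXYZ0987");
        System.out.println(headersForRequest());
    }
}
```

With a hook of this kind living entirely in the HTTP layer, the subsequent `NetcdfFile.open(...)` call would not need to change.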

DennisHeimbigner commented 1 year ago

Httpservices is our attempt to isolate the http functionality we need. It is used by (at least) opendap and dap4. So the changes you want to make would, I expect, be completely independent of any dap4 (or opendap) changes.

eigenbeam commented 1 year ago

Httpservices is our attempt to isolate the http functionality we need. It is used by (at least) opendap and dap4. So the changes you want to make would, I expect, be completely independent of any dap4 (or opendap) changes.

Excellent. I'll open a separate issue and PR. Thanks!

DennisHeimbigner commented 1 year ago

Disabling because there is no way to fix the errors this causes on Jenkins.