Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License

Problem communicating to TDS opendap #1425

Closed dopplershift closed 5 years ago

dopplershift commented 5 years ago

I've found a real head scratcher. I've run into problems accessing some radar data on a THREDDS instance running in Jetstream, configured to serve NEXRAD data from the AWS S3 archive. It manifests in Python as RuntimeError: NetCDF: Access failure, but I can reproduce it with ncdump:

ncdump -v Reflectivity_HI http://thredds-aws.unidata.ucar.edu/thredds/dodsC/nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06

produces

NetCDF: Access failure Location: file /usr/local/miniconda/conda-bld/libnetcdf_1559816324689/work/ncdump/vardata.c; line 478

This fails for me with ncdump in both 4.6.2 and 4.7.0. Here's where it gets weird--the above command works if we switch to Unidata's "local" TDS:

ncdump -v Reflectivity_HI http://thredds.ucar.edu/thredds/dodsC/nexrad/level2/KDIX/20190702/Level2_KDIX_20190702_1948.ar2v

yields pages and pages of output. Both files have identical SHA256 hashes, so the problem is not server-side file corruption.

One other data point: netCDF-Java (through ToolsUI) has no problem accessing the thredds-aws OPeNDAP URL and dumping the variable. Also, I have no problem accessing the data on thredds-aws using the CDMRemote protocol, so it would not appear to be a server issue. Both thredds-aws and thredds.ucar.edu are running the same version of the server.

For completeness' sake, I'm seeing this on macOS 10.14.5.
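
For reference, the same failure can be reproduced directly against the C API, without ncdump. Here is a minimal sketch (assuming a DAP-enabled netCDF-C build; the single-gate subset is illustrative, not the exact request ncdump issues):

/* Minimal reproduction sketch against the C API; mirrors what ncdump -v does. */
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
    const char *url =
        "http://thredds-aws.unidata.ucar.edu/thredds/dodsC/"
        "nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06";
    int ncid, varid, status;

    if ((status = nc_open(url, NC_NOWRITE, &ncid)) != NC_NOERR) {
        fprintf(stderr, "nc_open: %s\n", nc_strerror(status));
        return 1;
    }
    if ((status = nc_inq_varid(ncid, "Reflectivity_HI", &varid)) != NC_NOERR) {
        fprintf(stderr, "nc_inq_varid: %s\n", nc_strerror(status));
        return 1;
    }

    /* The "Access failure" appears when the library asks the server for a
     * subset of the variable's data. */
    size_t start[3] = {0, 0, 0}, count[3] = {1, 1, 1};
    unsigned char value;
    if ((status = nc_get_vara_ubyte(ncid, varid, start, count, &value)) != NC_NOERR)
        fprintf(stderr, "nc_get_vara_ubyte: %s\n", nc_strerror(status));
    else
        printf("first gate: %u\n", value);

    nc_close(ncid);
    return 0;
}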

dopplershift commented 5 years ago

Roping in @Unidata/thredds too. I feel like this is server-based, but the fact that netCDF-Java talks to the problematic server would seem to belie that.

WardF commented 5 years ago

Interesting; those are certainly conflicting symptoms. I’ll check if I can duplicate as soon as I can (working on phone at the moment) and will trace through to the precise error. I’ll follow up with the root of the error as soon as I discover it.

WardF commented 5 years ago

Duplicated with a self-compiled build on OSX. Interesting; in the morning I will compile with logging, pass some additional DAP parameters that Dennis added (after I look them up), and see what falls out. Thanks, @dopplershift, this is an odd one. I believe the root cause might be that the C library is being more stringent in some respects; 4.6.2 added some additional checks for various behaviors, which can largely be overridden. I don't know that that is the case here, however, so I will suss it out in the morning. Thanks!

WardF commented 5 years ago

I'm encountering the issue reading the .dods file. I'm still trying to get to the bottom of it, but thus far I can say that in the failing case the call to readpacket() at ocread.c:135 returns state->packet.length == 0, while the working case returns state->packet.length == 1992. I'll continue to dig, but wanted to make a note of this. Also, seeing if I can tag in @DennisHeimbigner this way.

dopplershift commented 5 years ago

I'll note that the only difference between the two .dods responses is:

nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06

versus

nexrad/level2/KDIX/20190702/Level2_KDIX_20190702_1948.ar2v

in the header. The one that works has a filename with an extension. That couldn't possibly be it?

WardF commented 5 years ago

Update: Using wget to check, I see the following:

I can use wget to fetch the entire .dods file from either server. I can also append ?Reflectivity to the query on either server and download just the Reflectivity variable. However, when I try to subset the variable, the request against thredds-aws fails:

Works

$ wget http://thredds.ucar.edu/thredds/dodsC/nexrad/level2/KDIX/20190702/Level2_KDIX_20190702_1948.ar2v.dods?Reflectivity%5fHI[0][0][0:1831]

Fails

$ wget http://thredds-aws.unidata.ucar.edu/thredds/dodsC/nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06.dods?Reflectivity%5fHI[0][0][0:1831]

WardF commented 5 years ago

Issue discovered: the square brackets appear to need to be encoded to work; by replacing ] and [ with %5D and %5B respectively, the subsetting query worked. Issue diagnosed; leaving the issue open while @dopplershift figures out how to address this. It appears to be out of our (C library) team's hands at this point.

lesserwhirls commented 5 years ago

So the root issue is that more and more web servers are following the standards for URL composition. For URIs, there are certain rules about when characters should be encoded, but they sort of leave it up to the type of URI to specify what those characters are. For URLs, the square bracket characters fall into the "unsafe" category and should always be encoded. Here is the relevant text from RFC 1738:

Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs. The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text; the quote mark (""") is used to delimit URLs in some systems. The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. The character "%" is unsafe because it is used for encodings of other characters. Other characters are unsafe because gateways and other transport agents are known to sometimes modify such characters. These characters are "{", "}", "|", "\", "^", "~", "[", "]", and "`".

All unsafe characters must always be encoded within a URL. For example, the character "#" must be encoded within URLs even in systems that do not normally deal with fragment or anchor identifiers, so that if the URL is copied into another system that does use them, it will not be necessary to change the URI and URL encoding.

In the past few years, the Tomcat servlet container started rejecting unencoded unsafe characters, and thus we get what we have here. As a side note, Apache web servers appear to happily accept some of these characters, so requests with unencoded square brackets to thredds.ucar.edu will work just fine, as we run Tomcat behind Apache.

We had a nice discussion over in netCDF-Java land about this, because at one point we encoded URLs, then we stopped encoding some parts and so things broke, but now we are fully encoding again. If netCDF-C is constructing URLs, then I would say it is on the library to make sure things are encoded properly. Curl might have an option to encode URLs for you, so it might be a simple fix, but I'm not sure (maybe curl_easy_escape(...)?)

It turns out there are some web servers that simply choke on encoded URLs, but they are by far in the minority. The approach we took in netCDF-Java was to encode URLs by default, but allow users to set a flag to turn off the encoding behaviour if they needed that.

WardF commented 5 years ago

Encoding in the C library shouldn't be an issue; curl_easy_escape was the first thing that occurred to me. Let me poke around on this and see if the fix is as easy as one would expect.
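
For the record, a minimal sketch of what that could look like--escaping only the constraint with curl_easy_escape() and appending it to the URL. The names and URL here are illustrative, not the actual libdap2 code:

/* Sketch: percent-encode a DAP2 constraint with curl_easy_escape() before
 * appending it to the dataset URL. curl_easy_escape() encodes every reserved
 * character, so it can only be applied to the constraint, never to the whole
 * URL (it would mangle "://" and "/"). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <curl/curl.h>

int main(void)
{
    const char *base = "http://thredds-aws.unidata.ucar.edu/thredds/dodsC/"
                       "nexrad/level2/S3/2019/07/02/KDIX/"
                       "KDIX20190702_194819_V06.dods";
    const char *constraint = "Reflectivity_HI[0][0][0:1831]";

    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    char *escaped = curl_easy_escape(curl, constraint, 0);
    if (!escaped) return 1;

    size_t len = strlen(base) + strlen(escaped) + 2;   /* '?' + NUL */
    char *url = malloc(len);
    snprintf(url, len, "%s?%s", base, escaped);
    printf("%s\n", url);   /* ...V06.dods?Reflectivity_HI%5B0%5D%5B0%5D%5B0%3A1831%5D */

    curl_free(escaped);
    free(url);
    curl_easy_cleanup(curl);
    return 0;
}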

WardF commented 5 years ago

It is not necessarily an easy fix just by encoding here; given that it also fails using wget, it seems a two-pronged approach may be necessary. I'll keep digging at it on my end, but sadly curl_easy_escape() was not a straightforward panacea.

dopplershift commented 5 years ago

Well, wget fails because it only takes the URL we give it--it doesn't do any encoding; that's up to the user, I believe. Here's the proper example in curl:

>curl -G -v "http://thredds-aws.unidata.ucar.edu/thredds/dodsC/nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06.dods?Reflectivity_HI[0][0][0:1831]"                                                 
*   Trying 149.165.170.40...
* TCP_NODELAY set
* Connected to thredds-aws.unidata.ucar.edu (149.165.170.40) port 80 (#0)
> GET /thredds/dodsC/nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06.dods?Reflectivity_HI[0][0][0:1831] HTTP/1.1
> Host: thredds-aws.unidata.ucar.edu
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 400 
< Transfer-Encoding: chunked
< Date: Wed, 03 Jul 2019 20:46:07 GMT
< Connection: close
< Server: Apache
< 
* Closing connection 0

Fails. This works:

>curl -G -v "http://thredds-aws.unidata.ucar.edu/thredds/dodsC/nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06.dods" --data-urlencode "Reflectivity_HI[0][0][0:1831]" 
*   Trying 149.165.170.40...
* TCP_NODELAY set
* Connected to thredds-aws.unidata.ucar.edu (149.165.170.40) port 80 (#0)
> GET /thredds/dodsC/nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06.dods?Reflectivity_HI%5B0%5D%5B0%5D%5B0%3A1831%5D HTTP/1.1
> Host: thredds-aws.unidata.ucar.edu
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 200 
< Access-Control-Allow-Origin: *
< XDODS-Server: opendap/3.7
< Content-Description: dods-data
< Content-Type: application/octet-stream
< Transfer-Encoding: chunked
< Date: Wed, 03 Jul 2019 20:45:59 GMT
< Server: Apache
< 
Dataset {
    Byte Reflectivity_HI[scanR_HI = 1][radialR_HI = 1][gateR_HI = 1832];
} nexrad/level2/S3/2019/07/02/KDIX/KDIX20190702_194819_V06;

Data:
Warning: Binary output can mess up your terminal. Use "--output -" to tell 
Warning: curl to output it to your terminal anyway, or consider "--output 
Warning: <FILE>" to save to a file.
* Failed writing body (0 != 1840)
* Failed writing data
* Closing connection 0

DennisHeimbigner commented 5 years ago

Turns out there is a flaw in the query parser for URLs in ncuri.c. I should be able to fix it in a couple of hours.

DennisHeimbigner commented 5 years ago

Below is the text for the commit (same as pull request text) for this fix. Attached is a zip file containing the modified files. Needs testing.

gh1425.zip

++++++++++++++++++++++++++++++

Fix encoding of a DAP2 constraint specified outside the URL.

re: github issue https://github.com/Unidata/netcdf-c/issues/1425

The 'ncdump -v' command causes a constraint to be sent to the opendap code (in libdap2). This is a separate path from specifying the constraint via a URL.

This separate path encoded its constraint using code that was independent of, and duplicative of, the code provided by ncuri.c, and this duplicate code did not properly encode the constraint, which might include square brackets.

The solution chosen here was to get rid of the duplicate code and ensure that all URL escaping is performed in the ncuribuild function in ncuri.c.

Also removed the use of the NEWESCAPE conditional in ncuri.c because it is no longer needed.
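
To illustrate the idea, a sketch of the kind of escaping being centralized (this is not the actual ncuri.c code; the helper name and unsafe-character set are illustrative): one routine percent-encodes unsafe characters, including the square brackets, when the constraint is folded back into the URL.

/* Illustrative sketch only: percent-encode RFC 1738 "unsafe" characters in
 * the query/constraint portion of a rebuilt URL. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const char *UNSAFE = "[]{}|\\^`<>\"#% ";

static char *escape_query(const char *s)
{
    /* Worst case: every character expands to "%XX". */
    char *out = malloc(3 * strlen(s) + 1);
    char *p = out;
    for (; *s; s++) {
        if (strchr(UNSAFE, *s))
            p += sprintf(p, "%%%02X", (unsigned char)*s);
        else
            *p++ = *s;
    }
    *p = '\0';
    return out;
}

int main(void)
{
    char *e = escape_query("Reflectivity_HI[0][0][0:1831]");
    printf("%s\n", e);   /* Reflectivity_HI%5B0%5D%5B0%5D%5B0:1831%5D */
    free(e);
    return 0;
}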

dopplershift commented 5 years ago

I’m confused, why isn’t this being done as a regular pull request?

julienchastang commented 5 years ago

Perhaps we can resume this discussion when I'm back on Monday. Related: https://github.com/Unidata/thredds-docker/issues/209

Also, a potentially relevant snippet to add to server.xml as a Tomcat-side workaround:

<Connector server="Apache" port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" relaxedQueryChars="[]:"/>

We already do this on

http://thredds-jetstream.unidata.ucar.edu/thredds/catalog.html

Not sure why this was not transferred over to

http://thredds-aws.unidata.ucar.edu/thredds/catalog.html

dopplershift commented 5 years ago

@julienchastang That would fix our server, yes, but it doesn't help netcdf-c with servers we don't control--servers that are not even misconfigured.

DennisHeimbigner commented 5 years ago

I think we need to bite the bullet and not use relaxedQueryChars.

DennisHeimbigner commented 5 years ago

Converted to pull request: https://github.com/Unidata/netcdf-c/pull/1439