IQSS / dataverse

Open source research data repository software
http://dataverse.org

Make the API handle header-only ("HEAD") requests correctly #5545

Open landreev opened 5 years ago

landreev commented 5 years ago

Most of our API calls appear to handle header-only requests correctly on success. For example, if a datafile can be successfully downloaded with /api/access/datafile, you can also issue a HEAD call and get back the expected headers:

$ curl -I http://localhost:8080/api/access/datafile/NNNNNN
HTTP/1.1 200 OK
Server: GlassFish Server Open Source Edition  4.1 
X-Powered-By: Servlet/3.1 JSP/2.3 (GlassFish Server Open Source Edition  4.1  Java/Oracle Corporation/1.8)
Set-Cookie: JSESSIONID=...; Path=/; HttpOnly
Access-Control-Allow-Origin: *
Content-disposition: attachment; filename="120768.txt"
Content-Length: 10
Content-Type: text/plain; name="120768.txt"
Date: Thu, 14 Feb 2019 20:17:21 GMT

However, if a datafile cannot be served for whatever reason, a HEAD request results in a 500 instead of the proper error code and headers. For example, a regular GET for a nonexistent file returns a 404 and a neat JSON error fragment:

$ curl http://localhost:8080/api/access/datafile/99999999
{"status":"ERROR","code":404,"message":"'/api/v1/access/datafile/131' endpoint does not exist on this server. Please check your code for typos, or consult our API guide at http://guides.dataverse.org."}

Trying a HEAD on the same URL:

$ curl -I http://localhost:8080/api/access/datafile/99999999
HTTP/1.1 500 Internal Server Error
Transfer-Encoding: chunked
Date: Thu, 14 Feb 2019 20:27:00 GMT
Connection: close

The exception stack trace:

 Exception processing ErrorPage[errorCode=404, location=/404.xhtml]
javax.servlet.ServletException: getOutputStream() has already been called for this response
    at javax.faces.webapp.FacesServlet.service(FacesServlet.java:659)
    at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1682)
...

This suggests that the problem has something to do with the redirect to the 404.xhtml page (?), which is weird, because we specifically keep a bunch of exception handlers that should prevent the 404s, 403s and other error codes from being redirected to the HTML error pages and should generate the compact JSON fragments instead, as in the GET example above. So it appears that these handlers do not work properly when we exit without generating any content...

This is not absolutely crucial to fix for any current practical purposes. But for the sake of compliance, we probably want HEADs to work, eventually.
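
For anyone who wants to check a given endpoint, here is a minimal sketch of a parity test, assuming curl is available and that a HEAD should report the same status code as the corresponding GET. The function names `parity_verdict` and `check_head_parity` are made up for this illustration and are not part of Dataverse.

```shell
#!/bin/sh
# Sketch only: check that HEAD returns the same status code as GET.
# parity_verdict and check_head_parity are hypothetical helper names.

# Compare two status codes; print a verdict, return non-zero on mismatch.
parity_verdict() {
    if [ "$1" = "$2" ]; then
        echo "OK: GET=$1 HEAD=$2"
    else
        echo "MISMATCH: GET=$1 HEAD=$2"
        return 1
    fi
}

# Fetch the status code for a GET and for a HEAD on the same URL.
# -s silences progress output, -o /dev/null discards body and headers,
# -w '%{http_code}' prints only the status code, -I sends a HEAD.
check_head_parity() {
    parity_url=$1
    get_code=$(curl -s -o /dev/null -w '%{http_code}' "$parity_url")
    head_code=$(curl -s -o /dev/null -I -w '%{http_code}' "$parity_url")
    parity_verdict "$get_code" "$head_code"
}

# Example, using the URL from this report:
#   check_head_parity http://localhost:8080/api/access/datafile/99999999
```

Given the responses shown above (404 for the GET, 500 for the HEAD), the check would be expected to report a mismatch for the nonexistent file and to pass once the error handlers cover header-only requests.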

pdurbin commented 2 years ago

Related:

cmbz commented 1 week ago

2024/09/30: Keeping for eventual prioritization. Probably a small investigation/fix, and is worthwhile but not highest priority. Omer will investigate.