wtsi-hgi / irobot

iRODS data brokerage service
GNU General Public License v3.0
0 stars 0 forks source link

Set default response content-type to `application/octet-stream` #10

Closed EWeatherill closed 7 years ago

EWeatherill commented 7 years ago

https://github.com/wtsi-hgi/irobot/blob/d318fed778e7ac32064b3f6c8c16e04eda4149b8/bissell/bissell.go#L113

@Xophmeister responded on the Dev Slack channel referencing: https://github.com/wtsi-hgi/irobot#using-the-accept-request-header. In essence, if a accept header isn't specified on a request, irobot should assume it's of type application-octetstream

jrandall commented 7 years ago

@Xophmeister ok, I used to have it that way but then I changed it in an attempt to comply with the following statement that it should otherwise return 406. I suggest changing the documentation to rely more on the HTTP standard. All that is required is to say that the two supported content types for data objects are application/octet-stream and application/vnd.irobot.metadata+json, that the default (if all quality factors are equal) is application/octet-stream, and that the server supports HTTP/1.1 content negotiation and will return a 406 status if negotiation fails.

Xophmeister commented 7 years ago

IMO, I believe it already says that:

The Accept request header is used productively to fetch an appropriate representation of the specific data object:

  • If it is omitted, or application/octet-stream is the primarily accepted media type, then the data (or ranges, thereof) for the specified data object will be returned, with checksums if available.
  • If the primarily accepted media type is application/vnd.irobot.metadata+json, then a JSON representation of the metadata (see below) for the specified data object will be returned.
  • Otherwise, a 406 Not Acceptable response will be returned.
jrandall commented 7 years ago

I am not arguing that it does not say that, I am just telling you what I thought it said, and that it would be much easier to understand if it explained a lot less. It also fails to mention support (or the lack thereof) for an accept header like Accept: application/* which I believe is not an optional part of parsing Accept headers according to the RFCs.

Xophmeister commented 7 years ago

Fair enough, but you ask for the documentation to be changed:

I suggest changing the documentation to rely more on the HTTP standard.

It's not clear to me how you would like it improved.

jrandall commented 7 years ago

also FWIW it seems like the metadata is really a different resource than the data object rather than really being an alternative representation, so the use of content negotiation for this purpose (instead of just using a different resource URI) is an unusual (and probably surprising) interface for an HTTP API, and would be pretty much impossible to guess without reading the docs.

Why not just have two separate resources /data/path/to/data/file.cram and /metadata/path/to/data/file.cram?

jrandall commented 7 years ago

I was suggesting to document this in much less detail by simply referring to RFC 2616 rather than attempting to (incompletely) explain how HTTP content negotiation works.

All that is required would be something like:

iRobot supports content negotiation based on the `Accept` header in the HTTP request. 
Currently, iRobot supports two content types for data objects:
 - `application/octet-stream`: 
        the actual data (or ranges, thereof) for the specified data object (with checksums
        included in response headers, if available)
 - `application/vnd.irobot.metadata+json`: 
        a JSON representation of the metadata (see below for details of format) associated 
        with the data object

If content negotiation fails, iRobot will respond with `406 Not Acceptable` response. 
Xophmeister commented 7 years ago

The original API was like that and I do agree with you that having very different representations at the same endpoint is a bit "exotic". It breaks the true purpose of content negotiation (i.e., representing the same information in different forms; say an image in JPEG or PNG). My justification is two-fold:

  1. iRobot is primarily used for getting data. The metadata representation is a "nice-to-have", so rather than giving it a special endpoint, I just bundle it together. If you really want the metadata, then reading the documentation doesn't seem like a huge expectation 😛
  2. It makes (again, in my opinion) the endpoint paths a bit cleaner. I personally didn't like them prefixed with /data/ and /metadata/ (there was also /checksums/ in the original, but that has been removed entirely and the checksums will now be inlined with the data response in the Content-MD5 header).

Regarding the documentation, I see your point, but I guess not everyone has read RFC2616 or RFC7231. I believe my documentation to be a bit more accessible than those. I can add a link to RFC7231 §5.3 as a shortcut for those in-the-know.