tus / tus-resumable-upload-protocol

Open Protocol for Resumable File Uploads
https://tus.io
MIT License
1.49k stars 102 forks source link

clarify Content-Type / Content-Encoding / Content-Language handling #160

Open nigoroll opened 4 years ago

nigoroll commented 4 years ago

see #158 for discussion

Acconut commented 4 years ago

I am not sure if adding recommendations about Content-Encoding and Content-Language headers is a good idea. If I understand correctly, these headers are mostly relevant for GET requests, which tus does neither mention nor use. Of course, if your server supports GET requests nevertheless, the Content-Encoding and Content-Language headers may also be included in HEAD responses but that in itself is not relevant for tus. Is it understandable what I mean?

The changes regarding metadata are good, but let's clarify the other point first.

nigoroll commented 4 years ago

(force-pushed only to resolve the merge conflict, no content changed)

re @Acconut regarding your interpretation that the HTTP metadata headers were specific to GET please see https://github.com/tus/tus-resumable-upload-protocol/issues/158#issuecomment-688232554 - they are specific to the resource and in the case of tus that is the uploaded object.

Regarding Content-Encoding, if a client chose to use it and the tus server simply ignored it, things would get terribly wrong. For example, if a client uploaded text/html with Content-Encoding: gzip and the server neither decoded the upload not stored the information about the encoding, the upload would be corrupt in the sense that the encoded form simply is not text/html.

Please note that RFC7231 explicitly considers uploads in this paragraph:

An origin server MAY respond with a status code of 415 (Unsupported Media Type) if a representation in the request message has a content coding that is not acceptable.

smatsson commented 4 years ago

The specific metadata keys documented herein are reserved for the respective use and MUST NOT be used for other purposes.

Note really a fan of this part at all. There are multiple implementations out there already and if they use any of the keys defined for anything else than this update says then they would be in violation with the spec.

In other parts I agree that this would be a good addition to the spec.

nigoroll commented 4 years ago

@smatsson would you have a better suggestion to reserving the metadata keys? My original suggestion was to make a breaking change and use Content-Type with the next tus major version.

smatsson commented 4 years ago

@nigoroll To be the devil's advocate: Why would we need to reserve metadata keys to begin with? The 1.0 version of protocol has been around since 2015 and haven't included any reserved keys. I think that the openness of not reserving metadata keys is a good thing as words have different meanings in different contexts. For example I would not translate "filetype" to Content-Type but rather to file extension ("It's a pdf"). I don't necessarily disagree with having reserved keys, it's just that it's 5 years to late. Breaking changes to a protocol that is out in the wild is not a good thing as issues will arise.

nigoroll commented 4 years ago

@smatsson we cannot have agreement on semantics without agreement on how to represent that semantics. My impression from the discussion in #158 was that the filename and filetype metadata keys were a de-facto standard because uppy/tusd use them. But you are right, no matter if we reserved metadata keys or brought back the original Content-Type semantics, both would be breaking changes and thus formally require a protocol major version bump. This PR was intended as a compromise trying to avoid that, but in fact I think that a new version with changed semantics would be the better option.

smatsson commented 4 years ago

@nigoroll I'm only concerned with the breaking changes and that the protocol will be ambiguous between implementations and when they were implemented (as the spec changed). As I said earlier I don't really object to the change per se, just the breaking change part. If you and @Acconut feel strongly that this should go in the spec it's fine by me.

Acconut commented 4 years ago

@nigoroll Would it be OK if I add some commits to this PR to show what I mean with some of my comments?

nigoroll commented 4 years ago

@Acconut sure

smatsson commented 4 years ago

Do we need to consider the Content-Encoding changing between requests? E.g. the file is created with Content-Encoding: gzip but the client then decides to not include it (and not encoding the content either) for subsequent requests?

Acconut commented 4 years ago

I adjusted the phrasing of the filename and filetype metadata keys to match the rest of the document and to also ensure that this is not a breaking change. Let me know what you think

@nigoroll There are also some questions from me above. Could you have a look at them when you have the time?