mozilla-services / cliquet

CLIQUET IS NOW DEPRECATED use kinto.core instead
http://github.com/Kinto/kinto

Switch from Last-Modified to ETag headers? #251

Closed michielbdejong closed 9 years ago

michielbdejong commented 9 years ago

I think your Last-Modified headers are violating RFC 7232. According to this RFC, the value must use the long human-readable HTTP-date format (for example, Sun, 06 Nov 1994 08:49:37 GMT), chosen at a time when 1000 milliseconds was still a reasonable versioning granularity. :)

This means compliant HTTP clients would consider your Last-Modified headers to be malformed, possibly leading to unpredictable behavior.

It is also common for a browser or other HTTP client (or any proxy or CDN along the way) to insert its own If-Modified-Since header whenever it has a version of the document in cache, and there is no telling how a server implementation would react to that.

Also, violating RFCs is naughty. ;)

Google Drive, Dropbox, ownCloud-flavoured WebDAV, CouchDB, and remoteStorage all use ETag instead of Last-Modified. We used to have millisecond-timestamp Last-Modified values in the remoteStorage protocol as well, but switched to ETags because they do effectively the same thing while being treated as opaque strings by HTTP clients.

To make requests fail if a pre-existing resource changed:

To make requests fail if a pre-existing resource did not change:

To make requests fail if a presumed-absent resource exists:

In http responses:
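The four cases above map onto the standard conditional request headers (If-Match, If-None-Match, and If-None-Match: *), with the server sending an ETag header in its responses. Below is a minimal sketch of how a server could evaluate them. The function names are hypothetical and are not cliquet's API; the sketch assumes ETags are built by quoting the existing millisecond timestamp, ignores weak (W/) tags, and splits header lists on commas, which is only safe for tags that contain no commas.

```python
def make_etag(timestamp_ms):
    """Assumption: reuse the millisecond timestamp as an opaque, quoted ETag."""
    return '"%d"' % timestamp_ms


def parse_etags(header_value):
    """Split a comma-separated list of entity tags (simplified parsing)."""
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]


def matches(header_value, current_etag):
    """True if the header selects the current representation.

    current_etag is None when the resource does not exist.
    """
    if current_etag is None:
        return False
    if header_value.strip() == "*":
        return True
    return current_etag in parse_etags(header_value)


def check_preconditions(method, request_headers, current_etag):
    """Return a failure status code, or None if the request may proceed."""
    if_match = request_headers.get("If-Match")
    if_none_match = request_headers.get("If-None-Match")

    # If-Match: fail when the resource changed (or is gone).
    if if_match is not None and not matches(if_match, current_etag):
        return 412

    # If-None-Match: fail when the resource did not change,
    # or, with "*", when a presumed-absent resource exists.
    if if_none_match is not None and matches(if_none_match, current_etag):
        return 304 if method in ("GET", "HEAD") else 412

    # Both headers are optional: without them the request is unconditional.
    return None
```

For example, `check_preconditions("PUT", {"If-None-Match": "*"}, None)` lets a create-only request through, while the same request against an existing resource yields 412.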

The downsides:

leplatrem commented 9 years ago

Thanks for this detailed feedback!

That's indeed a point we were not very proud of. And @edas already gave us negative feedback about it :blush:

I do like your proposal [0], and cannot see any big impact from the downsides you mentioned. Let's see what @ametaireau and @Natim think about it.

Side notes:


[0] Actually, I was confused about ETags: I thought they had to be a checksum of the records, and didn't like them.

michielbdejong commented 9 years ago

Indeed, ETags can be any string you like. Some websites even use them to effectively cookie you without sending you a cookie: they send a different ETag to each visitor, and can then track you by looking at the If-None-Match header your browser sends. Very sneaky.

Yes, in the remoteStorage spec If-Match and If-None-Match are also optional. One reason is that when you do a GET for the first time, you don't have a last-seen version to refer to yet, so you need to retrieve the document unconditionally. The other reason is that they are optional in HTTP, so making them required would be considered 'subsetting HTTP', which is not best practice. For instance, in the latest versions of remoteStorage (not in earlier ones) we require servers to support all of HTTP/1.1 (including chunked-transfer uploads), so that we can say we're not subsetting HTTP.
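On the client side, the optional nature of If-None-Match can be sketched as follows. The helper name is hypothetical and only illustrates the first-GET case described above: with no last-seen version, the request is sent without any condition.

```python
def conditional_get_headers(last_seen_etag=None):
    """Build request headers for a GET.

    On the first fetch there is no last-seen ETag, so no condition is sent
    and the server returns the full document.
    """
    headers = {}
    if last_seen_etag is not None:
        # Ask the server to answer 304 Not Modified if nothing changed.
        headers["If-None-Match"] = last_seen_etag
    return headers
```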

It's indeed nice to keep sending a Last-Modified header, also for debugging, but then with the human-readable format, rounded off to whole seconds.
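A minimal sketch of producing such a value in Python, assuming the server currently stores its timestamps as integer milliseconds since the Unix epoch. The function name is hypothetical, and sub-second precision is truncated here rather than rounded.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date_from_ms(timestamp_ms):
    """Format a millisecond epoch timestamp as an HTTP-date (whole seconds, GMT)."""
    seconds = timestamp_ms // 1000  # drop the milliseconds
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # usegmt=True emits "GMT" instead of "+0000", as HTTP requires.
    return format_datetime(moment, usegmt=True)


print(http_date_from_ms(784111777123))  # Sun, 06 Nov 1994 08:49:37 GMT
```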

EDIT: Actually, I think the way the not-subsetting-HTTP argument works is that actual server implementations can always respond to a valid HTTP request with an "I don't want to" response code, but HTTP-based specs should not instruct them to.

almet commented 9 years ago

Hi @michielbdejong! It's a pleasure to see you commenting here.

It's great that we can benefit from your experience with remoteStorage here, and I don't see any downside in what you mentioned. Our current approach is fragile and we should indeed update our code to use what's standard.

I've been reading a bit more about ETags because obviously we are missing knowledge about that :) In particular, the Wikipedia page is handy.

Let's do it!

michielbdejong commented 9 years ago

\o/ :)