lloyd / connect-etagify

etagify is connect middleware to add ETag headers to cachable but non-static content.
49 stars 6 forks source link

First request to etagified resource does not etagify. Second does. #1

Open sporkmonger opened 12 years ago

sporkmonger commented 12 years ago
$ curl -i http://localhost:5000/
HTTP/1.1 200 OK
X-Powered-By: Gerbils
Cache-Control: public, max-age=3600
Content-Type: text/html; charset=utf-8
Content-Length: 19
Connection: keep-alive

Well that's weird.
$ curl -i http://localhost:5000/
HTTP/1.1 200 OK
X-Powered-By: Gerbils
ETag: "2657ab1143a8795c9ffaec18a56f1f0c"
Cache-Control: public, max-age=3600
Content-Type: text/html; charset=utf-8
Content-Length: 19
Connection: keep-alive

Well that's weird.
lloyd commented 12 years ago

That's by design, the idea is that etagify lazy generates checksums for content, cause the content is dynamic.... Make sense?

--lloyd

On May 12, 2012, at 8:55 AM, Bob Amanreply@reply.github.com wrote:

$ curl -i http://localhost:5000/ HTTP/1.1 200 OK X-Powered-By: Gerbils Cache-Control: public, max-age=3600 Content-Type: text/html; charset=utf-8 Content-Length: 19 Connection: keep-alive

Well that's weird. $ curl -i http://localhost:5000/ HTTP/1.1 200 OK X-Powered-By: Gerbils ETag: "2657ab1143a8795c9ffaec18a56f1f0c" Cache-Control: public, max-age=3600 Content-Type: text/html; charset=utf-8 Content-Length: 19 Connection: keep-alive

Well that's weird.


Reply to this email directly or view it on GitHub: https://github.com/lloyd/connect-etagify/issues/1

sporkmonger commented 12 years ago

Right but you're not calculating the ETag for the previous response (at least, I hope not) so I'm not sure I understand why it's necessary to wait until the second response to generate the ETag. There's no technical reason for this as far as I can see. It's a minor quibble of course, because the overwhelming majority of the time, the ETag is going to be set, but I still really don't understand your rationale here.

lloyd commented 12 years ago

Sorry, that's what I get for trying to type on a telephone.

So an etag needs to be sent in headers before the content. In order to calculate and send an etag for the first request, I'd have to implement content buffering. I've done that before (see postprocess), and it's more code that I'd like to add to etagify. So in the tradeoff between keeping the code simpler and not sending etags on the first response, vs implementing buffering and always having etags, I chose the former.

I do hate that this is so unexpected, you're not the first to be confused by this behavior.

lloyd commented 12 years ago

gosh, re-reading the code, I think it'd be pretty trivial to buffer. I'm not sure I understand my original rationale anymore.

https://github.com/lloyd/connect-etagify/blob/372223e41451f2fd7a0799d26cd5ad33d91d45f7/etagify.js#L75-97

sporkmonger commented 12 years ago

Yeah, having now taken a closer look at the code, it looks like what it's doing is actually incorrect and will cause caching problems. Because you're attaching the ETag to the second response using the MD5 from the first response and doing a lookup for it by path, you have no guarantees that the content body you used to generate the ETag hasn't changed between those two requests.

lloyd commented 12 years ago

Check out the readme: "If you serve non-static content that is gauranteed not to change between server restarts in production, the etagify middleware makes it easy to serve this content with proper etag headers based on a hash of your content."

Let's have this bug track sending stag headers the first time content is requested

sporkmonger commented 12 years ago

Hmm, OK. That honestly sounds pretty limiting. Basically that works for static content that you're serving with something other than the normal static handler. Incidentally, that's actually what I'm doing, but still, that's pretty limiting.