varnishcache / varnish-cache

Varnish Cache source code repository
https://www.varnish-cache.org
Other
3.65k stars 375 forks source link

Empty bodies and IMS/INM/304 #3341

Closed rezan closed 1 year ago

rezan commented 4 years ago

I believe an optimization was made to not attempt IMS/INM requests when we have no body. This prevents a backend from doing a 304 response in such cases. Some backends, like S3, would consider this non-optimal. Its also not clearly documented anywhere and can trip people up, especially when writing VTCs.

Should we revert this optimization and go back to attempting IMS/INM on all bodies?

Original commit: https://github.com/varnishcache/varnish-cache/commit/9371b61d76ae1e653cd2168b827a0e104a0d968d

nigoroll commented 4 years ago

In my mind, not asking for a conditional response should, under all circumstances, be more efficient for a backend than making a useless conditional request. In other words, if any S3 implementation required clients to make conditional requests on empty objects for better performance, I would both be interested in a reference and claim that the S3 implementation was inefficient to begin with.

dridi commented 4 years ago

In my mind, not asking for a conditional response should, under all circumstances, be more efficient for a backend than making a useless conditional request.

I disagree, it might be cheaper to reply to a last-modified or etag request if they are for example indexed somehow on the backend side.

The 304 status is supposed to be a fast path so any sane server implementation making use of conditionals should (in my mind ;) be more efficient at treating the conditionals.

Should we revert this optimization and go back to attempting IMS/INM on all bodies?

A middle ground could be a feature flag, but that's more complicated.

I don't personally have an opinion, should we check what the RFCs have to say and put this under the #3246 compliance umbrella?

rezan commented 4 years ago

In my mind, not asking for a conditional response should, under all circumstances, be more efficient for a backend than making a useless conditional request.

It might be better if you could explain why serving a 200 is faster than a 304?

nigoroll commented 4 years ago

It might be better if you could explain why serving a 200 is faster than a 304?

I am not saying that. I am saying that for an empty response, not having to run the validation should be more efficient.

(Actually, I did kinda say that, sorry. "under all circumstances" should have read "...for an empty response". Sorry, just noticed this now)

nigoroll commented 4 years ago

I disagree, it might be cheaper to reply to a last-modified or etag request if they are for example indexed somehow on the backend side.

Not having to run the validation should be faster than running one no matter if indexed or not.

But anyway, I did not mean to blow this up. Again, a reference to actual implementation data and the reasons might help.

rezan commented 4 years ago

Not having to run the validation should be faster than running one no matter if indexed or not.

So you are talking about using strcmp() on 2 headers, right? So strcmp() has more overhead than delivering an object payload? Not buying this. Definitely not true for Varnish.

Again, a reference to actual implementation data and the reasons might help.

So your argument is that 200 responses are faster than 304s because you can skip a strcmp(). And you want a reference proving the opposite? Sure, take a look at Varnish, when we match an ETag, a whole lot of delivery logic is skipped. Would be trivial to benchmark this and show its faster.

If Etags are slow, then I would expect the backend to simply not use them. I would not expect a backend to give me one and then have me try to guess that this is a bad case to use them for reasons that don't quite make sense...

Maybe think of it like this. I give you an ETag and you sometimes dont use it. That's a bug in my book. If the backend doesn't want us to use an ETag, then they should stop using them.

nigoroll commented 4 years ago

I regret having engaged with this ticket in the first place. I asked for a reference why an S3 backend should be faster for 304s than 200 with no body and what I get is an angry response. @rezan the difference in varnish is setting sendbody = 0 here vs. some lines further down

rezan commented 4 years ago

I asked for a reference why an S3 backend should be faster for 304s than 200 with no body and what I get is an angry response.

Didn't give you an angry response, you made a very conclusive statement saying that 200 responses are faster than a 304, and then you asked for material that proves otherwise. Maybe your initial statement was angry?

Even if we look at the empty body usecase, I don't think the solution here is for us to survey the ecosystem of backends and figure out who has optimized this path and how well. Nothing in the RFC mentions anything about this kind of behavior. I have also confirmed with my contact that ETags are a highly optimized path in their hyperscale S3 service. This is why this issue exists!

Anyway, to put your original statement to bed, my understanding is that we have 3 solutions:

So given your original statement, I do not think the solution here is to disable a subset of conditional requests in core.

dridi commented 4 years ago

@rezan there's no point to further mansplaining your point to Nils, he won't get notifications. Better let this issue rest until next bugwash.

bsdphk commented 4 years ago

bugwash consensus: Go for backing out optimization. Document with 200/304 data for posterity.

nigoroll commented 4 years ago

FTR: My concern is that backends might send no body in error with the valid Etag/LM. In that case, the current code would avoid the error to be exposed for longer than the ttl.

rezan commented 4 years ago

I did 5 requests to AWS S3 for an empty file, no etag, average response time of 147ms:

> time curl -vs http://localhost:8888/EMPTY_FILE -H "Host: varnish-s3-test.s3-us-west-2.amazonaws.com" -H "profile: s3_west_ro"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8888 (#0)
> GET /EMPTY_FILE HTTP/1.1
> Host: varnish-s3-test.s3-us-west-2.amazonaws.com
> User-Agent: curl/7.58.0
> Accept: */*
> profile: s3_west_ro
> 
< HTTP/1.1 200 OK
< x-amz-id-2: 25DfBa8rFbbZreUXfdFwOayUuM2qBoTiwF9U31CO5XKmwdj722UaiW8PShdiA3ccYqm5C0lVYpw=
< x-amz-request-id: 8E08A78774C98648
< Date: Mon, 08 Jun 2020 16:31:35 GMT
< Last-Modified: Mon, 08 Jun 2020 16:25:53 GMT
< ETag: "d41d8cd98f00b204e9800998ecf8427e"
< Accept-Ranges: bytes
< Content-Type: binary/octet-stream
< Content-Length: 0
< Server: AmazonS3
< X-Varnish: 32893
< Age: 0
< Via: 1.1 varnish (Varnish/6.0)
< Connection: keep-alive
< 
* Connection #0 to host localhost left intact

real    0m0.203s
user    0m0.008s
sys 0m0.007s

With an Etag and 304 response, 115ms:

> time curl -vs http://localhost:8888/EMPTY_FILE -H "Host: varnish-s3-test.s3-us-west-2.amazonaws.com" -H "profile: s3_west_ro" -H "If-None-Match: \"d41d8cd98f00b204e9800998ecf8427e\""
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8888 (#0)
> GET /EMPTY_FILE HTTP/1.1
> Host: varnish-s3-test.s3-us-west-2.amazonaws.com
> User-Agent: curl/7.58.0
> Accept: */*
> profile: s3_west_ro
> If-None-Match: "d41d8cd98f00b204e9800998ecf8427e"
> 
< HTTP/1.1 304 Not Modified
< x-amz-id-2: 8DSJKu7lwOPVbOOk4fe7VTQ4/e5w0ZcQTMfr11qa61QOm2unI4clbt2Yz1lCmMz0FjshhTOtngs=
< x-amz-request-id: 914E32B6A4214718
< Date: Mon, 08 Jun 2020 16:35:16 GMT
< Last-Modified: Mon, 08 Jun 2020 16:25:53 GMT
< ETag: "d41d8cd98f00b204e9800998ecf8427e"
< Server: AmazonS3
< X-Varnish: 32899
< Age: 0
< Via: 1.1 varnish (Varnish/6.0)
< Connection: keep-alive
< 
* Connection #0 to host localhost left intact

real    0m0.102s
user    0m0.016s
sys 0m0.000s

My concern is that backends might send no body in error with the valid Etag/LM

For the S3 usecase, this would mean an invalid content length and the response would be an error. So the Etag would be dropped.

nigoroll commented 4 years ago

@rezan would it be possible to get the response times for a statistically relevant number, like 1000 requests? Also, could you clarify, is AWS pricing different based on the response status?

rezan commented 4 years ago

Also, could you clarify, is AWS pricing different based on the response status?

I don't think so.

rezan commented 4 years ago

I did 100 requests with no ETag, and 100 with an ETag:

No ETag: 81.0788ms ETag: 80.8085ms

https://filebin.varnish-software.com/lAbU0L39gZx3Jxin-reza-x1-g6

nigoroll commented 4 years ago

Thank you, @rezan . That would not appear to be a relevant difference, or would it?

rezan commented 4 years ago

Really hard to say much here. Looks like we are measuring network latency.

nigoroll commented 1 year ago

Bugwash discussed this ticket again: WONTFIX because no valid case was shown and we seem to all agree that a 304 for an empty body is pointless.