psf / cachecontrol

The httplib2 caching algorithms packaged up for use with requests.

Does not support CacheControl: no-cache on responses #132

Open dstufft opened 8 years ago

dstufft commented 8 years ago

It appears that while this correctly handles Cache-Control: no-store on responses, it does not correctly handle Cache-Control: no-cache, so such responses get cached in some cases.

ionrock commented 8 years ago

@dstufft Did you have any other info on what sort of use case causes the invalid cache entries? I have a test for the basic use case, but I suspect that an ETag or some other header is mucking up the works.

dstufft commented 8 years ago

Well, it's not a problem with an invalid cache entry. The response simply has a Cache-Control: no-cache header (along with an Expires header), and CacheControl is ignoring the directive not to cache. If you look at https://github.com/ionrock/cachecontrol/blob/master/cachecontrol/controller.py#L258-L268 you can see it's handling no-store, but not no-cache.
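To illustrate the gap, here is a minimal standalone sketch. It is not CacheControl's actual code, and `parse_cache_control` and `may_store_checking_only_no_store` are hypothetical names. It shows a storage check that looks only at no-store, so a response carrying no-cache passes and is stored like any other.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a {directive: argument} dict.

    Simplified: splits on commas, so it does not handle quoted
    field-name lists such as no-cache="Set-Cookie, Date".
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store_checking_only_no_store(headers):
    """Hypothetical storage check that honors no-store and nothing else."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    return "no-store" not in cc


print(may_store_checking_only_no_store({"Cache-Control": "no-store"}))
print(may_store_checking_only_no_store({"Cache-Control": "no-cache"}))
```

The second call returns True, which is the behavior being reported: nothing in the check ever asks about no-cache.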

cryzed commented 8 years ago

Any update on this?

jaap3 commented 7 years ago

I ran into a problem that's caused by this. https://bitbucket.org/hpk42/devpi/issues/345/check-interoperability-with-simple-page

The issue is as follows:

  1. A server (in my case devpi) responds with Cache-Control: no-cache and an Expires: response date - 1 header.
  2. CacheControl caches this response because it provides an Expires header.
  3. The client (in my case pip) requests the resource again, sending a max-age directive in its Cache-Control request header.
  4. CacheControl finds a cached response. It's stale according to its Expires date, but since it's still within the max-age it's considered "fresh".
  5. The client uses the cached response and behaves unexpectedly (in my case pip doesn't find a package that was released to the devpi server some moments before)
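The steps above can be sketched roughly as follows. This is an assumption-laden illustration, not CacheControl's or pip's real logic: `is_fresh_naive` is a hypothetical function, and the dates and the 600-second max-age are made up for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_fresh_naive(response_headers, request_max_age, now):
    """Freshness check that lets a request max-age override Expires.

    Hypothetical logic mirroring steps 4 and 5: the response's
    Cache-Control: no-cache is never consulted.
    """
    date = parsedate_to_datetime(response_headers["Date"])
    expires = parsedate_to_datetime(response_headers["Expires"])
    age = (now - date).total_seconds()
    lifetime = (expires - date).total_seconds()
    if request_max_age is not None:
        lifetime = request_max_age  # the client's max-age replaces Expires
    return age < lifetime


# Step 1: the server answers with no-cache and Expires one second in the past.
served = datetime(2016, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
headers = {
    "Date": format_datetime(served, usegmt=True),
    "Expires": format_datetime(served - timedelta(seconds=1), usegmt=True),
    "Cache-Control": "no-cache",
}

# Steps 3 to 5: thirty seconds later the client asks again.
now = served + timedelta(seconds=30)
print(is_fresh_naive(headers, request_max_age=None, now=now))  # False
print(is_fresh_naive(headers, request_max_age=600, now=now))   # True
```

Without a request max-age the entry is correctly treated as stale. With one, the already-expired response is served from cache, even though the server said no-cache.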

I'm unsure about the semantics of the Cache-Control: no-cache header. Should it be handled the same as `no-store`?

OrangeDog commented 6 years ago

> I'm unsure about the semantics of the Cache-Control: no-cache header. Should it be handled the same as no-store?

No, no-cache is equivalent to max-age=0, must-revalidate, but it also allows listing specific header fields that must be revalidated. People assuming it means the same as no-store has led to a lot of broken cache implementations.


The "no-cache" response directive indicates that the response MUST NOT be used to satisfy a subsequent request without successful validation on the origin server. This allows an origin server to prevent a cache from using it to satisfy a request without contacting it, even by caches that have been configured to send stale responses.

If the no-cache response directive specifies one or more field-names, then a cache MAY use the response to satisfy a subsequent request, subject to any other restrictions on caching. However, any header fields in the response that have the field-name(s) listed MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. This allows an origin server to prevent the re-use of certain header fields in a response, while still allowing caching of the rest of the response.


The "no-store" response directive indicates that a cache MUST NOT store any part of either the immediate request or response. This directive applies to both private and shared caches. "MUST NOT store" in this context means that the cache MUST NOT intentionally store the information in non-volatile storage, and MUST make a best-effort attempt to remove the information from volatile storage as promptly as possible after forwarding it.
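As a rough sketch of how the quoted definitions differ in practice, the following classifies a response by its Cache-Control value. The names `parse_directives` and `cache_policy` and the tuple layout are hypothetical, chosen for illustration only. This is one reading of the quoted text, not a complete cache implementation.

```python
import re

# Matches directives such as: no-store, max-age=0, no-cache="Set-Cookie, Date"
_DIRECTIVE = re.compile(r'([A-Za-z0-9_-]+)(?:\s*=\s*(?:"([^"]*)"|([^,\s]*)))?')


def parse_directives(value):
    """Return {directive: argument or None}, keeping quoted lists intact."""
    return {
        m.group(1).lower(): m.group(2) if m.group(2) is not None else m.group(3)
        for m in _DIRECTIVE.finditer(value)
    }


def cache_policy(cache_control):
    """Classify a response as (may_store, must_revalidate, guarded_fields)."""
    cc = parse_directives(cache_control)
    if "no-store" in cc:
        # Nothing may be stored at all.
        return (False, False, [])
    if "no-cache" in cc:
        fields = cc["no-cache"]
        if fields:
            # Field-name form: the response may be reused, but the listed
            # header fields must not be sent without revalidation.
            names = [f.strip().lower() for f in fields.split(",") if f.strip()]
            return (True, False, names)
        # Bare no-cache: storing is fine, reuse requires validation first.
        return (True, True, [])
    return (True, False, [])


print(cache_policy("no-store"))
print(cache_policy("no-cache"))
print(cache_policy('no-cache="Set-Cookie, Date"'))
```

The point of the three-way split is that no-cache still permits storing the response. What it forbids is reusing it without first validating against the origin server, which is where conditional requests with an ETag or Last-Modified come in.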