Open aeijdenberg opened 6 years ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/157832537
The labels on this github issue will be updated when the story is started.
Hey there @aeijdenberg,
Thanks for the request - I can understand how this could be a problem, but this is the first I'm hearing of people caching requests to the Cloud Controller. Can you expand on why this is done, as well as how often these caching problems come up?
@XenoPhex - as background we run HAProxy co-located and in front of our Gorouters.
Last week we experimented with enabling the recently added cache functionality in HAProxy, for the purpose of adding cacheability to content generated by tenant applications on the platform.
Since api.system.example.com requests also go through the same serving path, this is where we encountered issues when pushing a new version of an application: we could see from the logs that an action would be initiated, and then polled for completion.
We traced the root cause of the specific issue that we saw to a bug in HAProxy caching: it cached responses to requests that had an Authorization header (such as those sent by the cf CLI), which it should never do. They have now patched this and we should see the fix in the next point release: https://nvd.nist.gov/vuln/detail/CVE-2018-11469 if interested.
While that will resolve the issues we saw, in the process I spent more time reading RFCs and studying UAA and CF HTTP headers than I'd expected, and noted the following:
POST /oauth/token HTTP/1.1
Host: login.system.example.com
...
HTTP/1.1 200 OK
...
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Cache-Control: no-store
...
Expires: 0
Pragma: no-cache
Pragma: no-cache
GET /v2/spaces/xxx/summary HTTP/1.1
Host: api.system.example.com
Authorization: [PRIVATE DATA HIDDEN]
...
HTTP/1.1 200 OK
... no cache related headers present ...
RFC7234 section 4.2.2 allows a cache to calculate a "heuristic expiration time" for a response that doesn't include cache-related headers and "whose status codes are defined as cacheable by default", which a 200 OK is.
RFC7234 section 3.2 states that "a shared cache MUST NOT use a cached response to a request with an Authorization header".
While the section 3.2 rule above is enough to fix this particular case, in theory a client could be using a private cache (defined in RFC7234 section 1 as "A private cache, in contrast, is dedicated to a single user; often, they are deployed as a component of a user agent"), and such a private cache, as I understand it, would be within the RFC in serving a cached response to the polling GET requests as it sees fit.
As such it feels like we should be adding a Cache-Control: no-cache header to either the response from Cloud Controller, or to the request to it. Since in many cases the cf CLI is in fact deliberately polling and waiting for a response, it seemed to me that the request (where this is possible per RFC7234 section 5.2.1.4) might be the more appropriate place to explicitly ask for a non-cached response.
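To make the RFC reasoning above concrete, here is a rough Python sketch of the storage decision for the two exchanges shown earlier. It is a deliberately simplified reading of RFC7234 sections 3 and 3.2, not a complete implementation: it only handles GET, it ignores validation and freshness calculation, and the names `cache_directives`, `has_header` and `may_store` are made up for illustration.

```python
# Simplified sketch of the RFC7234 storage rules discussed above.
# All function names are hypothetical; this is not taken from HAProxy or any real cache.

# Status codes defined as cacheable by default (RFC7231 section 6.1).
HEURISTICALLY_CACHEABLE = frozenset(
    {200, 203, 204, 206, 300, 301, 404, 405, 410, 414, 501}
)


def cache_directives(headers):
    """Collect lower-cased Cache-Control directive names from (name, value) pairs.

    Headers are passed as a list of pairs so that repeated header lines,
    like the duplicated Cache-Control in the UAA response, are all seen.
    """
    found = set()
    for name, value in headers:
        if name.lower() != "cache-control":
            continue
        for part in value.split(","):
            token = part.strip().split("=", 1)[0].lower()
            if token:
                found.add(token)
    return found


def has_header(headers, wanted):
    """Case-insensitive check for the presence of a header."""
    return any(name.lower() == wanted.lower() for name, _ in headers)


def may_store(method, status, request_headers, response_headers, shared):
    """Return True if a cache would be permitted to store the response."""
    req_cc = cache_directives(request_headers)
    resp_cc = cache_directives(response_headers)
    if method != "GET":  # simplification: only GET is considered here
        return False
    if "no-store" in req_cc or "no-store" in resp_cc:
        return False
    if shared and "private" in resp_cc:
        return False
    # Section 3.2: a shared cache needs explicit permission from the
    # response before reusing it for a request carrying Authorization.
    if shared and has_header(request_headers, "Authorization"):
        if not resp_cc & {"public", "must-revalidate", "s-maxage"}:
            return False
    explicit = (
        has_header(response_headers, "Expires")
        or "max-age" in resp_cc
        or "public" in resp_cc
        or (shared and "s-maxage" in resp_cc)
    )
    # With no explicit lifetime, heuristic freshness is still allowed for
    # status codes that are cacheable by default (section 4.2.2).
    return explicit or status in HEURISTICALLY_CACHEABLE
```

With the Cloud Controller exchange above (an Authorization header on the request, no cache-related headers on the 200 response), `may_store(..., shared=True)` returns False while `may_store(..., shared=False)` returns True, which is the private-cache gap described here. The UAA response is rejected in both cases because of no-store.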
Hi @aeijdenberg, thanks for creating this feature request. I'm tagging the CC API PM and anchor (@ssisil @tcdowney) 1) to see if they've heard of this feature request before and 2) because we believe this functionality (if we choose to support it) should reside on the CAPI side.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed.
Some tasks performed by the CLI, such as those involved in cf push, involve starting a job, then polling for completion, or more generally for a change of state, until it's done. If an intermediate shared (or private) HTTP cache presents out-of-date results, this can result in unpredictable behaviour by the CLI.
Has adding a Cache-Control: no-cache header to requests generated by the CLI been considered? While I realize that a correctly operating shared cache should not normally cache responses to requests that have an Authorization header (such as those generated by the CLI), a private cache is permitted to cache responses to GET requests, so it might be good to be explicit that we're always looking for a non-cached result.
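As an illustration of what this request might look like on the client side, here is a minimal Python sketch of a polling loop that sends Cache-Control: no-cache on every request. It is not how the cf CLI is implemented: the function names, the "status" field, and the "finished" value are all assumptions made up for the example, and the fetch step is injectable so the loop can be exercised without a network.

```python
import json
import time
import urllib.request


def build_poll_request(url, token):
    """Build a GET request that asks any cache on the path to revalidate.

    Hypothetical helper: the token format and URL are placeholders.
    """
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", "bearer " + token)
    # RFC7234 section 5.2.1.4: a cache must not answer this request from
    # a stored response without successful validation on the origin server.
    request.add_header("Cache-Control", "no-cache")
    return request


def fetch_json(request):
    """Send the request and decode a JSON body (default fetch step)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_until_finished(url, token, fetch=fetch_json, attempts=10, delay=1.0):
    """Poll until the (assumed) "status" field reads "finished".

    Returns True on completion, False if the attempts run out.
    """
    for _ in range(attempts):
        body = fetch(build_poll_request(url, token))
        if body.get("status") == "finished":
            return True
        time.sleep(delay)
    return False
```

Putting the directive on the request rather than the response keeps the decision with the client that knows it is polling, at the cost of every such client having to remember to send it.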