Closed lukenowak closed 3 years ago
With:
CONFIG proxy.config.http.negative_revalidating_enabled INT 0
I started getting 503 replies once the max-age time had passed, so the behaviour does seem to be affected by that option.
@lukenowak According to the source code, ATS can return a stale cached response if is_stale_cache_response_returnable() returns true. That function refers to proxy.config.http.cache.max_stale_age but does NOT refer to proxy.config.http.negative_revalidating_lifetime.
https://github.com/apache/trafficserver/blob/8.1.1/proxy/http/HttpTransact.cc#L5741
But I am not sure what the difference is between the purposes of these two variables.
The descriptions of these configs are ambiguous about when each one should be used.
proxy.config.http.cache.max_stale_age
The maximum age allowed for a stale response before it cannot be cached.
proxy.config.http.negative_revalidating_lifetime
How long, in seconds, to consider a stale cached document valid if proxy.config.http.negative_revalidating_enabled is enabled and Traffic Server receives a negative (5xx only) response from the origin server during revalidation.
As @fdiary pointed out, max_stale_age is used to check the age of cached contents.
https://github.com/apache/trafficserver/blob/08fe521a3974a05b01545c68fe1bcd5162dd4fc4/proxy/http/HttpTransact.cc#L5925
OTOH, negative_revalidating_lifetime is only used to set the Expires header of stale contents.
https://github.com/apache/trafficserver/blob/08fe521a3974a05b01545c68fe1bcd5162dd4fc4/proxy/http/HttpTransact.cc#L4350
If we take the doc literally, it seems like we should refer to negative_revalidating_lifetime too when negative_revalidating is enabled.
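To make the distinction concrete, here is a minimal Python sketch of the behaviour as it is described in the comments above. It is not ATS code: the `Settings` class and both function names are hypothetical, and any other checks the real is_stale_cache_response_returnable() performs are left out. The only point is that the "can we serve stale?" decision reads max_stale_age, while negative_revalidating_lifetime only feeds an Expires timestamp.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical stand-ins for the two records.config values."""
    max_stale_age: int                   # proxy.config.http.cache.max_stale_age
    negative_revalidating_lifetime: int  # proxy.config.http.negative_revalidating_lifetime


def stale_response_returnable(current_age: int, cfg: Settings) -> bool:
    """Simplified model of the age check: the lifetime setting is never read."""
    return current_age <= cfg.max_stale_age


def expires_on_stale_reply(now: int, cfg: Settings) -> int:
    """Simplified model of the only observed use of the lifetime setting."""
    return now + cfg.negative_revalidating_lifetime
```

In this model, changing negative_revalidating_lifetime from 0 to any other value never changes the result of `stale_response_returnable`, which matches the observation that only max_stale_age controls the time frame.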
I have some cases (note: I am using a version patched with https://github.com/apache/trafficserver/pull/7422):
negative_revalidating_enabled = 0
negative_revalidating_lifetime = 0
max_stale_age = 30
max-age < max_stale_age)
In this case, TrafficServer does not simulate "stale-if-error", as expected.
Setting negative_revalidating_lifetime to some value changes nothing.
negative_revalidating_enabled = 0
negative_revalidating_lifetime = 0
max_stale_age = 30
max-age < max_stale_age)
max_stale_age)
In this case, TrafficServer simulates "stale-if-error" a bit, but only when the backend is down (replies nothing).
Setting negative_revalidating_lifetime to some value changes nothing.
negative_revalidating_enabled = 1
negative_revalidating_lifetime = 0
max_stale_age = 30
max-age < max_stale_age)
max_stale_age)
In this case, TrafficServer simulates "stale-if-error", as expected. max_stale_age is the switch that controls the time frame of the feature.
Setting negative_revalidating_lifetime to some value affects the case of step 7, with the Expires header becoming Date + negative_revalidating_lifetime, but in step 9 there is no more Expires header.
negative_revalidating_enabled = 1
negative_revalidating_lifetime = 0
max_stale_age = 30
max-age < max_stale_age)
max_stale_age)
In this case, TrafficServer simulates "stale-if-error", as expected. max_stale_age is the switch that controls the time frame of the feature.
Setting negative_revalidating_lifetime to some value changes nothing.
So currently the documentation (in records.config) could state:
proxy.config.http.negative_revalidating_enabled
Add something like "proxy.config.http.cache.max_stale_age can be used to configure how long the stale response will be served while an available backend returns a 5xx response."
proxy.config.http.negative_revalidating_lifetime
"...during revalidation, by setting the Expires: header to the value of Date: + the value of the option"
proxy.config.http.cache.max_stale_age
"...cannot be cached. Is enabled by proxy.config.http.negative_revalidating_enabled"
Well, I am not sure about this; maybe such documentation should be kept in cache-basics as some kind of scenario. Anyway, this is how I understand and configure TrafficServer "negative revalidation" now.
@bneradt Can you please document this? Thank you.
I'm working on this and adding an AuTest. In case this is helpful to others, I'll make some initial observations.
My guess at the initial intention of this code was that setting the Expires header, observed earlier in this ticket, would set the freshness lifetime to the desired value. This doesn't work however, for at least three reasons:
First, the Expires header is set to the value of negative_revalidating_lifetime in the future from the current time.
If that actually were effective, then when proxy.config.http.negative_revalidating_enabled was enabled, every single cached response for a down server would always be considered fresh and served out of the cache, because its Expires time would always be in the future. This is clearly not the intention. Rather, for this calculation to work, we'd want to set the Expires value to the time past the calculated age of the resource, not the offset from the current time.
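A small Python sketch of why an Expires value anchored to the current time can never expire. The function names are hypothetical, times are plain integer seconds, and `expires_from_stale_point` is only one possible reading of "the time past the calculated age of the resource" (here: the moment the object went stale plus the configured lifetime), not what ATS does.

```python
def expires_from_now(now: int, negative_lifetime: int) -> int:
    """Expires anchored to the current time, as described in the comment."""
    return now + negative_lifetime


def expires_from_stale_point(response_time: int, freshness_lifetime: int,
                             negative_lifetime: int) -> int:
    """Assumed alternative: anchor the extension to when the object went stale."""
    return response_time + freshness_lifetime + negative_lifetime


def is_fresh_by_expires(now: int, expires: int) -> bool:
    """An object is fresh by Expires while the current time is before it."""
    return now < expires
```

With any positive lifetime, `is_fresh_by_expires(now, expires_from_now(now, lifetime))` is true for every `now`, because the deadline moves forward with each request. The anchored variant yields a fixed deadline that eventually passes.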
The second reason this doesn't work is that, per RFC 7234 section 4.2.1, the max-age directive takes precedence over the Expires header field. Thus in all responses that have max-age, the Expires header would never even be inspected. Notice that @lukenowak's examples all use max-age, and indeed most tests would, because that's the easiest way to test freshness behavior. (Fixing this by setting the max-age directive, however, would not be a sufficient fix because of the following paragraph.)
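The precedence rule can be illustrated with a short Python sketch of the RFC 7234 section 4.2.1 ordering for a shared cache: s-maxage, then max-age, then Expires minus Date. This is a simplified illustration, not ATS's implementation. The `freshness_lifetime` function is hypothetical, it assumes headers arrive as a dict with exactly these key spellings, and it ignores heuristic freshness.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(headers: dict) -> Optional[int]:
    """Return the freshness lifetime in seconds, or None if none is given."""
    directives = {}
    for part in headers.get("Cache-Control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    # Cache-Control directives win over Expires whenever they are present.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            return int(directives[name])
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return int((expires - date).total_seconds())
    return None
```

For a response carrying both `Cache-Control: max-age=11, public` and an Expires header, the Expires branch is never reached, so rewriting Expires has no effect on the result.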
Thirdly, by the time the Expires header is set in the code, we've already completed freshness calculations. Otherwise we wouldn't be in the code determining whether the stale response can be served. Recall that the negative revalidating feature influences behavior for stale cached responses in which the origin is unreachable. Thus at this point, setting the Expires header is too late to influence ATS's freshness calculation. We would have to set it then re-perform the freshness calculation. This is not done.
In order to make proxy.config.http.negative_revalidating_lifetime behave like it is documented, the three above issues would need to be addressed.
For the sake of completeness, if that were done, the intention is that things should work like this:
proxy.config.http.negative_revalidating_enabled is enabled.
proxy.config.http.negative_revalidating_lifetime. If the resource is fresh enough given this adjustment, reply with it.
max_stale_age. If its staleness is less than max_stale_age it will reply with it, otherwise it will return a 5xx response.

For now, with the bug described in this ticket, step 6 is broken and never results in the object's freshness being recalculated. Thus a user can only influence negative revalidating responses via max_stale_age.
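As a rough illustration of the difference between the intended and the current behavior, here is a Python sketch built only from the parts of the step list that are stated in this comment. Everything in it is an assumption for illustration: the function names, the string return values, the use of "staleness" (age minus freshness lifetime) as the compared quantity, and the choice to apply the max_stale_age check regardless of the enabled flag. It models a stale cached object whose origin has failed revalidation.

```python
def intended_reply(enabled: bool, age: int, freshness_lifetime: int,
                   negative_lifetime: int, max_stale_age: int) -> str:
    """Intended flow: try the lifetime adjustment first, then max_stale_age."""
    staleness = age - freshness_lifetime
    # Assumed meaning of "fresh enough given this adjustment".
    if enabled and staleness < negative_lifetime:
        return "serve-cached"
    if staleness < max_stale_age:
        return "serve-cached"
    return "5xx"


def current_reply(age: int, freshness_lifetime: int, max_stale_age: int) -> str:
    """Behavior with the bug: the lifetime adjustment never takes effect."""
    staleness = age - freshness_lifetime
    if staleness < max_stale_age:
        return "serve-cached"
    return "5xx"
```

For example, with a freshness lifetime of 11 seconds, an age of 14, a negative_revalidating_lifetime of 5 and a max_stale_age of 2, the intended flow would serve the cached object, while the current flow returns a 5xx.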
Thanks Brian -- fwiw, I tried a bash script a long while back https://github.com/apache/trafficserver/issues/3211 -- don't think it covered all your cases :)
Having configuration:
I have this scenario:
GET /test HTTP/1.1
origin replies HTTP/1.1 200 OK with header: Cache-Control: max-age=11, public and some content, I see TCP_MISS/200 in squid.log, I am happy with the reply
HTTP/1.1 503 Service Unavailable with cache-control: no-cache
GET /test HTTP/1.1, got proper 200 reply from cache, in squid.log I see TCP_REFRESH_MISS/200
sleep 15 (more than negative_revalidating_lifetime)
GET /test HTTP/1.1, got:
in squid.log I see TCP_REFRESH_MISS/200
It happens with 7.1.11, 8.1.0 and 8.1.1.
Note: The same happens if the origin is just down, where I'd expect a 5xx code from trafficserver after waiting long enough.