adobe / helix-home

The home of Project Helix

Soft Purges and Stale Content #51

Closed trieloff closed 3 years ago

trieloff commented 5 years ago

To the tune of the Fugees: Purging me softly 🎶 … with his API

Right now, all the cache invalidation we do consists of hard purges of the entire cache: everything is invalidated at once, and after a purge every request to Fastly will block until the backend (Runtime) delivers a response.

Fastly offers a Soft Purge API that, instead of removing the cache key entirely, just marks it as outdated. With the right configuration, the next request is served the old cached version while a backend request is made in parallel to update the cache. This means the request immediately following a purge is fast but serves stale content; up-to-date content is only delivered on subsequent requests.
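
Fastly supports soft-purging a single URL by sending an HTTP `PURGE` request to that URL with the `Fastly-Soft-Purge: 1` header. A minimal sketch, assuming `node-fetch` is available (not the actual Helix purge code):

```javascript
const fetch = require('node-fetch');

// Soft-purge a single URL: Fastly accepts an HTTP PURGE request to the URL
// itself, and the Fastly-Soft-Purge header marks the cached object as
// outdated instead of evicting it.
async function softPurgeUrl(url) {
  const res = await fetch(url, {
    method: 'PURGE',
    headers: { 'Fastly-Soft-Purge': '1' },
  });
  return res.json(); // Fastly responds with a JSON status object
}
```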

Generally, soft purges and stale content deliver a faster, and therefore better, visitor experience, so they should become our default.

Fastly does not have a "soft purge all" API, so we will have to attach a Surrogate-Key value (e.g. all) to every response and soft-purge that Surrogate-Key value.
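
Soft-purging by Surrogate-Key goes through the authenticated Fastly API instead. A sketch of what that could look like (`serviceId` and `token` are placeholders for the Fastly service ID and an API token; this is not the actual Helix implementation):

```javascript
const fetch = require('node-fetch');

// Soft-purge everything tagged with a given Surrogate-Key. If every response
// carries a catch-all key (e.g. "all"), purging that key soft-purges the
// entire service.
async function softPurgeKey(serviceId, token, key) {
  const res = await fetch(`https://api.fastly.com/service/${serviceId}/purge/${key}`, {
    method: 'POST',
    headers: {
      'Fastly-Key': token,      // API token for authentication
      'Fastly-Soft-Purge': '1', // mark matching objects stale instead of evicting them
      Accept: 'application/json',
    },
  });
  return res.json();
}

// e.g. softPurgeKey(process.env.FASTLY_SERVICE_ID, process.env.FASTLY_TOKEN, 'all');
```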

What needs to be done:

I'll file individual issues once we have general agreement

tripodsan commented 5 years ago
> for Helix Pages, I'm not sure

currently, we disable all caching for helix-pages

trieloff commented 5 years ago

I thought we had something like a five minute cache for Helix Pages.

tripodsan commented 5 years ago

> I thought we had something like a five minute cache for Helix Pages.

no: https://github.com/adobe/helix-experimental-dispatch/blob/b213ec745a26fe451438b2cb31d7e86e938f99b7/src/index.js#L73

trieloff commented 5 years ago

Spoke with @davidnuescheler yesterday, paraphrasing two pieces of feedback:

The stale-while-revalidate time frame should be long (like a year or infinity)

Most purges should happen at mid-granularity: neither all nor the exact resource, but the container of the resource, e.g. the GitHub repo. The rationale is that tracking exact resource dependencies might end up being too hard, and since permission management will dictate that most repos are relatively small, you won't get much over-purging.
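
Taken together, the two pieces of feedback could translate into response headers like the following sketch (the `owner--repo` key naming is illustrative, not an agreed-upon scheme; 31536000 seconds is one year):

```javascript
// Attach both the catch-all key and a per-repo key to a response, and allow
// stale delivery for up to a year after a soft purge. `res` is a Node.js
// http.ServerResponse; the key scheme is hypothetical.
function setCacheHeaders(res, owner, repo) {
  res.setHeader('Surrogate-Key', `all ${owner}--${repo}`);
  // fresh for 5 minutes, then served stale while revalidating in the background
  res.setHeader('Surrogate-Control', 'max-age=300, stale-while-revalidate=31536000');
}
```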

davidnuescheler commented 5 years ago

I think there is a good compromise in keeping coarse-grained dependencies in the surrogate keys. In my experience, per-resource granularity for cache flushes creates a lot of management overhead, while, for example, per-repository granularity may be a good starting point. This also allows people to separate things that they want flushed together into different content repos...

tripodsan commented 3 years ago

closing, as for Helix 3 the problems are different