adobe / helix-home

The home of Project Helix

Soft Purges and Stale Content #51

Closed trieloff closed 3 years ago

trieloff commented 5 years ago

To the tune of the Fugees: Purging me softly 🎶 … with his API

Right now, all the cache invalidation we do consists of hard purges of the entire cache: everything is invalidated at once, and after a purge every request to Fastly will block until the backend (Runtime) delivers a response.

Fastly offers a Soft Purge API that, instead of removing the cache key entirely, just marks it as outdated. With the right configuration, the next request is served the old cached version while a backend request is made in parallel to update the cache. This means the request immediately following a purge is fast but serves stale content; up-to-date content is only delivered on subsequent requests.
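
Fastly supports soft-purging a single URL by sending an HTTP `PURGE` request to that URL with the `Fastly-Soft-Purge: 1` header. A minimal sketch, assuming `node-fetch` is available (not the actual Helix purge code):

```javascript
const fetch = require('node-fetch');

// Soft-purge a single URL: Fastly accepts an HTTP PURGE request to the URL
// itself, and the Fastly-Soft-Purge header marks the cached object as
// outdated instead of evicting it.
async function softPurgeUrl(url) {
  const res = await fetch(url, {
    method: 'PURGE',
    headers: { 'Fastly-Soft-Purge': '1' },
  });
  return res.json(); // Fastly responds with a JSON status object
}
```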

Generally, soft purges and stale content deliver a faster, and therefore better, visitor experience, so they should become our default.

Fastly does not have a "soft purge all" API, so we will have to attach a Surrogate-Key value (e.g. all) to every response and soft-purge that Surrogate-Key value.
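
Soft-purging by Surrogate-Key goes through the authenticated Fastly API instead. A sketch of what that could look like (`serviceId` and `token` are placeholders for the Fastly service ID and an API token; this is not the actual Helix implementation):

```javascript
const fetch = require('node-fetch');

// Soft-purge everything tagged with a given Surrogate-Key. If every response
// carries a catch-all key (e.g. "all"), purging that key soft-purges the
// entire service.
async function softPurgeKey(serviceId, token, key) {
  const res = await fetch(`https://api.fastly.com/service/${serviceId}/purge/${key}`, {
    method: 'POST',
    headers: {
      'Fastly-Key': token,      // API token for authentication
      'Fastly-Soft-Purge': '1', // mark matching objects stale instead of evicting them
      Accept: 'application/json',
    },
  });
  return res.json();
}

// e.g. softPurgeKey(process.env.FASTLY_SERVICE_ID, process.env.FASTLY_TOKEN, 'all');
```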

What needs to be done:

I'll file individual issues once we have general agreement

tripodsan commented 5 years ago
> for Helix Pages, I'm not sure

currently, we disable all caching for helix-pages

trieloff commented 5 years ago

I thought we had something like a five minute cache for Helix Pages.

tripodsan commented 5 years ago

> I thought we had something like a five minute cache for Helix Pages.

no: https://github.com/adobe/helix-experimental-dispatch/blob/b213ec745a26fe451438b2cb31d7e86e938f99b7/src/index.js#L73

trieloff commented 5 years ago

Spoke with @davidnuescheler yesterday, paraphrasing two pieces of feedback:

The stale-while-revalidate time frame should be long (like a year or infinity)

Most purges should happen at mid-granularity: neither all nor the exact resource, but the container of the resource, e.g. the GitHub repo. The rationale is that tracking exact resource dependencies might end up being too hard, and since permission management will dictate that most repos are relatively small, you won't get much over-purging.
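
Taken together, the two pieces of feedback could translate into response headers like the following sketch (the `owner--repo` key naming is illustrative, not an agreed-upon scheme; 31536000 seconds is one year):

```javascript
// Attach both the catch-all key and a per-repo key to a response, and allow
// stale delivery for up to a year after a soft purge. `res` is a Node.js
// http.ServerResponse; the key scheme is hypothetical.
function setCacheHeaders(res, owner, repo) {
  res.setHeader('Surrogate-Key', `all ${owner}--${repo}`);
  // fresh for 5 minutes, then served stale while revalidating in the background
  res.setHeader('Surrogate-Control', 'max-age=300, stale-while-revalidate=31536000');
}
```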

davidnuescheler commented 5 years ago

I think there is a good compromise in keeping coarse-grained dependencies in the surrogate keys. In my experience, per-resource granularity for cache flushes creates a lot of management overhead, while, for example, per-repository granularity may be a good starting point. This also allows people to separate things that they want flushed together into different content repos...

tripodsan commented 3 years ago

closing, as for Helix 3 the problems are different