openstreetmap / operations

OSMF Operations Working Group issue tracking
https://operations.osmfoundation.org/

Fastly soft purge tiles based on diff updates #947

Open Firefishy opened 1 year ago

Firefishy commented 1 year ago

Fastly supports soft purging which can be used to tell Fastly that a URL is stale and should be refreshed. The stale cache will still be used if the upstream is unavailable.
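For illustration, a minimal sketch of what a single-URL soft purge request could look like. It assumes Fastly's documented `POST /purge/<host>/<path>` endpoint with the `Fastly-Soft-Purge: 1` header. The tile URL, the token value, and the helper name `build_soft_purge_request` are placeholders, and the request is only built here, not sent.

```python
import urllib.request

FASTLY_API = "https://api.fastly.com"


def build_soft_purge_request(tile_url: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a single-URL soft purge request."""
    # The purge endpoint takes the cached URL without its scheme:
    # /purge/<host>/<path>
    target = tile_url.split("://", 1)[-1]
    return urllib.request.Request(
        f"{FASTLY_API}/purge/{target}",
        method="POST",
        headers={
            "Fastly-Key": api_token,    # placeholder token, assumption
            "Fastly-Soft-Purge": "1",   # mark stale instead of evicting
        },
    )


# Hypothetical tile URL and token, for illustration only.
req = build_soft_purge_request(
    "https://tile.openstreetmap.org/12/2048/1362.png", "EXAMPLE_TOKEN"
)
```

Passing `req` to `urllib.request.urlopen` would submit the purge. Without the `Fastly-Soft-Purge` header the same call would be a hard purge, which removes the object rather than marking it stale.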

As part of the osm2pgsql diff import step a list of tiles to invalidate should be created. osm2pgsql has a native implementation of generating the list of tiles to expire.

The list of expired tiles could then be added to a queue that is soft purged at Fastly asynchronously.
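A rough sketch of turning an expiry list into URLs to purge. It assumes the list has one `z/x/y` tile per line (the format osm2pgsql writes), 8x8 metatiles, and a `https://tile.openstreetmap.org/{z}/{x}/{y}.png` URL scheme. All function names are hypothetical, and a different expiry list format would need a different parser.

```python
METATILE = 8  # assumption: 8x8 tiles per metatile


def parse_expire_line(line: str) -> tuple[int, int, int]:
    """Parse one 'z/x/y' line into integers."""
    z, x, y = (int(part) for part in line.strip().split("/"))
    return z, x, y


def metatile_origin(z: int, x: int, y: int) -> tuple[int, int, int]:
    """Snap a tile to the top-left tile of the metatile containing it."""
    return z, x - x % METATILE, y - y % METATILE


def expired_metatiles(lines) -> list[tuple[int, int, int]]:
    """Deduplicate expired tiles down to their metatiles, skipping blank lines."""
    seen = set()
    for line in lines:
        if line.strip():
            seen.add(metatile_origin(*parse_expire_line(line)))
    return sorted(seen)


def tile_urls(z: int, mx: int, my: int,
              base: str = "https://tile.openstreetmap.org") -> list[str]:
    """List every tile URL covered by the metatile starting at (mx, my)."""
    size = min(METATILE, 2 ** z)  # low zooms have fewer than 8 tiles per side
    return [
        f"{base}/{z}/{x}/{y}.png"
        for x in range(mx, mx + size)
        for y in range(my, my + size)
    ]
```

Deduplicating to metatiles first keeps the queue small, since many expired tiles in one diff tend to fall inside the same metatile.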

Firefishy commented 1 year ago

If soft purging could be implemented, we could allow Fastly to cache tiles for significantly longer, e.g.:

Cache-Control: public, max-age=86400 # 24 hours - browsers
Surrogate-Control: max-age=2592000 # 30 days - Fastly, header dropped to client

tomhughes commented 1 year ago

Note that we don't use osm2pgsql's expiry list but instead our own simpler one.

Firefishy commented 1 year ago

A quick, hacky script to submit URLs into a Redis queue: https://gist.github.com/Firefishy/58a250759b29638d3c2e6842fb2ea5aa

and a hacky script to submit the purge requests from the queue (the submit should likely be async): https://gist.github.com/Firefishy/c890afbbb2bd67649f7ab244880dbf07

Firefishy commented 1 year ago

> Note that we don't use osm2pgsql's expiry list but instead our own simpler one.

Here is the current expiry implementation, which is focused on metatile expiry: https://github.com/openstreetmap/chef/blob/master/cookbooks/tile/files/default/bin/expire-tiles-single

Firefishy commented 1 year ago

The Fastly API rate limit documentation:

Single-URL and surrogate key purges: limited to an average of 100,000 purges per customer per hour. https://developer.fastly.com/reference/api/#rate-limiting

Purging purely by metatile is within the realm of possibility: 100,000 purges per hour works out to about 1,666 metatiles per minute.
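The arithmetic behind that figure, as a small sketch. The 1,666 per minute number holds if each metatile costs one purge request (for example, one surrogate key per metatile, which is an assumption here). If each of the 64 tiles in an 8x8 metatile needed its own single-URL purge, the budget would be far smaller. `purge_budget` is a hypothetical helper.

```python
def purge_budget(purges_per_hour: int = 100_000, purges_per_metatile: int = 1) -> dict:
    """Convert an hourly purge limit into per-minute and per-second budgets."""
    per_minute = purges_per_hour / 60
    return {
        "purges_per_minute": per_minute,
        "purges_per_second": purges_per_hour / 3600,
        # How many metatiles fit in a minute depends on purges per metatile.
        "metatiles_per_minute": per_minute / purges_per_metatile,
    }


one_purge_each = purge_budget()                       # one request per metatile
per_tile = purge_budget(purges_per_metatile=64)       # 8x8 single-URL purges
```

With the defaults this gives roughly 1,666 purges per minute, or just under 28 per second. At 64 purges per metatile the same limit covers only about 26 metatiles per minute.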

pnorman commented 1 year ago

We need to identify how many purge requests per second we'll be sending, as Fastly rate-limits purges to an average of about 1,666 per minute (100,000 per hour) per customer, and we should stay well below that.

Frequent purges are generally a design flaw when using a CDN.

Firefishy commented 1 year ago

I am proposing we use a queue (Redis?) so that we can define the rate at which we process it. We could also monitor queue length and queue time.

Since https://github.com/openstreetmap/chef/commit/edb9dbcdb71d38d0707e1da419d56c620004bb18 we now collect the data to inform the estimation.
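A minimal sketch of the queue idea above: a worker that drains purge URLs at a fixed maximum rate and exposes queue length and the age of the oldest entry for monitoring. An in-memory `deque` stands in for Redis here, and the `PurgeQueue` class and its method names are hypothetical, not taken from the linked gists.

```python
import time
from collections import deque


class PurgeQueue:
    """Rate-paced purge queue; the deque is a stand-in for a Redis list."""

    def __init__(self, max_per_minute: float, clock=time.monotonic):
        self.interval = 60.0 / max_per_minute  # seconds between purges
        self.clock = clock
        self.items = deque()                   # entries are (enqueued_at, url)

    def submit(self, url: str) -> None:
        self.items.append((self.clock(), url))

    def __len__(self) -> int:
        # Queue length, one of the two metrics worth monitoring.
        return len(self.items)

    def oldest_age(self) -> float:
        """Seconds the oldest queued URL has been waiting (queue time)."""
        if not self.items:
            return 0.0
        return self.clock() - self.items[0][0]

    def drain(self, purge, sleep=time.sleep) -> int:
        """Call purge(url) for each queued URL, pausing between calls."""
        sent = 0
        while self.items:
            _, url = self.items.popleft()
            purge(url)
            sent += 1
            if self.items:
                sleep(self.interval)
        return sent
```

Setting `max_per_minute` well below the Fastly limit leaves headroom for other purges on the same account. `clock` and `sleep` are injectable so the pacing can be tested without waiting.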