openstreetmap / operations

OSMF Operations Working Group issue tracking
https://operations.osmfoundation.org/

Dirty tiles cached for 7 days #1096

Open nrenner opened 5 months ago

nrenner commented 5 months ago

Actual

When expired tiles can't be rerendered on-the-fly, the old (dirty) version is returned.

Currently, these dirty tiles get a Cache-Control: max-age=604800 and a corresponding Expires response header, so they are cached for seven days.
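For reference, a minimal sketch of how the max-age value maps to seven days. The helper name `max_age_seconds` is hypothetical, and the header string is the one from the script output further down in this report:

```python
from datetime import timedelta


def max_age_seconds(cache_control):
    """Return the max-age value of a Cache-Control header, or None if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


header = "max-age=604800, stale-while-revalidate=604800, stale-if-error=604800"
age = max_age_seconds(header)
print(age, timedelta(seconds=age))  # 604800 7 days, 0:00:00
```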

Expected

I would have expected dirty tiles to be cached for a very short time frame only, as described in #959:

Tiles which are known to be dirty when served are given a randomised expiry up to ModTileCacheDurationDirty which we have as 15 minutes.

Probably implemented in mod_tile.c and configured in tile.conf.erb?
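To illustrate the expectation, here is a rough sketch of a randomised expiry capped at 15 minutes. It is Python for illustration only and not mod_tile's C code; `dirty_max_age` and `DIRTY_LIMIT_S` are hypothetical names, and the uniform distribution is an assumption:

```python
import random
from typing import Optional

# Assumed cap: ModTileCacheDurationDirty of 15 minutes, as quoted from #959.
DIRTY_LIMIT_S = 15 * 60


def dirty_max_age(limit_s: int = DIRTY_LIMIT_S,
                  rng: Optional[random.Random] = None) -> int:
    """Pick a randomised max-age in seconds, never above limit_s."""
    rng = rng or random.Random()
    # randint is inclusive on both ends, so the result is in [0, limit_s].
    return rng.randint(0, limit_s)


print(f"Cache-Control: max-age={dirty_max_age()}")
```

With a cap like this, a dirty tile would be cached for at most 900 seconds rather than 604800.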

Examples

Browser

Test case: z13 tile in a dense region with a very recent edit.

  1. Tile is clean. Last rendered at Tue Jun 08 10:38:36 2004. Last accessed at Sat Jun 08 10:38:36 2024
  2. Browser tile requests returned after 2 seconds with Cache-Control: max-age=604800:
    [screenshot: browser requests]
    • screenshot with custom response headers, taken with the network tab open and "Disable cache" activated
  3. Tile is clean. Last rendered at Sat Jun 08 10:40:49 2024. Last accessed at Sat Jun 08 10:40:49 2024
    • rendering took 16 seconds (Last rendered 10:40:49 minus request start / Expires time 10:40:33),
      confirming that the tiles returned in step 2 were indeed dirty
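The 16 seconds can be reproduced from the status line with a small sketch. `parse_status` is a hypothetical helper, the regular expression only covers the status format shown in this report, and `%a`/`%b` parsing assumes an English locale:

```python
import re
from datetime import datetime

STATUS_RE = re.compile(r"Last rendered at (.+?)\. Last accessed at (.+?)\.?\s*$")
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_status(line):
    """Return (last_rendered, last_accessed) as datetimes from a status line."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError("unrecognised status line: " + line)
    rendered, accessed = (datetime.strptime(t.strip(), TIME_FORMAT)
                          for t in match.groups())
    return rendered, accessed


line = ("Tile is clean. Last rendered at Sat Jun 08 10:40:49 2024. "
        "Last accessed at Sat Jun 08 10:40:49 2024")
rendered, _ = parse_status(line)
request_started = datetime(2024, 6, 8, 10, 40, 33)  # from the browser request
print((rendered - request_started).total_seconds())  # 16.0
```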

Script

Script requesting tile status and headers every minute, for an active HOT task with ongoing edits.
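The script itself is not reproduced here. A minimal sketch of this kind of polling loop could look like the following. Assumptions: the tile URL is supplied by the caller, the status text comes from appending /status to the tile URL, headers are read with a HEAD request, and all function names and the User-Agent string are hypothetical:

```python
import time
import urllib.request

USER_AGENT = "dirty-tile-check/0.1 (illustrative)"  # hypothetical identifier


def status_url(tile_url):
    """Build the status URL for a tile URL (assumed /status suffix)."""
    return tile_url.rstrip("/") + "/status"


def fetch_status(tile_url, timeout=10):
    """Fetch the plain-text status line for a tile."""
    req = urllib.request.Request(status_url(tile_url),
                                 headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", "replace").strip()


def fetch_headers(tile_url, timeout=10):
    """Fetch the response headers for a tile, with lower-cased names."""
    req = urllib.request.Request(tile_url, method="HEAD",
                                 headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return {name.lower(): value for name, value in resp.headers.items()}


def poll(tile_url, rounds, interval_s=60, get_status=fetch_status,
         get_headers=fetch_headers, sleep=time.sleep):
    """Record (status, headers) once per interval, for the given rounds."""
    samples = []
    for i in range(rounds):
        status = get_status(tile_url)
        headers = get_headers(tile_url)
        print(status)
        print("cache-control:", headers.get("cache-control"))
        samples.append((status, headers))
        if i < rounds - 1:
            sleep(interval_s)
    return samples
```

The status call and the tile call are separate requests, so the tile state can change between them; the output is an indirect hint rather than proof.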

Example sequence:

Tile is clean. Last rendered at Fri Jun 07 12:46:34 2024. Last accessed at Fri Jun 07 12:47:25 2024
HTTP/2 200 
etag: "2a2d2b6d5ffc3ee6aaefcdad689469ab"
cache-control: max-age=12579, stale-while-revalidate=604800, stale-if-error=604800
x-tilerender: ysera.openstreetmap.org
date: Fri, 07 Jun 2024 15:29:10 GMT
x-served-by: cache-muc13959-MUC
x-cache: MISS

Tile is clean. Last rendered at Mon Jun 07 12:46:34 2004. Last accessed at Fri Jun 07 12:47:25 2024
HTTP/2 200 
etag: "2a2d2b6d5ffc3ee6aaefcdad689469ab"
cache-control: max-age=604800, stale-while-revalidate=604800, stale-if-error=604800
x-tilerender: ysera.openstreetmap.org
date: Fri, 07 Jun 2024 15:30:13 GMT
x-served-by: cache-muc13978-MUC
x-cache: MISS

Tile is clean. Last rendered at Fri Jun 07 15:30:43 2024. Last accessed at Fri Jun 07 15:30:43 2024
HTTP/2 200 
etag: "cf80c42b67a6618d80f8fbb3b6ac0fce"
cache-control: max-age=6991, stale-while-revalidate=604800, stale-if-error=604800
x-tilerender: ysera.openstreetmap.org
date: Fri, 07 Jun 2024 15:31:13 GMT
x-served-by: cache-muc13945-MUC
x-cache: MISS

Tile is clean. Last rendered at Mon Jun 07 15:30:43 2004. Last accessed at Fri Jun 07 15:31:13 2024
HTTP/2 200 
etag: "cf80c42b67a6618d80f8fbb3b6ac0fce"
cache-control: max-age=604800, stale-while-revalidate=604800, stale-if-error=604800
x-tilerender: ysera.openstreetmap.org
date: Fri, 07 Jun 2024 15:32:16 GMT
x-served-by: cache-muc13977-MUC
x-cache: MISS

Tile is clean. Last rendered at Fri Jun 07 15:32:37 2024. Last accessed at Fri Jun 07 15:32:37 2024
HTTP/2 200 
etag: "04a889c787d54d0f7bbfe0c190136d38"
cache-control: max-age=5173, stale-while-revalidate=604800, stale-if-error=604800
x-tilerender: ysera.openstreetmap.org
date: Fri, 07 Jun 2024 15:33:17 GMT
x-served-by: cache-muc13928-MUC
x-cache: MISS

tomhughes commented 5 months ago

No, dirty tiles do not get that cache age, at least not from upstream which is what we control. The expiry time for dirty tiles is 900s:

https://github.com/openstreetmap/chef/blob/d15475a0ad553d8a25888589158651850d638376/cookbooks/tile/templates/default/tile.conf.erb#L23

tomhughes commented 5 months ago

It's unclear to me what exactly you think you're showing in your extremely verbose report. As far as I can see there is no indication of whether a returned tile is dirty or not, so are you trying to match a tile status call and a tile call and hoping nothing changes in between?

tomhughes commented 5 months ago

Ah I've seen the script now but that is useless because it makes a status call (which may well say the tile is dirty) and then a tile call, which may well re-render the tile and return a clean version and hence not return the dirty expiry.

nrenner commented 5 months ago

as far as I can see there is no indication of whether a returned tile is dirty or not

That's why this is so verbose, as I'm trying to reason about it using indirect hints.

so are you trying to match a tile status call and a tile call and hoping nothing changes in between?

In the browser example, I ran a loop checking the status every second while making the call. The status didn't change until 16 seconds after the browser request started at 10:40:33 (or 17 seconds, going by Last accessed at 10:40:32):

2024-06-08T10:40:31+00:00
Tile is clean. Last rendered at Tue Jun 08 10:38:36 2004. Last accessed at Sat Jun 08 10:38:36 2024
2024-06-08T10:40:32+00:00
Tile is clean. Last rendered at Tue Jun 08 10:38:36 2004. Last accessed at Sat Jun 08 10:40:32 2024
...
2024-06-08T10:40:48+00:00
Tile is clean. Last rendered at Tue Jun 08 10:38:36 2004. Last accessed at Sat Jun 08 10:40:32 2024
2024-06-08T10:40:49+00:00
Tile is clean. Last rendered at Sat Jun 08 10:40:49 2024. Last accessed at Sat Jun 08 10:40:49 2024

Ah I've seen the script now but that is useless because it makes a status call (which may well say the tile is dirty) and then a tile call, which may well re-render the tile and return a clean version and hence not return the dirty expiry.

Could be, but it's unlikely: the minimum render time per tile for z13 on ysera was > 6 seconds, and the timeout for waiting on a render seems to be 2 seconds?

What else would explain those cache-control: max-age=604800 headers, when the last rendered time in these examples was only minutes or a few hours earlier?
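The pattern in the script output can be made explicit with a small sketch. It assumes that a "Last rendered" date roughly 20 years in the past marks a tile that was expired but not yet re-rendered; that is an interpretation of the logs, not something the status output states. The sample values are copied from the example sequence, and the function names are hypothetical:

```python
from datetime import datetime

TIME_FORMAT = "%a %b %d %H:%M:%S %Y"
WEEK_S = 7 * 24 * 3600  # 604800

# (Last rendered, max-age) pairs from the example sequence in this report.
SAMPLES = [
    ("Fri Jun 07 12:46:34 2024", 12579),
    ("Mon Jun 07 12:46:34 2004", 604800),
    ("Fri Jun 07 15:30:43 2024", 6991),
    ("Mon Jun 07 15:30:43 2004", 604800),
    ("Fri Jun 07 15:32:37 2024", 5173),
]


def is_backdated(rendered, now, years=10):
    """Assumed dirty marker: render time more than `years` years before now."""
    age = now - datetime.strptime(rendered, TIME_FORMAT)
    return age.days > years * 365


def week_long_backdated(samples, now):
    """Return the samples that look dirty yet carry a week-long max-age."""
    return [(rendered, max_age) for rendered, max_age in samples
            if is_backdated(rendered, now) and max_age >= WEEK_S]


now = datetime(2024, 6, 7, 15, 33)
flagged = week_long_backdated(SAMPLES, now)
print(len(flagged), "of", len(SAMPLES), "samples flagged")  # 2 of 5 samples flagged
```

In this data, every sample with a backdated render time has max-age=604800, and every sample with a current render time has a shorter one.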

Spiekerooger commented 5 months ago

This is a mod_tile issue, as nrenner rightly found out. The 900s ModTileCacheDurationDirty config setting never gets applied in mod_tile for tiles that are marked dirty by the expiry mechanisms. As a result, the dirty tiles returned by the tile renderers carry a far too long expiry time for the CDN edge server caches.