☔ Add mask zoom support #7

Open nvkelso opened 7 years ago

nvkelso commented 7 years ago

tl;dr Using mask zooms (er, zoom masks) will reduce the global tile count by >30%.

Our existing architecture assumes a max zoom of 16, and that all tiles between zoom 0 and 16 are generated. With a goal of a global tile build, that's 5,726,623,061 tiles. But if we evaluated whether, say, a zoom 10 tile had any interesting content and, if it didn't, skipped generating its child tiles (at zooms 11 through 16, say over the ocean, where the children are just repetitive ocean tiles), then we'd end up with around 4,009,055,573 tiles (a 30% reduction).

When combined with 512px tiles and a max zoom of 15 or 14, we could see up to an 83% total tile reduction.
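
For reference, the raw tile counts fall straight out of the pyramid geometry (4^z tiles per zoom level). A quick sketch of the arithmetic in Python (illustrative only; the masked-build figure above comes from evaluating real data, not from a formula):

def total_tiles(max_zoom):
    # Total tiles from zoom 0 through max_zoom: sum of 4**z per level.
    return sum(4 ** z for z in range(max_zoom + 1))

print(total_tiles(16))  # 5,726,623,061 -- the full zoom 0-16 build mentioned above
print(total_tiles(15))  # 1,431,655,765 -- dropping zoom 16 alone removes ~75% of tiles

Dropping zoom 16 on its own removes roughly three quarters of the tiles, which is presumably where most of the 83% figure comes from once masking is layered on top.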

There are two ways we could implement this:

  1. Constant max zoom (say zoom 8 or zoom 10). We believe this is the approach Mapbox takes.
  2. Variable max zoom (based on content; would require a server manifest file and client logic to parse it). We believe this is the approach Esri takes (and it generally works better for clustered data, e.g. thematic data versus a reference map).

Areas of investigation:

  1. Investigate how Tangram and other renderers support max zoom currently.
    • For instance, the Tangram sources block can specify a generic max_zoom; could it also take a function that reads from a manifest file?
    • What about Unity?
    • What about Python scrapers?
    • Mapzen terrain service has a max zoom of 15 now, which resulted in tons of wasted pre-rendered tiles, and makes going to zoom 16 prohibitive.
    • How would this interface with future Mapzen (satellite) imagery service?
  2. Investigate whether Esri has published their max zoom manifest spec, and how they modify MapboxGL to support it.
    • Would a variable max_zoom make it difficult for Mapbox-based rendering stacks to use our vector tiles?
  3. Investigate engineering approach in Tilezen stack to accomplish either implementation approach.
    • vector-datasource
    • tilequeue
    • tileserver

After the approach is settled on, this umbrella issue also tracks engineering to implement our max zoom solution.

zerebubuth commented 7 years ago

Just for completeness, it's possible to do the variable zoom without requiring a manifest file. The client can try zoom z, then z-1, z-2, ... until it hits the mask zoom. This works best if there are very few zooms between the max zoom and the mask zoom. If we're thinking about 15 for the max zoom and 10 for the mask, that would mean up to 5 additional round-trips to reach a mask zoom tile.

As a middle ground, a system with one or more "mid" mask zooms would also work. For example, if there's a 404 at the original zoom, jump up to mask1 and retry before jumping up to mask2. If the max zoom were 15, then we might have mask1 and mask2 set to 12 and 10 respectively. Then a fully-masked tile only requires 3 round trips.
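
A minimal sketch of that retry loop in Python, assuming a hypothetical fetch_tile(z, x, y) that returns None on a 404, with mask1 and mask2 set to 12 and 10 as in the example:

def fetch_with_fallback(fetch_tile, z, x, y, mask_zooms=(12, 10)):
    # Try the requested zoom first, then each coarser mask zoom in turn.
    for candidate_z in (z,) + tuple(m for m in mask_zooms if m < z):
        shift = z - candidate_z
        # Re-address the same area at the coarser zoom by shifting coordinates.
        tile = fetch_tile(candidate_z, x >> shift, y >> shift)
        if tile is not None:  # fetch_tile is assumed to return None on a 404
            return candidate_z, tile
    return None

Worst case for a fully masked tile is three requests (the original zoom, mask1, mask2), matching the three round trips above.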

I think the manifest is a better idea (requiring two round-trips, if you count the manifest itself), but it is far more complex, so we should make sure it's the right balance for us.

zerebubuth commented 7 years ago

As an aside, if we had HTTP/2.0, then we could do this with a server-push to avoid round-trips. For example, if the client requests a fully masked tile at z15, then we might send the 404 response immediately, followed by a server-push of the z10 tile, or whichever tile in the hierarchy above it exists.

The client would still need some extra logic to handle the server-push and to understand that it means none of the pushed tile's descendants exist, so it shouldn't request them.

nvkelso commented 7 years ago

/cc @bcamper for comment from Tangram side

bcamper commented 7 years ago

I like the idea of supporting multiple mask zooms, but specified per data source, as @zerebubuth was suggesting. For example, that might look something like:

sources:
  mapzen:
    url: ...
    mask_zoom: [7, 10, 14]

In this example, if a z15 tile 404s, then Tangram looks for a z14 mask tile; if that fails, then it tries the z10 tile, and finally z7. If the initial 404 occurred on a z12 tile, then it starts at z10 and works back (i.e. it starts with the first available mask zoom that is less than the current zoom).

I like this because it's more flexible than a fixed max zoom, and can be tuned to fit different data sources, but it's also straightforward enough that it could be reasonably implemented by different clients. While the idea of a manifest is interesting, it presupposes both special server-side generation and possibly complex client logic that make it less attractive to me.

zerebubuth commented 7 years ago

I think there are advantages to the manifests which go beyond variable max zoom, though. Not only do manifests mean zero retries, but if they also name the tiles then we can use them to implement zero-downside updates.

At the moment, we have a trade-off between caching and freshness. We want to set the cache TTL very long to keep as much data cached at the edge or on-device as possible, but that limits data update frequency. With manifests, we can have infinite cache TTLs for tiles, and still support instant data updates.

Changing the "protocol" like this does make the client implementation more complex, but in return gives nicer cache behaviour and a better user experience. I think the increase in client complexity would be smaller than what was needed to support refined tiles, and smaller than the jump from WMS to "slippy" tiled maps.

bcamper commented 7 years ago

Can you explain more about how the manifest works? I'm not getting the connection to TTLs.

zerebubuth commented 7 years ago

A manifest can just be a list of tiles which exist, which allows for client-side 404 checking. But it can also be a list of the tiles' names, which allows more flexible and useful behaviour. For example, the manifest for the tile pyramid starting at 12/1206/1539 might be something like the following:

{
  "12/1206/1539": "fea6949a-fa75-4a0f-97f7-3d41a53c5527",
  "13/2412/3078": "59a4cc4f-a1f5-4469-8763-6711e4e3e7c9",
  "14/4824/6156": "bca52e6f-dd66-421f-a3c0-f175f76abed7",
  "14/4824/6157": "5fbf1232-735b-4d09-a8df-91ba3aecabef",
  "14/4825/6156": "729ff5a3-fd84-492a-b803-c92774f18ade",
  "14/4825/6157": "94d4a7a3-a8a1-4e28-97be-6e5d730b14ec",
  "13/2412/3079": "a646d9f3-bce3-4fd9-a8e4-9a342b48353a",
  "14/4824/6158": "051a0258-4db0-4705-90f3-7c5ee5160b74",
  "14/4824/6159": "f91b60ad-9467-4e4b-bdff-6ca1c4f04966",
  "14/4825/6158": "6a8e44f4-363b-419b-a38b-c2aeba8c07f8",
  "14/4825/6159": "1bb20d18-53f2-41ca-80c0-090ea7cce90b",
  "13/2413/3078": "e38aaf21-3adf-4320-8dbf-00a42c9e197f",
  "14/4826/6156": "eb785382-46ba-4290-949c-55460cebcc91",
  "14/4826/6157": "83e55bb4-c9ea-41f0-8b45-058637c88458",
  "14/4827/6156": "5a1f20de-1079-4575-9d5f-725867cf84e2",
  "14/4827/6157": "8fb70a4f-de5c-44e9-b747-8a3a1b734819",
  "13/2413/3079": "956d5068-01b2-4cf8-bc23-e11bc3419f71",
  "14/4826/6158": "01f5c61b-7bf1-40e1-8ab1-d4cf628ed612",
  "14/4826/6159": "51c846af-33c7-44e6-802c-c3ff48fad448",
  "14/4827/6158": "c1561ee4-ea7f-4abc-ae5f-1c7e5b557eaa",
  "14/4827/6159": "44670f55-88e7-466c-bf6e-3192046678e0"
}

If the client wants to fetch 14/4826/6157, it requests http://tile.mapzen.com/mapzen/vector/v2/83e55bb4-c9ea-41f0-8b45-058637c88458.mvt and gets a tile with a 1 year max-age, which it and Fastly can cache indefinitely.

When we update that tile at some point in the future (whether minutes or months) then we store it as a different UUID, say cd65749e-8bef-451b-b88a-acb0137a186b, and update the manifest. The manifest file itself has a shorter TTL, of similar length to our tile TTL today. Each tile version is given a unique identifier, which means we can update tile contents without purging or expiring old versions of tiles. It makes it very easy for clients to determine if the tiles stored in local cache are valid or not - if they've got the same name in the manifest, they're valid and up-to-date.

At the moment, each tile cached at the client or edge has to be checked against upstream once its max-age expires, which means added latency even if the response is 304. Manifests allow us to eliminate the need for 304s and the latency of upstream checks on tiles. Clients only have to do the upstream checks on manifests, which are fewer than tiles, and generally smaller (depending on the tile).

Additionally, clients are in charge of the decision about whether to use a stale local tile or download updated data. Clients might choose to prioritise the download of missing tiles over stale ones, and not even update the manifest. At the moment, the browser cache will make the upstream check as soon as the max-age expires and as far as I know it's not possible for the client app to control that. (Although I suppose one might use local storage to take control over that.)

Finally, there can be "special" names such as blank_water or blank_land, which mean that the client application can elide those downloads entirely, as it likely has those two tiles in cache already.
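
A rough sketch of how a client might resolve a tile address through such a manifest; the URL pattern and the blank_water / blank_land sentinels follow the examples above, while the function and parameter names are made up:

BLANK_NAMES = {"blank_water", "blank_land"}  # "special" names the client should already have cached

def resolve_tile_url(manifest, z, x, y, base="http://tile.mapzen.com/mapzen/vector/v2"):
    name = manifest.get(f"{z}/{x}/{y}")
    if name is None:
        return None              # not in the manifest: the tile doesn't exist, fall back to a parent
    if name in BLANK_NAMES:
        return name              # reuse the locally cached blank tile, no download needed
    return f"{base}/{name}.mvt"  # immutable, UUID-named URL, cacheable with a very long max-age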

(PS: I could swear I wrote something about this on another ticket, but I can't find it now. Hopefully that's enough detail above. Please let me know if not, I'm happy to expand on this idea at length :wink: )

nvkelso commented 7 years ago

Poking around Esri's fork of MapboxGL, here is their variable max zoom implementation:

See also:

nvkelso commented 7 years ago

Might be lods like in:

tileInfo: {
  rows: 512,
  cols: 512,
  dpi: 96,
  format: "pbf",
  origin: {
    x: -20037508.342787,
    y: 20037508.342787
  },
  spatialReference: {
    wkid: 102100,
    latestWkid: 3857
  },
  lods: [
    {
      level: 0,
      resolution: 78271.51696399994,
      scale: 295828763.795777
    },
    {
      level: 1,
      resolution: 39135.75848200009,
      scale: 147914381.897889
    },
    ...

bcamper commented 7 years ago

@nvkelso I might be missing something, but the Mapbox stuff just looks like a similar version of what we do for overzooming; I don't see anything specifically about masking (again, I might just not see it). Ditto for the blog posts: is there anything more specific about their implementation? I didn't see a mention.

bcamper commented 7 years ago

@zerebubuth how much detail / how large would you expect such a manifest to be? It seems like it could be... quite a lot... depending on how it's organized (from your per-z14 tile example).

zerebubuth commented 7 years ago

At a minimum, a manifest file would just be a single object mapping each tile address (or offset) to a UUID. Since it's just a JSON file, we could pack some other useful information in there, or back-fill masked manifests with references up to their largest parent. If there's more useful information (number of features for buffer pre-allocation, for example) then we could include that too.

If we do manifests of 5 zoom levels (0-4, 5-9, 10-14), then each file should be a maximum of 18.3kB uncompressed. Depending on how we assign UUIDs, we could compress that quite a bit. For manifests of 4 zoom levels (0-3, 4-7, 8-11, 12-15), each would be at most 4.6kB.

We could have different sizes of pyramids at different levels, or do pyramids of pyramids if we want to bring the same immutability benefits to the manifest files themselves.
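
For a rough sense of where those size figures come from, here's the back-of-the-envelope arithmetic in Python, assuming roughly 54 bytes of JSON per entry (a quoted z/x/y key, a 36-character UUID, and punctuation):

BYTES_PER_ENTRY = 54  # assumed: "zz/xxxx/yyyy": "uuid", plus quotes and comma

def manifest_size(levels):
    # A pyramid of `levels` zoom levels rooted at a single tile holds sum(4**i) tiles.
    tiles = sum(4 ** i for i in range(levels))
    return tiles * BYTES_PER_ENTRY

print(manifest_size(5))  # 341 entries -> ~18.4kB uncompressed, close to the 18.3kB figure above
print(manifest_size(4))  # 85 entries  -> ~4.6kB uncompressed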

nvkelso commented 7 years ago

@bcamper It's something like:

The resulting mesh of polygons is multiscale, representing different levels of detail as
defined in the input map. Each polygon is sized to enclose no more than the specified
number of vertices of features from the input map, as determined by their density, 
distribution, and the inherent generalization that occurs when creating vector tiles.


That then feeds into:

Which seems like it would work just as well with raster data like terrain or imagery which include bbox footprints (that can be represented as polygons).

Weirdly, it seems like raster basemaps from Esri have 100% coverage at the specified levels of detail (more like your mask_zoom: [0,5,10,15] proposal).

nvkelso commented 7 years ago

Looks like the key file is tilemap; here's an example:

Which is a really big binary tree thing?

{"index":[[[1,[1,1,1,[1,[1,1,1,[1,1,1,1]],[1,1,1,[1,1,[1,1,1,[1,1,1,1]],[1,1,[1,1,1,1],1]]],[1,[[1,1,1,1],1,[1,1,1,1],[1,[1,1,1,1],[1,1,1,1],[1,1,1,1]]],[[1,[1,1,[1,1,1,1],1],1,[1,1,1,1]],[1,1,1,1],1,1],[[1,1,1,1],[1,[1,1,1,1],1,1],[1,1,1,1],[[1,1,1,1],[1,1,1,1],1,1]]]]],[1,1,[1,[1,1,1,[1,1,1,[1,1,[1,1,1,[1,1,1,1]],[1,1,[1,1,1,1],[1,1,1,1]]]]],[1,1,[[1,1,1,1],1,[1,1,1,1],[1,1,1,1]],1],[[1,1,1,[1,1,1,1]],[[1,[1,1,1,[1,1,1,1]],[1,1,1,1],[1,1,1,1]],[[1,[1,1,1,1],1,[1,1,1,1]],[[1,1,[1,1,1,1],[1,1,1,1]],[1,1,[1,1,1,1],1],[1,1,1,1],[1,1,1,1]],[1,1,1,1],[1,1,1,1]],[[1,1,1,1],1,1,[1,1,1,1]],[1,[1,1,1,1],[1,1,1,1],[1,1,1,1]]],[1,[[1,1,1,1],1,1,[1,1,1,1]],1,[1,[1,[1,1,[1,1,1,1],1],1,[1,1,[1,1,1,1],1]],1,1]],[[[1,1,1,1],[1,1,1,1],[1,1,1,1],[1,1,1,1]],[[1,1,1,1],[1,1,1,1],[1,1,1,1],[1,1,1,1]],[[[1,1,1,1],1,[1,1,1,1],1],[1,1,1,[1,1,1,1]],[[[1,1,1,1],[[1,1,1,1],1,1,1],1,1],1,1,1],[[1,1,1,1],[1,1,1,1],[1,1,1,1],1]],[[1,1,1,1],[1,1,[1,1,1,1],1],[1,1,[1,1,1,1],[[1,1,1,1],1,1,1]],[[1,1,1,1],[1,1,1,1],[1,1,1,1],1]]]]],[[1,1,[1,1,[[1,1,1,[1,1,1,1]],1,[[1,1,1,1],[1,1,1,1],[1,1,1,1],[1,1,1,1]],[1,[1,1,1,1],[1,1,1,1],[1,1,1,1]]], ...

Looks like their code checks to see if a tilemap.json file exists and parses it with this?

Another sample:

When it's not "indexed", it uses the normal LOD thing (like mask_zoom: [0,5,10,15]):

Their powerpoint presentation:

zerebubuth commented 7 years ago

Which is a really big binary tree thing?

It's almost certainly a quad-tree, represented as a recursive list of lists with the leaves as 1s. Do they do any tiling on top of this? If they don't, that implies that every rendered tile corresponds to a 1 and the (uncompressed) JSON would be at least 2x TOI size, or about 37MB for us. It probably compresses well, since it has a lot of repeating patterns such as [1,1,1,1], and a limited alphabet.
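
If it is a nested-list quad-tree with 1s at the leaves, then a lookup for a given z/x/y is just a walk down the tree. A sketch of what that could look like in Python (the child ordering and the exact meaning of a leaf 1 are guesses, not taken from Esri's code):

def tile_exists(index, z, x, y):
    # `index` is the parsed "index" list from the tilemap JSON. Walk from the root
    # quadrant down to the requested tile, picking the child quadrant from the
    # relevant bit of x and y at each level.
    node = index
    for depth in range(z, 0, -1):
        if node == 1:
            return True        # guessed semantics: a leaf 1 covers the whole subtree
        if not isinstance(node, list):
            return False
        bit = depth - 1
        child = ((y >> bit) & 1) * 2 + ((x >> bit) & 1)  # assumed ordering: NW, NE, SW, SE
        if child >= len(node):
            return False
        node = node[child]
    return node == 1 or isinstance(node, list)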

However, because it's not tiled, it has to be updated whenever we change the max zoom due to data updates, which wouldn't be great for caching vector tiles. It might be good for terrain or aerial, though, which I would expect not to change so often (and in more predictable ways).

I also have a bad feeling that when such a big chunk of JSON is parsed into the browser's RAM it could balloon to something huge, although that could be coded around (adding complexity) by using incremental parsing techniques and more compact data structures that keep the number of pointers down.

Looks like their code checks to see if a tilemap.json file exists and parses it with this?

Wow. That's either minified, or someone's training for the Obfuscated JavaScript Contest. I can't tell if it's doing anything clever, although the relatively short length of the code suggests not.

bcamper commented 7 years ago

Thanks, will check out those links. To clarify, mask_zoom: [0,5,10,15] does not imply 100% coverage at each of those zoom levels. It implies different areas of the map have 100% coverage for at least one of those zoom levels (we'd probably just assume this is always true for z0). If the client got a 404, it would successively check the "candidate" mask zoom levels above it until it found a tile.

For example, consider a scenario where we have 100% coverage at z7, most non-water areas have coverage to z12, and more populous areas have coverage at z15. This would look like: mask_zoom: [7, 12, 15]. The worst case would be loading a map at z17 (i.e. anything above z15) in the middle of the ocean where there is only z7 coverage.

The client would do this: request z15 tile -> 404 -> request z12 tile -> 404 -> request z7 tile -> 200

If the user panned around from there, the browser would still have to check for new z15 tiles coming into view, but as those fail it would quickly fall back to previously fetched z12 or z7 tiles that cover the same area.

iandees commented 7 years ago

A couple thoughts:

  1. This ticket starts with the assumption that everyone knows the problem we're trying to solve. Can someone write something about what we're trying to solve?
  2. The terminology is super confusing to me. Can we switch to "zoom mask" or something? In computer graphics, a mask is something used to block out information over an area, so a "zoom mask" makes more sense if we're trying to describe the max zooms for individual areas of the world.

bcamper commented 7 years ago

I agree someone (not me :) should explain the problem / goals in more detail. I don't care about "mask zoom" vs. "zoom mask", just to say that "variable max zooms" is a good way of describing how I am thinking of this (at least the parts I wrote).

zerebubuth commented 7 years ago

Here's my attempt at a definition of the problem:

  1. Tiles are more resilient to failure and have lower latency if they're already rendered and fetched from cache or storage, rather than being rendered on-demand. This motivates us to do a "global tile build".
  2. Many areas of the world are very detailed, which means high information density. We want our tiles to be small enough that they're usable over mobile connections, which means we've got to provide fairly small tiles. We seem to have settled on a z15 or z16 tile as a good compromise between spatial compactness and number of requests.
  3. A large number of areas of the world are not so detailed, and are just featureless water, land or Antarctica / Greenland. Rendering and storing those areas down to z15 or z16 is wasteful of database queries, S3 PUTs and cache space, and means the client has to make requests for tiles which contain nothing interesting.

One solution to this problem is to be able to zoom to z16 in information-dense areas, but stop at an earlier zoom for less information dense areas. However, if there are areas of the world which we haven't rendered down to a constant zoom of 15 or 16, we need some kind of protocol so that clients can find the information they're looking for.

Re: "mask zoom" or "zoom mask", I agree. I remember hearing the term "fallback zoom" to mean the same thing, and it's a pity that didn't catch on.

nvkelso commented 7 years ago

I like @zerebubuth's summary.

I'll also add that a solution should work for:

zerebubuth commented 7 years ago

I think that means:

Personally, I think that manifests give us the most flexibility. But they're also the biggest change to implement. I would suggest that we implement them internally, and have a microservice to translate to traditional z/x/y, then we can gradually transition external clients such as Tangram to the benefits of manifests.

nvkelso commented 7 years ago

Looks like mapsforge does their mask zooms as [5,7,11,14]:

See also:

nvkelso commented 7 years ago

From our tile sync call this Monday:

Two major options:

Discussion:

User groups:

bcamper commented 7 years ago

It seems like rather than a zoom mask vs. manifest choice directly corresponding to more/less work for the client, we should be thinking more about what the server vs. client implementations are, with various options to mix and match them.

As @zerebubuth suggests, we could have a manifest-based system server-side, but have a middleware that delivers tiles in an old-fashioned x/y/z URL format for clients. That provides full backwards compatibility with no client work, but no client benefit (while providing server-side/tile generation benefit). Actually, the same could probably be done with manifests on the server side, but with clients following a zoom mask definition: clients would make more requests testing the various zoom mask levels than they would if they directly knew the right tile to request from a manifest, but this could still represent a savings, and they wouldn't have the added complexity or bandwidth of downloading manifests.
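
A minimal sketch of that middleware, reusing the manifest idea from earlier in the thread (the load_manifest helper and the URL pattern are hypothetical):

def shim_lookup(load_manifest, z, x, y):
    # load_manifest(z, x, y) is assumed to fetch (and cache) the manifest pyramid
    # covering this address; walk up through parent tiles until a named tile is found.
    manifest = load_manifest(z, x, y)
    while z >= 0:
        name = manifest.get(f"{z}/{x}/{y}")
        if name:
            return 302, f"/mapzen/vector/v2/{name}.mvt"  # redirect to the immutable UUID-named tile
        z, x, y = z - 1, x >> 1, y >> 1
    return 404, None

Old-style clients keep requesting plain z/x/y URLs and simply follow the redirect; manifest-aware clients could skip the shim entirely.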

I'm not advocating any specific path here, just pointing out that there are probably ways to keep the decisions somewhat independent on the client vs. the server (and/or to phase them in as @zerebubuth suggested).