mapbox / vector-tile-spec

Mapbox Vector Tile specification
https://www.mapbox.com/vector-tiles/specification/
Other
891 stars 210 forks source link

Encoding features that can be “merged” from multiple tiles #104

Open anandthakker opened 6 years ago

anandthakker commented 6 years ago

This issue really encompasses a cluster of related problems:

Inter-tile feature identity: when a single logical feature exists in multiple tiles, there’s no guaranteed way to know that its instances in different tiles are all manifestations of the same feature. An identity relation would be useful for, e.g.:

Clipping detection: features that span multiple tiles are almost always clipped to each tile’s bounds (with some buffer). Clipping may introduce new points in linestring or polygon geometries (e.g., to close an otherwise open polygon ring). There is no way for a points that are clipping artifacts to be distinguished from those that come from the original geometry. As a result, Mapbox GL currently needs tiles with linestring or polygon to use a buffer so that it does not, for example, render polygon outlines or line caps at tile boundaries. (Related: https://github.com/mapbox/vector-tile-spec/issues/6)

Geometry reconstitution: Even when the inter-tile identity problem is not an issue (e.g., features have unique ids), recovering the original geometry—or reassembling the subset of it that exists in the tiles you have—is a challenge: at each place where a geometry has been ‘cut’ at a tile boundary, we need a way to associate the corresponding points or segments from each tile’s portion of the geometry. (See also https://github.com/mapbox/vector-tile-spec/issues/8) Support for this in the spec would be useful for, e.g.:


These problems also introduce new inter-tile consistency concerns that haven’t been an issue in previous versions of the spec.

cc @ericfischer @flippmoke @jfirebaugh @kkaefer

e-n-f commented 6 years ago

The approach I have been working on in https://github.com/mapbox/tippecanoe/tree/reconstruct, is that each feature that is split across tiles is given a new 64-bit integer clipid message in the feature, which is unique within the tileset. (The clipid is assigned at the moment of first clipping, so it is not present in the low zooms, only in the first clipped zoom and beyond.)

Compositing of layers will need to be careful to make sure that the clipids are still unique.

Each clipped feature also has a geomids message, which is a list of ids for nodes within the geometry. It is a series of pairs: each is a skip and an id (so that space is not spent in the list on nodes that have no id). The geomids for nodes will be assigned to points on the tile boundary at the moment of clipping. A geomid that appears in zoom level N will also appear at the same location in zoom levels N+1 and higher, but I don't know if this is important. The geomid is unique within the feature, not globally.

MultiPoints:

LineStrings:

Polygons:

Other uncertainties:

anandthakker commented 6 years ago

Capturing some notes from last week's conversation w/ @jfirebaugh @ericfischer @flippmoke :

anandthakker commented 6 years ago

I wanted to explore what it could look like to encode mergability data without using a clipid that's unique within the tileset.

For single-part geometries, this is pretty straightforward: geomids as described by @ericfischer above would suffice, with the caveat that if a client had only two non-adjacent tiles containing fragments of a single original feature, it wouldn't be able to relate them.

For multi* geometries, the problem is that if you have, e.g., two subgeometries where each is wholly in its own tile, there would be no geomid indicating their relationship.

@kkaefer previously proposed a slightly different scheme that could address this. Summarizing his idea as I understand it:

@kkaefer's example:

This consists of two features, one MultiLineString that has three LineStrings, only one of which is crossing a tile boundary, and one LineString that has points on the tile boundary.

The upper feature would be encoded as:

Left tile Right tile
MoveTo 132/66
LineTo 239/107
LineTo 150/147
ContinuityID 0
MoveTo 299/134 (outside the tile extent)
MoveTo 341/31 (outside the tile extent)
ContinuityID 0
MoveTo 220/46
LineTo 151/17
ContinuityID 0
LineTo 288/9 (outside the tile extent)
MoveTo -106/147 (outside the tile extent)
ContinuityID 0
MoveTo 43/124
LineTo 85/31
ContinuityID 0
MoveTo -36/45 (outside the tile extent)
MoveTo -105/17 (outside the tile extent)
ContinuityID 0
LineTo 32/9

The lower feature would be encoded as:

Left tile Right tile
MoveTo 256/161 (outside the tile extent)
ContinuityID 0
LineTo 159/199
LineTo 188/234
ContinuityID 0
LineTo 331/243 (outside the tile extent)
MoveTo 0/161
ContinuityID 0
LineTo -97/199 (outside the tile extent)
MoveTo -68/234 (outside the tile extent)
ContinuityID 0
LineTo 75/243
LineTo 0/206

_(Note: in this example, it appears that continuity ids are actually only unique with respect to a given MoveTo or LineTo segment, not within a feature. That might be all that's strictly necessary, but I could see it being simpler to have them be unique to the feature)_

e-n-f commented 6 years ago

OK, I can see the advantage of an ID that is only locally unique, especially in the case where just one tile from a larger area is being regenerated, since the necessary knowledge is then confined to just the adjacent tiles. Tagging the movetos also solves the MultiLineString sequence problem that I had been worrying about.

The disadvantage I see is that you have to look through the geometry of the features in the adjacent tile to find the feature that continues the one you are looking at instead of being able to look it up by ID.

I would still prefer that ID to be associated with a phantom point on the tile boundary than with the segment, because if the entire segment is preserved with its original coordinates, the buffer for that segment will be extremely large in some cases (imagine a LineString that jumps straight from one continent to another). I am OK with coordinates of that magnitude, but GL has overflow problems once you even get to 32768 tile units from the origin, so the tiles cease to render compatibly by existing clients. (And if you do clip closer in, for compatibility, you also need to add a point on the boundary to avoid visible discontinuity due to rounding error.)

I think phantom points will also generalize better to polygons, where the portion of the geometry in the buffer will have to be more schematic than the original if it is to preserve topology without preserving the entire original shape.

anandthakker commented 6 years ago

Yeah, @ericfischer I was just realizing the overflow issue and sketching out the same conclusion: we can still tag MoveTos and also clip them using interpolated points just as we do for LineTos.

img_2501

I'm not actually sure whether I think local id's are preferable -- just wanted to put it on the table for consideration.

e-n-f commented 6 years ago

The only disadvantage I see to tagging the movetos is that in the case of the intercontinental MultiLineString (or the MultiPolygon of Alaska), we will end up with a long string of tiles that contain nothing but two movetos. But it's probably necessary as the way to know whether you have loaded the entire feature.

anandthakker commented 6 years ago

Last week, @jfirebaugh and I started thinking through the implications of "bufferless" rendering in GL. Summarizing our findings here.

Background: buffer-based rendering in GL

Currently, buffers are important for rendering in GL. We render each layer one tile at a time, with each tile responsible for the region within its boundaries. Preparing each tile for rendering must happen independently, without access to other tiles. Buffers are important because:

Bufferless rendering

In a "bufferless" rendering strategy, we'd alter how we divide responsibility between tiles: instead of each tile being responsible for the features that are rendered within its boundaries, each tile would be responsible for the features whose geometries are located within its boundaries. Clipping detection is straightforward, since we'll now have metadata distinguishing the "real" end of a geometry from a clipping artifact.

The difference between source & rendered geometry gives rise to more issues/constraints:

Next steps:

mourner commented 6 years ago

Renderers will need to request more tiles, maintaining a 1-tile buffer around the viewport so that they can render geometries located outside the viewport but visible within it.

@anandthakker do we really need a 1-tile buffer, or just a fixed pixel buffer around a viewport? The latter should cover the use case while loading only slightly more tiles — not nearly as many as a 1-tile buffer.

anandthakker commented 6 years ago

@mourner 1-tile buffer would be the worst case: if the viewport lines up with tile boundaries, for example, really would need the full 1-tile buffer. But it's true that in other cases, you might not need any/many additional tiles

ansis commented 6 years ago

@anandthakker thanks for the summary! I'm not sure if this ticket is the best place to ask, but what are the advantages of bufferless rendering? I can think of:

Another possible rendering issue to think about for bufferless rendering is the draw order of features within the same layer but in different tiles. Could a feature in tile A be rendered between two features in tile B? If yes, could this be specified by the data order somehow or would layers be considered unordered lists of features? How would the renderer actually render this with bufferless rendering?

These questions might be a bit off-topic for this issue. Is there a bufferless rendering issue open yet in any of the repos?

anandthakker commented 6 years ago

@ansis yeah, I think avoiding stencil clipping to tile boundaries and reducing tile size are the key benefits. I think in practice we see buffers quite a bit larger than 25px, so we could be avoiding quite a bit of duplicated data (and not just lines -- polygons too, and for that matter, dense point layers).

Good point about draw order -- we may need to require in the spec that the relative ordering of any pair of features is consistent in all the tiles in which they appear.

ansis commented 6 years ago

Good point about draw order -- we may need to require in the spec that the relative ordering of any pair of features is consistent in all the tiles in which they appear.

Without buffers it wouldn't be possible to determine the relative order of two features that are entirely in different tiles but have overlapping renderings. Actually rendering them in the right order would also be a challenge with bufferless. We'd probably need to merge the gl buffers across tiles instead of rendering each tile's batch separately.

Not supporting ordering might be a good choice

jfirebaugh commented 6 years ago

@ansis Can you post an example of the type of issue you're thinking of?

anandthakker commented 6 years ago

Ah. Yeah, this is tricky:

img_2543

If the RH tile is drawn first, how would we avoid the blue line's join drawing above the portion of the red line that's in the LH tile?

jfirebaugh commented 6 years ago

The draw order is:

for (layer in layers)
  for (tile in tiles)
    draw(layer, tile)

So I don't think that particular scenario is an issue.

anandthakker commented 6 years ago

It is if those two features are in the same layer (colored w/ data-driven styling)

ansis commented 6 years ago

@anandthakker that's a worse one than I was thinking of!

@jfirebaugh it is an issue if the LH tile is drawn first in @anandthakker's example (assuming bottom up rendering. RH first for top-down opaque rendering)


My issue was one that already exists for our bufferless circle rendering and would exist for bufferless line rendering as well. We can't draw C above B above A currently.

We'd need to merge the data into one sorted buffer that can be drawn in one draw call. This could be a performance hit. It could be doable. With bufferless tiles we also lose the ability to specify the draw order using the data order but this could be ok.

unnamed

anandthakker commented 6 years ago

Since geometry reconstruction is significantly more complicated than feature identity or marking clipping/buffer artifacts, here's a roundup of potential use cases for full geometry reconstruction.

🏗 = actually needs reconstruction ✂ = probably doable without doing reconstruction

I think the next questions here are:

  1. Is anything missing? And then, once the answer to this is "no"...
  2. ... do we feel that these use cases justify the complexity that geometry reassembly adds to the spec and its implementations?
varna commented 2 years ago

I'm using Google Maps JS with deck.gl MTSLayer to show MTS features on map. I don't have Mapbox GL (and queryRenderedFeatures installed). Trying to figure out a way to query an untiled Feature now.