tilezen / vector-datasource

Tilezen vector tile service - OpenStreetMap data in several formats
https://www.nextzen.org/
Other
507 stars 119 forks source link

water features with boundary=yes gappy in some areas #735

Open meetar opened 8 years ago

meetar commented 8 years ago

Notably around Manhattan and some nearby areas:

screen shot 2016-04-20 at 2 55 39 pm

Viewable here as of 14 Oct 2016: http://tangrams.github.io/explorer/#13.0/40.7609/-73.9744/boundary/true

zerebubuth commented 8 years ago

Comment from Slack, for context:

boundaries are gappy where there's two polygons with tangential sides. both subtract each other's boundaries.

the previous approach (unioning all the water, then dividing the boundaries up) didn't have this problem but was far, far too slow. we should have a go and see if clipping the polygons to just a little bigger than a tile makes it manageable. otherwise, this might need some thinking (or NEAT tiles) to solve "properly".

nvkelso commented 8 years ago

Confirmed this is still a problem on dev with v1 tiles.

In the case of NYC area, we think this is because of the coastline derived water polygon (openstreetmapdata.com) is mostly coincidental with the riverbank polygons for Hudson, East River, and various riverbank channels of those rivers. Which is a widespread data problem we'll need to engineer around.

nvkelso commented 8 years ago
screen shot 2016-09-21 at 11 23 36
layers:
    water:
        data: { source: osm }
        draw:
            polygons:
                order: 2
                color: '#353535'
        boundary:
            filter: { boundary: true }
            draw:
                lines:
                    order: 3
                    color: red
                    width: 2px
zerebubuth commented 8 years ago

Hopefully this diagram will help explain what's going on. We have data from the database which is like the top row, with water polygons overlapping. The tops of the polygons don't overlap exactly. Although the difference has been exaggerated here to be easier to see, often the differences can be only thousandths of a "meter" unit. The bottoms of the polygons show the opposite: they align exactly.

The second row shows what's "mathematically correct" with a black dashed line standing in for some sort of tie-break between A and B.

We currently try to make this work by taking the boundary of each polygon and subtracting any other polygons which overlap it. This leads to the next few rows in the diagram, showing each polygon's boundary with the other polygon subtracted.

On the final row, these are combined together, showing the tops of the polygons "stitching" or "dashing" and the bottoms missing entirely!

text4264

I think this is because the current algorithm is commutative. The current algorithm could be summarised as:

for each polygon p:
  b = boundary(p)
  for each polygon p' != p:
    b -= p'
  output b

The missing boundary problem arises because there are some segments of the boundaries of polygons which are exactly the same, and so get subtracted from both by each other, leaving a gap. The stitching problem occurs because polygons can have very similar - but not identical - boundaries, so both subtract small segments of each other's boundary.

The previous algorithm, which was replaced because of performance problems, was similar to:

u = empty
for each polygon p:
  u = u.union(p)
b = boundary(u)
for each polygon p:
  b' = b.intersection(p)
  output b'
  b -= p

Although this is slower, creating a single boundary line up-front ensures that there are no missing sections of the boundary. Doing the subtraction in an ordered way allows us to buffer each polygon by a small amount, which will reduce (but probably not eliminate) the "stitching" or "dashing" problem.

nvkelso commented 8 years ago

Super useful diagrams, thank you!

nvkelso commented 5 years ago

Still a problem.