rapidsai / cuspatial

CUDA-accelerated GIS and spatiotemporal algorithms
https://docs.rapids.ai/api/cuspatial/stable/
Apache License 2.0
612 stars 154 forks source link

[BUG]: Basic predicate operations for Polygon.contains(LineString) fail in small specific case. #1198

Open thomcom opened 1 year ago

thomcom commented 1 year ago

Version

23.08

On which installation method(s) does this occur?

Rapids-Compose

Describe the issue

It turns out that our strategy of using three basic predicates with pip, intersection, and point equality does not work with a particular arrangement of Polygon.contains(Linestring)

That arrangement is visualized here:

image

In this example, a LineString that shares only border vertices with a containing polygon can't be distinguished as to whether or not the interior segment is inside or outside of the polygon. This effect occurs regardless of the length of the LineString, so long as the LineString segments are all on the border of the polygon.

The normal approach works well if any vertex is added to the LineString that is not on the border. We'll have to think of a deeper solution to this particular test case.

Minimum reproducible example

concave = cuspatial.GeoSeries([
    Polygon([
        (0, 0),
        (0, 1),
        (0.25, 1.0),
        (0.5, 0.5),
        (0.75, 1.0),
        (1, 1),
        (1, 0),
        (0, 0)
    ])
])
convex = cuspatial.GeoSeries([
    Polygon([
        (0, 0),
        (0, 1),
        (0.25, 1.0),
        (0.5, 1.5),
        (0.75, 1.0),
        (1, 1),
        (1, 0),
        (0, 0)
    ])
])
rhs = cuspatial.GeoSeries([LineString([(0, 1), (1, 1)])])
print(concave.contains(rhs))
print(convex.contains(rhs))

Relevant log output

0    False
dtype: bool
0    False
dtype: bool

Environment details

No response

Other/Misc.

No response

thomcom commented 1 year ago

The following algorithm describes the approach we defined for correcting Polygon.contains. This whole routine should be written in libcuspatial.

lhs.contains(rhs) where lhs (left-hand side) is a Polygon and rhs (right-hand side) is a Polygon or LineString does not depend on the vertices that define the rhs geometry, but depends on the line segments.

The operation is the same for polygons or linestrings in the rhs. A polygon is simply a closed LineSegment and a LineString is, by definition, unclosed.

The algorithm for computing lhs.contains(rhs) is as follows:

intersection = linestring_intersection_for_contains(lhs, rhs)

linestring_intersection_for_contains should compute two results for the lhs and rhs, for many lhs and rhs in parallel:

possibles = ~intersection[0]

possibles is the subset of lhs/rhs pairs for which the rhs may be contained in the lhs, because there is no proper intersection between them.

split = add_vertex_to_linestring(rhs[possibles], intersection[1])

add_vertex_to_linestring splits the segment containing each vertex into two segments at that vertex.

difference = pairwise_linestring_difference(split, intersection[2])

pairwise_linestring_difference subtracts each segment in the second argument from the linestring in the first argument. Split and difference could be combined.

centers = segments_to_centroids(difference)

pip_count = point_in_polygon_count(lhs, centers)

return pip_count == centers.sizes