libgeos / geos

Geometry Engine, Open Source
https://libgeos.org
GNU Lesser General Public License v2.1
1.17k stars 353 forks source link

Implement polygon subdivision and expose via C API #717

Open brendan-ward opened 1 year ago

brendan-ward commented 1 year ago

This migrates this issue from Trac: https://trac.osgeo.org/geos/ticket/1075 with minor updates here to reflect PyGEOS => Shapely 2.0.

We would like to expose subdivide functionality in Shapely (originally raised in pygeos #244 and attempted in pygeos #256) and downstream libraries like GeoPandas.

This is relatively similar in intent to ST_Subdivide in PostGIS (​https://postgis.net/docs/ST_Subdivide.html). While this could be handled in client libraries, there may be broader appeal to this functionality in GEOS, as well as the ability to leverage some of the internals for better performance.

For highly complex, irregular polygons (e.g., country or continent boundaries with coastlines), performing subdivision first can greatly speed up spatial index, predicate operations (e.g., intersects), and subsequent intersections.

Essentially, the algorithm in ST_Subdivide recursively subdivides a polygon along the longest axis (width or height) until a target number of vertices are reached. Lower dimension results are discarded along the way.

Since each step in the recursion involves clipping (using intersection) the polygon into a given half, it seems like the noding required for each intersection operation may be a performance bottleneck, and it also involves holding a lot of geometries in memory.

It may be possible recursively calculate the subdivision points in advance, perhaps similar in overall intent to STRtree (or perhaps using a KdTree to partition vertices), and perform a final step to clip the polygon by those partitions. Using an approach that sorts vertices no more than necessary will also likely yield performance benefits (i.e., once vertices are bucketed into groups of N vertices, they do not need to be sorted further).

Presumably the output from this C API would always be a MultiPolygon, and I'm assuming the input would typically be a Polygon, but would need to decide if MultiPolygon input also makes sense (operation and output would be the same, just another level of looping).

dr-jts commented 1 year ago

Presumably the output from this C API would always be a MultiPolygon, and I'm assuming the input would typically be a Polygon, but would need to decide if MultiPolygon input also makes sense (operation and output would be the same, just another level of looping).

Since the output will in general be edge-touching polygons, providing it as a MultiPolygon would be invalid. It should be a GeometryCollection.

theroggy commented 9 months ago

For information, Spatialite offers an ST_SubDivide function, based on the rttopo library.

However, when I tried to use it, it turns out that it has robustness issues. Issue submitted here: https://git.osgeo.org/gitea/rttopo/librttopo/issues/43

EDIT: update: the issue seems to be solved in newer version of liblwgeom, the library librttopo seems to be based on... but the fix wasn't ported to librttopo. Hopefully it will happen, but doesn't seem trivial.