manaakiwhenua / vector2dggs

DGGS indexer for vector data
https://pypi.org/project/vector2dggs/
GNU Lesser General Public License v3.0

MultiPolygon input with some invalid geometries raises exception #23

Closed by alpha-beta-soup 1 year ago

alpha-beta-soup commented 1 year ago

Invalid input:

MULTIPOLYGON (((3119458.185189217 7571519.097363896, 3119458.185189232 7571519.097363843, 3119459.7551563787 7571512.447556131, 3119461.3469764567 7571507.733667886, 3119465.2474705516 7571493.714958249, 3119465.2487741904 7571493.709811926, 3119466.8242175067 7571479.729848367, 3119466.606453425 7571468.837552614, 3119465.9542556694 7571461.11235766, 3119466.335023812 7571455.260869999, 3119467.4797128793 7571449.423873049, 3119469.076620066 7571446.146891871, 3119467.2411173247 7571436.93538059, 3119464.509318866 7571433.820400377, 3119456.202147156 7571429.564513377, 3119454.9845944983 7571426.991843466, 3119454.277359838 7571424.429867966, 3119455.1000610054 7571421.64477695, 3119456.0361964665 7571420.215887405, 3119457.2293895404 7571417.357124316, 3119457.2293895446 7571417.357124307, 3119454.3578499476 7571408.8896815125, 3119447.362811734 7571402.879130931, 3119447.3628117247 7571402.879130931, 3119441.295549791 7571402.639042578, 3119438.5252920645 7571402.367713664, 3119438.5252920436 7571402.367713663, 3119457.2204702315 7571469.566269882, 3119443.6704017855 7571543.159236515, 3119444.668530347 7571541.303743437, 3119451.9696993395 7571528.8253820855, 3119458.185189192 7571519.097363936, 3119458.185189217 7571519.097363896)), ((3119438.525292046 7571402.367713663, 3119438.5252920436 7571402.367713663, 3119438.5252920436 7571402.367713663, 3119438.525292046 7571402.367713663)))

shapely.validation.explain_validity: Too few points in geometry component[3119438.52529205 7571402.36771366]

The result of shapely.validation.make_valid:

GEOMETRYCOLLECTION (POLYGON ((3119458.185189232 7571519.097363843, 3119459.7551563787 7571512.447556131, 3119461.3469764567 7571507.733667886, 3119465.2474705516 7571493.714958249, 3119465.2487741904 7571493.709811926, 3119466.8242175067 7571479.729848367, 3119466.606453425 7571468.837552614, 3119465.9542556694 7571461.11235766, 3119466.335023812 7571455.260869999, 3119467.4797128793 7571449.423873049, 3119469.076620066 7571446.146891871, 3119467.2411173247 7571436.93538059, 3119464.509318866 7571433.820400377, 3119456.202147156 7571429.564513377, 3119454.9845944983 7571426.991843466, 3119454.277359838 7571424.429867966, 3119455.1000610054 7571421.64477695, 3119456.0361964665 7571420.215887405, 3119457.2293895404 7571417.357124316, 3119457.2293895446 7571417.357124307, 3119454.3578499476 7571408.8896815125, 3119447.362811734 7571402.879130931, 3119447.3628117247 7571402.879130931, 3119441.295549791 7571402.639042578, 3119438.5252920645 7571402.367713664, 3119438.5252920436 7571402.367713663, 3119457.2204702315 7571469.566269882, 3119443.6704017855 7571543.159236515, 3119444.668530347 7571541.303743437, 3119451.9696993395 7571528.8253820855, 3119458.185189192 7571519.097363936, 3119458.185189217 7571519.097363896, 3119458.185189232 7571519.097363843)), LINESTRING (3119438.525292046 7571402.367713663, 3119438.5252920436 7571402.367713663))

But note that the result is now a GEOMETRYCOLLECTION containing a LINESTRING that was not in the input; the input was all of one type, MULTIPOLYGON. So the input is correctly identified as invalid, and is made valid, but the output now has a surprising, mixed type.
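To make the type change easier to reproduce, here is a minimal sketch on a toy geometry. The WKT below is hypothetical, a much smaller stand-in for the real input above: one ordinary square plus a degenerate part whose ring has only two distinct points. The exact shape of the make_valid output may vary with the Shapely/GEOS version, so the sketch only lists the part types rather than assuming them.

```python
from shapely import wkt
from shapely.validation import explain_validity, make_valid

# Hypothetical toy input: a valid square plus a degenerate polygon part
# whose ring collapses to two distinct points.
toy = wkt.loads(
    "MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0)), "
    "((20 20, 21 20, 21 20, 20 20)))"
)

print(toy.is_valid)           # validity flag for the raw input
print(explain_validity(toy))  # human-readable reason for the failure

repaired = make_valid(toy)

# Collect the geometry type of every part, to spot non-polygonal leftovers.
parts = getattr(repaired, "geoms", [repaired])
part_types = sorted({part.geom_type for part in parts})
print(repaired.geom_type, part_types)
```

A check like `part_types` is one way to detect that a repair has introduced lines or points into what was purely polygonal input.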

Then note how small the LINESTRING part of the make_valid result is: its two endpoints differ by only about 2e-9 coordinate units. If we apply a buffer of 0 after make_valid, with geometry.buffer(0), we get a cleaner output that does not cause later exceptions to be raised:

POLYGON ((3119458.185189232 7571519.097363843, 3119459.7551563787 7571512.447556131, 3119461.3469764567 7571507.733667886, 3119465.2474705516 7571493.714958249, 3119465.2487741904 7571493.709811926, 3119466.8242175067 7571479.729848367, 3119466.606453425 7571468.837552614, 3119465.9542556694 7571461.11235766, 3119466.335023812 7571455.260869999, 3119467.4797128793 7571449.423873049, 3119469.076620066 7571446.146891871, 3119467.2411173247 7571436.93538059, 3119464.509318866 7571433.820400377, 3119456.202147156 7571429.564513377, 3119454.9845944983 7571426.991843466, 3119454.277359838 7571424.429867966, 3119455.1000610054 7571421.64477695, 3119456.0361964665 7571420.215887405, 3119457.2293895404 7571417.357124316, 3119457.2293895446 7571417.357124307, 3119454.3578499476 7571408.8896815125, 3119447.362811734 7571402.879130931, 3119447.3628117247 7571402.879130931, 3119441.295549791 7571402.639042578, 3119438.5252920645 7571402.367713664, 3119438.5252920436 7571402.367713663, 3119457.2204702315 7571469.566269882, 3119443.6704017855 7571543.159236515, 3119444.668530347 7571541.303743437, 3119451.9696993395 7571528.8253820855, 3119458.185189192 7571519.097363936, 3119458.185189217 7571519.097363896, 3119458.185189232 7571519.097363843))

And we're also back to a single Polygon.

Shapely notes:

A positive distance produces a dilation, a negative distance an erosion. A very small or zero distance may sometimes be used to “tidy” a polygon.

I think that's an appropriate thing to do to any input polygon that we're already modifying with make_valid.
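A minimal sketch of that idea follows. The helper name `make_valid_and_tidy` and the toy WKT are hypothetical and are not taken from the vector2dggs code base; the sketch also assumes polygonal input, since buffer(0) on a line or point returns an empty polygon.

```python
from shapely import wkt
from shapely.validation import make_valid


def make_valid_and_tidy(geometry):
    """Hypothetical helper: repair an invalid polygonal geometry, then tidy it."""
    if geometry.is_valid:
        # Leave geometries that are already valid untouched.
        return geometry
    repaired = make_valid(geometry)
    # A zero-distance buffer drops zero-area leftovers (lines, points)
    # that the repair may have produced.
    return repaired.buffer(0)


# Hypothetical toy input: a valid square plus a degenerate polygon part.
toy = wkt.loads(
    "MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0)), "
    "((20 20, 21 20, 21 20, 20 20)))"
)
tidied = make_valid_and_tidy(toy)
print(tidied.geom_type, tidied.is_valid, tidied.area)
```

Restricting the buffer to geometries that needed make_valid matches the suggestion above: only inputs that are already being modified get the extra tidy step.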