openoereb / pyramid_oereb

pyramid_oereb is an open-source implementation of the server side part for the Swiss "Cadastre of Public-law Restrictions on landownership" (PLR-cadastre).
Other
8 stars 22 forks source link

Clarify requirements for invalid geometries #1111

Closed jwkaltz closed 2 years ago

jwkaltz commented 3 years ago

The requirements for processing by pyramid_oereb in case of invalid geometries should be clarified. This use case can happen when data was produced according to Interlis in a form which is valid in Interlis, but which is considered invalid by the OGC ("Ring Self-intersection").

In #1109 some tests were added to see the current behavior of pyramid_oereb when an invalid geometry is processed. When processing in Python 3.6 and Shapely 1.6, the processing still works when such a geometry is processed. However, when using Shapely 1.7 the same code generates an exception (see #1013).

Therefore, the community needs to specify the requirements for processing by pyramid_oereb when an invalid geometry (invalid according to OGC):

vvmruder commented 3 years ago

This is a difficult question indeed. Since we try to combine this 2 not connectable worlds. In some past user group meeting we had some inputs in this by BS (@svareg) and BE (@peterschaer). Could we maybe use this issue to gather information in the WIKI about best practices on "how to import Interlis into PostGIS"? Because I think this is where we should look. This OGC incompatible data should already be rejected on input into the DB and not at time of processing. @kdeininger: Input on that?

kdeininger commented 3 years ago

Yes, this is a quite difficult question. Staying with Shapely 1.6 is not a long term solution, so we will have to specify the handling of invalid geometries somehow. I agree with @vvmruder, that invalid data should be rejected on import. But the problem (and maybe also the solution) is, that the import is not part of this project, except the script for the federal data. All users of pyramid_oereb should be responsible for their data and have to ensure valid geometries.

We could provide a script for the standard schema trying to fix invalid geometries using ST_MakeValid. But this would be only a PostGIS solution. Everyone should somehow be able to split self-intersecting polygons into valid multi-polygons using FME, GEOS, QGIS or any other tool.

jwkaltz commented 3 years ago

All users of pyramid_oereb should be responsible for their data and have to ensure valid geometries.

This is clear, but the actual problem, as I understand it, is that these are geometries which are valid according to Interlis (and the Interlis validation tools), but invalid according to the OGC-compliant tools. Ideally, we would add a "FAQ" about what to do in these cases, if the community agrees on what should be done. Plus maybe some script as you suggest.

vvmruder commented 3 years ago

I talked to @svareg. Some time ago we already discussed this topic and we decided to put some best practices, scripts, tools in the docs. This would contain especially the handling of data transformed from Interlis into PostGIS. We will try to improve the docs until end of march since the info needs to be gathered. In addition @peterschaer had some inputs on that. He will be available later in the month.

@jwkaltz Do you think this is sufficient?

jwkaltz commented 3 years ago

Thanks for the infos @vvmruder , I will be happy to have a look once the information is gathered. We can then also analyse more precisely what the impact of a shapely upgrade will be.

svareg commented 3 years ago

Hello Everyone,

I'll try to give more infos about what the problem was for us in BS: First, we are using curves in our geometries. That's a problem because shapely 1.6.4cannot work with curves. (I don't know for shapely 1.7).

To be able to work with our geometries in pyramid_oereb, we had to convert them to geometries without curves. To do this, we are using the postgres function st_curvetoline:

st_curvetoline(li.the_geom, 0.0025::double precision, 1)

By doing this, invalid geometries were generated, with self-intersecting parts, because we had the following errors in the initial geometries :

image

The black curve really intersects the blue line, because the angle defined in the curved geometry is wrong. Our geometries were not precise enough.

So as a first temporary solution, we used st_makevalid:

st_makevalid(st_curvetoline(li.the_geom, 0.0025::double precision, 1))

This worked, but actually the initial problem was still here, because the initial geometry was invalid. Therefore, we asked the Cadastral Surveying to correct the geometries.

Now, when using just st_curvetoline, the generated geometries are valid, and we don't need st_makevalidanymore.

Also I generally agree with @kdeininger, all users of pyramid_oerebshould be responsible for their data and have to ensure valid geometries, and the invalid data should be rejected during the import.

@vvmruder @jwkaltz I hope those explanations can help for a Wiki ar a FAQ.

peterschaer commented 3 years ago

We in BE had several hundred geometries in our PostGIS database. These geometries lead to errors in pyramid_oereb preventing extracts to be produced for certain parcels. The invalid geometries were in the parcels but also in plrs (especially land use plans).

This is our solution: We have various input sources (Interlis 1/2, ESRI FGDB, ESRI EGDB). They all contain curves. We import everything with FME where we use ArcStroker to eliminate the curves. FME is then writing the data to PostGIS. There we detect invalid geometries with st_isvalid(). Those invalid geometries are then being corrected with st_makevalid(). Sometimes the result of st_makevalid() is of a different geometry type (e.g. GeometryCollection instead of Mulitpolygon). In those cases the corrected geometry has to be manipulated again with st_collectionextract() to get the correct geometry type.

The user will never see this manipulated data. In the WMS we use another data source that is not manipulated during import. The manipulated data is exclusively used by pyramid_oereb. And here possible small sliver areas are anyway filtered out (we have a minimum size of 1 square meter).

KunoEpper commented 3 years ago

SZ encountered the problem without any arcs involved. Our problem was caused by polygons which inner and outer boundary intersect in one point (which is valid in INTERLIS as well as in OGC [for INTERLIS see test-case RSU.T05 in https://github.com/geoadmin/suite-interlis/blob/master/doc/attributes.md#geometric-attributes-surfaces-and-tessellations; for OGC see chapter 6.1.11 in https://portal.ogc.org/files/?artifact_id=25355).

BUT: There are two ways how such polygon are coded in the transferfile (XTF):

  1. with two \<BOUNDARY>-tags (one for outer and one for inner boundary) and

  2. with only one \<BOUNDARY>-tag (see attached file).

Both are valid in INTERLIS (ilivalidator-1.11.10) but only first contains a valid OGC geometry. Second fails OGC-simple (self-intersection in 2D) and caused problems in pyramid_oereb (shapely 1.7).

our solution for oereb-data:

And parallel:

Files:

mki-c2c commented 2 years ago

I performed some further analysis of geometrical operations with shapely. Handling of invalid geometries has strongly changed between shapely versions 1.6, 1.7, 1.8

Basis for the tests are the following invalid polygons:

Both geometries result in the "butterfly" shape below: image

Another invalid "crescent" geometry is used for test purpose: image

mki-c2c commented 2 years ago

Dependencies of different shapely versions:

shapely 1.6

installation under python 3.7

Shapely 1.6.4.post1 still depends on system lib libgeos-c1v5 (libgeos-c1v5/bionic 3.6.2-1build2 on ubuntu 18.04) if libgeos-c is not installed the command import shapely.geometry fails

Shapely 1.6.4.post2 does not depend any longer on libgeos

installation under python 3.8

both 1.6.4.post1 and 1.6.4.post2 fail to install without libgeos-c (SError: Could not find library geos_c or load any of its variants ['libgeos_c.so.1', 'libgeos_c.so'])

behaviour of GEOS

behaviour of shapely 1.6.4.post1 depends on the libgeos version:

shapely 1.7

no dependency on geos_c

shapely 1.8

incompatible with python <=3.5 1.8 has a new method shapely.validation.make_valid -> geometries are partly made valid for geometric operations

behaviour of different shapely versions:

The code below tests the intersection of a valid "box" with an "invalid" butterfly.

import shapely.geometry
    shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])).wkt

geometry type in different shapely versions

python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])).type)'
version geometry type
1.6.4.post1 + libgeos 3.7.1 Polygon
1.6.4.post1 + libgeos 3.8.0 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.6.4.post2 Polygon
1.7.1 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.8.0 MultiPolygon

resulting geometry validity in different shapely versions

python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])).is_valid)'
version geometry validity
1.6.4.post1 + libgeos 3.7.1 False
1.6.4.post1 + libgeos 3.8.0 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.6.4.post2 False
1.7.1 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.8.0 True

1.6.4 behaviour depending on version of libgeos:

ring number for different shapely versions

Using another geometry (crescent-shape forming an island) shapely 1.8 converts the invalid crescent to a valid polygon with a hole touching the border

python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).type)'
version geometry type
1.6.4.post1 + libgeos 3.7.1 Polygon
1.6.4.post1 + libgeos 3.8.0 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.6.4.post2 Polygon
1.7.1 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.8.0 Polygon
python -c 'import shapely.geometry;print(len(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).interiors))'
version number of interior rings
1.6.4.post1 + libgeos 3.7.1 0
1.6.4.post1 + libgeos 3.8.0 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.6.4.post2 0
1.7.1 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.8.0 1
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).is_valid)'
version geometry validity
1.6.4.post1 + libgeos 3.7.1 False
1.6.4.post1 + libgeos 3.8.0 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.6.4.post2 False
1.7.1 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.8.0 True
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).wkt)'
version geometry wkt
1.6.4.post1 + libgeos 3.7.1 POLYGON ((3 2, 3 0, 0 0, 0 2, 1.5 1.5, 1 1, 2 1, 1.5 1.5, 3 2))
1.6.4.post1 + libgeos 3.8.0 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.6.4.post2 POLYGON ((3 2, 3 0, 0 0, 0 2, 1.5 1.5, 1 1, 2 1, 1.5 1.5, 3 2))
1.7.1 shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
1.8.0 POLYGON ((3 0, 0 0, 0 2, 1.5 1.5, 3 2, 3 0), (1 1, 2 1, 1.5 1.5, 1 1))

further particularities of shapely versions

self-crossing geometry always raises an error (shapely 1.6, 1.7, 1.8):

python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (1, 1), (1, 0), (0,1), (0, 0)])).is_valid)'

shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.

shapely 1.8.0 -> use of new function make_valid: 2 different invalid geometries result in the same valid one

python -c 'import shapely.validation;print(shapely.validation.make_valid(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])))'

MULTIPOLYGON (((0 1, 0.5 0.5, 0 0, 0 1)), ((1 0, 0.5 0.5, 1 1, 1 0)))

python -c 'import shapely.validation;print(shapely.validation.make_valid(shapely.geometry.Polygon([(0,0), (1, 1), (1, 0), (0,1), (0, 0)])))'

MULTIPOLYGON (((0 1, 0.5 0.5, 0 0, 0 1)), ((1 0, 0.5 0.5, 1 1, 1 0)))

vvmruder commented 2 years ago

@mki-c2c for this detailed summary. It confirms a little what we discussed before. Better do not import invalid data...

Remark:

Regarding the make_valid method of shapely I would be really really careful. The experience of handling geometries teaches that this leads to unpredictable behavior. I never would used such thing unsupervised and on a prod system. Especially not when handling official data which calculates areas and delivers percentage of coverage.

mki-c2c commented 2 years ago

for completeness, find here some geometry results with postgis:

### self-crossing geometry not accepted

postgres=# select st_intersection(st_geomfromtext('POLYGON ((0 0, 1 1, 1 0, 0 1, 0 0))'), st_geomfromtext('POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))')); ERROR: lwgeom_intersection_prec: GEOS Error: TopologyException: Input geom 0 is invalid: Self-intersection at 0.5 0.5

### self-crossing geometry can be made valid explicitly

postgres=# select st_astext(st_intersection(st_makevalid(st_geomfromtext('POLYGON ((0 0, 1 1, 1 0, 0 1, 0 0))')), st_geomfromtext('POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))'))); st_astext

MULTIPOLYGON(((0 1,0.5 0.5,0 0,0 1)),((0.5 0.5,1 1,1 0,0.5 0.5))) (1 row)

### both geometries are self-intersecting and invalid

postgres=# select st_isvalid(st_geomfromtext('POLYGON ((0 0, .5 .5, 1 0, 1 1, .5 .5, 0 1, 0 0))')); NOTICE: Ring Self-intersection at or near point 0.5 0.5 st_isvalid

f (1 row)

postgres=# select st_isvalid(st_geomfromtext('POLYGON ((0 0, 1 1, 1 0, 0 1, 0 0))')); NOTICE: Self-intersection at or near point 0.5 0.5 st_isvalid

f (1 row)

### makevalid creates the same multipolygon for both geometries

postgres=# select st_astext(st_makevalid(st_geomfromtext('POLYGON ((0 0, .5 .5, 1 0, 1 1, .5 .5, 0 1, 0 0))'))); st_astext

MULTIPOLYGON(((0 0,0 1,0.5 0.5,0 0)),((1 0,0.5 0.5,1 1,1 0))) (1 row)

postgres=# select st_astext(st_makevalid(st_geomfromtext('POLYGON ((0 0, 1 1, 1 0, 0 1, 0 0))'))); st_astext

MULTIPOLYGON(((0 0,0 1,0.5 0.5,0 0)),((1 0,0.5 0.5,1 1,1 0))) (1 row)

mki-c2c commented 2 years ago

additional information on validitty calculation in GEOS can be found in the documentation of the GEOS function setSelfTouchingRingFormingHoleValid

https://libgeos.org/doxygen/classgeos_1_1operation_1_1valid_1_1IsValidOp.html#ae2997fc9c969f19517184377df9249de