Closed jwkaltz closed 2 years ago
This is a difficult question indeed. Since we try to combine this 2 not connectable worlds. In some past user group meeting we had some inputs in this by BS (@svareg) and BE (@peterschaer). Could we maybe use this issue to gather information in the WIKI about best practices on "how to import Interlis into PostGIS"? Because I think this is where we should look. This OGC incompatible data should already be rejected on input into the DB and not at time of processing. @kdeininger: Input on that?
Yes, this is a quite difficult question. Staying with Shapely 1.6 is not a long term solution, so we will have to specify the handling of invalid geometries somehow. I agree with @vvmruder, that invalid data should be rejected on import. But the problem (and maybe also the solution) is, that the import is not part of this project, except the script for the federal data. All users of pyramid_oereb
should be responsible for their data and have to ensure valid geometries.
We could provide a script for the standard schema trying to fix invalid geometries using ST_MakeValid. But this would be only a PostGIS solution. Everyone should somehow be able to split self-intersecting polygons into valid multi-polygons using FME, GEOS, QGIS or any other tool.
All users of
pyramid_oereb
should be responsible for their data and have to ensure valid geometries.
This is clear, but the actual problem, as I understand it, is that these are geometries which are valid according to Interlis (and the Interlis validation tools), but invalid according to the OGC-compliant tools. Ideally, we would add a "FAQ" about what to do in these cases, if the community agrees on what should be done. Plus maybe some script as you suggest.
I talked to @svareg. Some time ago we already discussed this topic and we decided to put some best practices, scripts, tools in the docs. This would contain especially the handling of data transformed from Interlis into PostGIS. We will try to improve the docs until end of march since the info needs to be gathered. In addition @peterschaer had some inputs on that. He will be available later in the month.
@jwkaltz Do you think this is sufficient?
Thanks for the infos @vvmruder , I will be happy to have a look once the information is gathered. We can then also analyse more precisely what the impact of a shapely upgrade will be.
Hello Everyone,
I'll try to give more infos about what the problem was for us in BS:
First, we are using curves in our geometries. That's a problem because shapely 1.6.4
cannot work with curves. (I don't know for shapely 1.7)
.
To be able to work with our geometries in pyramid_oereb
, we had to convert them to geometries without curves.
To do this, we are using the postgres function st_curvetoline
:
st_curvetoline(li.the_geom, 0.0025::double precision, 1)
By doing this, invalid geometries were generated, with self-intersecting parts, because we had the following errors in the initial geometries :
The black curve really intersects the blue line, because the angle defined in the curved geometry is wrong. Our geometries were not precise enough.
So as a first temporary solution, we used st_makevalid
:
st_makevalid(st_curvetoline(li.the_geom, 0.0025::double precision, 1))
This worked, but actually the initial problem was still here, because the initial geometry was invalid. Therefore, we asked the Cadastral Surveying to correct the geometries.
Now, when using just st_curvetoline
, the generated geometries are valid, and we don't need st_makevalid
anymore.
Also I generally agree with @kdeininger, all users of pyramid_oereb
should be responsible for their data and have to ensure valid geometries, and the invalid data should be rejected during the import.
@vvmruder @jwkaltz I hope those explanations can help for a Wiki ar a FAQ.
We in BE had several hundred geometries in our PostGIS database. These geometries lead to errors in pyramid_oereb preventing extracts to be produced for certain parcels. The invalid geometries were in the parcels but also in plrs (especially land use plans).
This is our solution:
We have various input sources (Interlis 1/2, ESRI FGDB, ESRI EGDB). They all contain curves. We import everything with FME where we use ArcStroker to eliminate the curves. FME is then writing the data to PostGIS. There we detect invalid geometries with st_isvalid()
. Those invalid geometries are then being corrected with st_makevalid()
. Sometimes the result of st_makevalid()
is of a different geometry type (e.g. GeometryCollection instead of Mulitpolygon). In those cases the corrected geometry has to be manipulated again with st_collectionextract()
to get the correct geometry type.
The user will never see this manipulated data. In the WMS we use another data source that is not manipulated during import. The manipulated data is exclusively used by pyramid_oereb. And here possible small sliver areas are anyway filtered out (we have a minimum size of 1 square meter).
SZ encountered the problem without any arcs involved. Our problem was caused by polygons which inner and outer boundary intersect in one point (which is valid in INTERLIS as well as in OGC [for INTERLIS see test-case RSU.T05 in https://github.com/geoadmin/suite-interlis/blob/master/doc/attributes.md#geometric-attributes-surfaces-and-tessellations; for OGC see chapter 6.1.11 in https://portal.ogc.org/files/?artifact_id=25355).
BUT: There are two ways how such polygon are coded in the transferfile (XTF):
with two \<BOUNDARY>-tags (one for outer and one for inner boundary) and
with only one \<BOUNDARY>-tag (see attached file).
Both are valid in INTERLIS (ilivalidator-1.11.10) but only first contains a valid OGC geometry. Second fails OGC-simple (self-intersection in 2D) and caused problems in pyramid_oereb (shapely 1.7).
our solution for oereb-data:
And parallel:
Files:
I performed some further analysis of geometrical operations with shapely. Handling of invalid geometries has strongly changed between shapely versions 1.6, 1.7, 1.8
Basis for the tests are the following invalid polygons:
Both geometries result in the "butterfly" shape below:
Another invalid "crescent" geometry is used for test purpose:
Shapely 1.6.4.post1 still depends on system lib libgeos-c1v5 (libgeos-c1v5/bionic 3.6.2-1build2 on ubuntu 18.04)
if libgeos-c is not installed the command import shapely.geometry
fails
Shapely 1.6.4.post2 does not depend any longer on libgeos
both 1.6.4.post1 and 1.6.4.post2 fail to install without libgeos-c (SError: Could not find library geos_c or load any of its variants ['libgeos_c.so.1', 'libgeos_c.so'])
behaviour of shapely 1.6.4.post1 depends on the libgeos version:
no dependency on geos_c
incompatible with python <=3.5
1.8 has a new method shapely.validation.make_valid
-> geometries are partly made valid for geometric operations
The code below tests the intersection of a valid "box" with an "invalid" butterfly.
import shapely.geometry
shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])).wkt
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])).type)'
version | geometry type |
---|---|
1.6.4.post1 + libgeos 3.7.1 | Polygon |
1.6.4.post1 + libgeos 3.8.0 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.6.4.post2 | Polygon |
1.7.1 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.8.0 | MultiPolygon |
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])).is_valid)'
version | geometry validity |
---|---|
1.6.4.post1 + libgeos 3.7.1 | False |
1.6.4.post1 + libgeos 3.8.0 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.6.4.post2 | False |
1.7.1 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.8.0 | True |
Using another geometry (crescent-shape forming an island) shapely 1.8 converts the invalid crescent to a valid polygon with a hole touching the border
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).type)'
version | geometry type |
---|---|
1.6.4.post1 + libgeos 3.7.1 | Polygon |
1.6.4.post1 + libgeos 3.8.0 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.6.4.post2 | Polygon |
1.7.1 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.8.0 | Polygon |
python -c 'import shapely.geometry;print(len(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).interiors))'
version | number of interior rings |
---|---|
1.6.4.post1 + libgeos 3.7.1 | 0 |
1.6.4.post1 + libgeos 3.8.0 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.6.4.post2 | 0 |
1.7.1 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.8.0 | 1 |
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).is_valid)'
version | geometry validity |
---|---|
1.6.4.post1 + libgeos 3.7.1 | False |
1.6.4.post1 + libgeos 3.8.0 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.6.4.post2 | False |
1.7.1 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.8.0 | True |
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 3, 2).intersection(shapely.geometry.Polygon([(0, 0), (3, 0), (3, 2), (1.5, 1.5), (2, 1), (1, 1), (1.5, 1.5), (0, 2), (0, 0)])).wkt)'
version | geometry wkt |
---|---|
1.6.4.post1 + libgeos 3.7.1 | POLYGON ((3 2, 3 0, 0 0, 0 2, 1.5 1.5, 1 1, 2 1, 1.5 1.5, 3 2)) |
1.6.4.post1 + libgeos 3.8.0 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.6.4.post2 | POLYGON ((3 2, 3 0, 0 0, 0 2, 1.5 1.5, 1 1, 2 1, 1.5 1.5, 3 2)) |
1.7.1 | shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. |
1.8.0 | POLYGON ((3 0, 0 0, 0 2, 1.5 1.5, 3 2, 3 0), (1 1, 2 1, 1.5 1.5, 1 1)) |
python -c 'import shapely.geometry;print(shapely.geometry.box(0, 0, 1, 1).intersection(shapely.geometry.Polygon([(0,0), (1, 1), (1, 0), (0,1), (0, 0)])).is_valid)'
shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.
python -c 'import shapely.validation;print(shapely.validation.make_valid(shapely.geometry.Polygon([(0,0), (.5, .5), (1, 0), (1, 1), (0.5, 0.5), (0,1), (0, 0)])))'
MULTIPOLYGON (((0 1, 0.5 0.5, 0 0, 0 1)), ((1 0, 0.5 0.5, 1 1, 1 0)))
python -c 'import shapely.validation;print(shapely.validation.make_valid(shapely.geometry.Polygon([(0,0), (1, 1), (1, 0), (0,1), (0, 0)])))'
MULTIPOLYGON (((0 1, 0.5 0.5, 0 0, 0 1)), ((1 0, 0.5 0.5, 1 1, 1 0)))
@mki-c2c for this detailed summary. It confirms a little what we discussed before. Better do not import invalid data...
Remark:
Regarding the make_valid
method of shapely I would be really really careful. The experience of handling geometries teaches that this leads to unpredictable behavior. I never would used such thing unsupervised and on a prod system. Especially not when handling official data which calculates areas and delivers percentage of coverage.
postgres=# select st_astext(st_intersection(st_geomfromtext('POLYGON ((0 0, .5 .5, 1 0, 1 1, .5 .5, 0 1, 0 0))'), st_geomfromtext('POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))')));
st_astext
-------------------------------------------------------------------
MULTIPOLYGON(((0 1,0.5 0.5,0 0,0 1)),((0.5 0.5,1 1,1 0,0.5 0.5)))
(1 row)
### self-crossing geometry not accepted
postgres=# select st_intersection(st_geomfromtext('POLYGON ((0 0, 1 1, 1 0, 0 1, 0 0))'), st_geomfromtext('POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))')); ERROR: lwgeom_intersection_prec: GEOS Error: TopologyException: Input geom 0 is invalid: Self-intersection at 0.5 0.5
### self-crossing geometry can be made valid explicitly
MULTIPOLYGON(((0 1,0.5 0.5,0 0,0 1)),((0.5 0.5,1 1,1 0,0.5 0.5))) (1 row)
### both geometries are self-intersecting and invalid
f (1 row)
f (1 row)
### makevalid creates the same multipolygon for both geometries
MULTIPOLYGON(((0 0,0 1,0.5 0.5,0 0)),((1 0,0.5 0.5,1 1,1 0))) (1 row)
MULTIPOLYGON(((0 0,0 1,0.5 0.5,0 0)),((1 0,0.5 0.5,1 1,1 0))) (1 row)
additional information on validitty calculation in GEOS can be found in the documentation of the GEOS function setSelfTouchingRingFormingHoleValid
The requirements for processing by pyramid_oereb in case of invalid geometries should be clarified. This use case can happen when data was produced according to Interlis in a form which is valid in Interlis, but which is considered invalid by the OGC ("Ring Self-intersection").
In #1109 some tests were added to see the current behavior of pyramid_oereb when an invalid geometry is processed. When processing in Python 3.6 and Shapely 1.6, the processing still works when such a geometry is processed. However, when using Shapely 1.7 the same code generates an exception (see #1013).
Therefore, the community needs to specify the requirements for processing by pyramid_oereb when an invalid geometry (invalid according to OGC):