larsop / resolve-overlap-and-gap

This code is moving to https://gitlab.com/nibioopensource/resolve-overlap-and-gap. The plan is to use Postgis Topology to resolve overlaps and gaps for a simple feature layers. Norwegian Institute of Bioeconomy Research (NIBIO)
GNU General Public License v3.0
3 stars 1 forks source link

st_getFaceGeometry return null #7

Closed larsop closed 4 years ago

larsop commented 4 years ago

When I add data i break everything into cells and glue together at later stage. Sometimes faces and the edges are handled incorrectly at the cell borders. If I have face where this is the case I can often run this command to rebuild the simple feature geo,

command_string = FORMAT('select ST_BuildArea(ST_Collect(e.geom)) from %1$s.face f, %1$s.edge_data e where ST_Covers(ST_Expand(f.mbr,%3$s),e.geom) and face_id = %2$s', _atopology, _face_id,tolerance); execute command_string into face_geo;

The reason why we get a null geometry from st_getFaceGeometry is that that one or more of the edges often belong to very big face, and then left and right are often equal.

Im now doing some testing different smaller datasets than ar5 to be able to this the code in reasonable time.

Running this sql and getting a big number and long lines usually indicates a lot of errors.

SELECT count(*) from topo_sr16_mdata_01.edge_data where right_face = left_face;

Why is this happening ?

How can we prevent from happening ?

Is is possible to fix it if it happens ?

strk commented 4 years ago

ST_BuildArea uses only the edges you pass to it, so clearly it won't work if you don't give it ALL the edges forming a face. My recommendation here is to add the reference grid to the topology, so to make sure every single face of the topology will be covered exactly by one and one only grid cell.

This would imply adding all lines of the grid (horizontal lines, vertical lines) with TopoGeo_addLinestring.

SELECT count(*) from topo_sr16_mdata_01.edge_data where right_face = left_face;

The count you obtain above is the count of so called "dangling edges" (if one of the end nodes is shared with other edges) or "isolated edges" (if both end nodes are only connected to it).

Having those kind of edges may be the result of different things, they do not necessarely represent a malformed topology so are allowed (non-areal features would result in such kind of edges).

If you are sure you always ever passed full valid polygons or rings to the topology population functions, then the only reason for dangling or isolated edges to appear would be dimensional collapse due to either snapping or robustness (impossibility to represent an intersection point without moving a line so much to collapse to another one).

Catching these occurrences as early as possible would help building over those errors.

strk commented 4 years ago

Sorry, I just noticed you mention ST_GetFaceGeometry in the ticket summary (although you don't mention that in the ticket description). The only reason why ST_GetFaceGeometry should return NULL would be an invalid topology. ValidateTopology function might or might not identify those invalidities, if it does not it would be a bug.

larsop commented 4 years ago

The problem with https://postgis.net/docs/manual-3.0/ValidateTopology.html is that it can not be run for given area so I am not able to use it on big data sets. I need to bee able i verify/fix for a selected area if this should work I belive.

larsop commented 4 years ago

I just added ST_BuildArea to test if all the edges that are needed to build the the simple feature polygon are in the egde and that seems to be the case , also when st_GeoFaceGeometry is null and mbr i not null. So it does not look like any edges area missing.

This does not happing where often.

larsop commented 4 years ago

I did a test where I added the edges that did not very well with big datasets. I got more error and it took quite a time to remove the cell borders. I also I had to keep track the part of the cell bords that overlapped with the edges from the input and that should not be removed.

Here is the branch where I did the test https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology_use_cell_borders

strk commented 4 years ago

The problem with https://postgis.net/docs/manual-3.0/ValidateTopology.html is that it can not be run for given area

This could be a good improvement, although by definition a topology is a set of related features so the amount of features overlapping a given area may very well extend outside of it and eventually get to require looking at the whole topology (imagine specifying an area which contains the external edge of a big country, in order to verify correctness of those edges the code will still need to walk the whole external ring of the country, which would span the whole topology).

That said, it could be useful to have functions to check specific validities. The https://postgis.net/docs/manual-3.0/GetRingEdges.html function may be a component of such checker function (malformed linking could result in infinite loops or non-closed rings)

strk commented 4 years ago

On Fri, Apr 24, 2020 at 01:56:29AM -0700, Lars Aksel Opsahl wrote:

I just added ST_BuildArea to test if all the edges that are needed to build the the simple feature polygon are in the egde and that seems to be the case, also when st_GeoFaceGeometry is null and mbr i not null. So it does not look like any edges area missing.

The ST_GetFaceGeometry function is using ST_BuildArea with a collection of edges having the specified face recorded as either being on their left or right side:

https://git.osgeo.org/gitea/postgis/postgis/src/branch/master/liblwgeom/lwgeom_topo.c#L2732-L2777

So if your ST_BuildArea works and ST_GetFaceGeometry does not, we may be in presence of an error with the values of left_face and right_face for some of the edges.

Can you provide a minimal dataset and accompanying query showing the problem ? The dataset would be a dump of the topology (reduced as much as possible).

larsop commented 4 years ago

On Fri, Apr 24, 2020 at 01:56:29AM -0700, Lars Aksel Opsahl wrote: I just added ST_BuildArea to test if all the edges that are needed to build the the simple feature polygon are in the egde and that seems to be the case, also when st_GeoFaceGeometry is null and mbr i not null. So it does not look like any edges area missing. The ST_GetFaceGeometry function is using ST_BuildArea with a collection of edges having the specified face recorded as either being on their left or right side: https://git.osgeo.org/gitea/postgis/postgis/src/branch/master/liblwgeom/lwgeom_topo.c#L2732-L2777 So if your ST_BuildArea works and ST_GetFaceGeometry does not, we may be in presence of an error with the values of left_face and right_face for some of the edges.

Yes that's the problem , when I checked the last time some edges referring to a very big face and not the ones close by.

Can you provide a minimal dataset and accompanying query showing the problem ? The dataset would be a dump of the topology (reduced as much as possible).

I will find an example

larsop commented 4 years ago

Here is example where we have a dangling egde which should not be dangling

Screenshot 2020-04-24 at 11 58 28

Screenshot 2020-04-24 at 12 00 17

larsop commented 4 years ago

So basically what I need to after after all cells are glued together, I need find not truly dangling edges (like the one above) and update these with the correct left and right face .

Is this possible/difficult ?

strk commented 4 years ago

What you should find out is HOW those not-truly dangling edges were allowed in the database. May I suggest to use the QGIS TopologyViewer plugin of DBManager to further inspect the topology ? I'm happy to look at it myself, if you handle to produce a small enough version to transfer

strk commented 4 years ago

Updating the left and right face for an edge could be as simple as running an UPDATE against the topology's edge_data table, finding the right values may not be that simple. Corruption of the topology should not happen in the first place.

larsop commented 4 years ago

Updating the left and right face for an edge could be as simple as running an UPDATE against the topology's edge_data table, finding the right values may not be that simple. Corruption of the topology should not happen in the first place.

Yes this is may main problem I was sure about how about figure out what face to use in a simple way.

larsop commented 4 years ago

What you should find out is HOW those not-truly dangling edges were allowed in the database.

I belive this happens in this way . 1) A part of edge are inserted in phase one and then it's truly dangling egde. 2) The rest of edge are inserted and we a complete set og edges that can form a polygon.

Usually this goes ok and we get face with a valid geometry based in this edges, but sometimes this fail. I have not been able catch when happens.

May I suggest to use the QGIS TopologyViewer plugin of DBManager to further inspect the topology ?

Yes I can use this plugin , but when I run I see logs faces that have this problem when I created for instance a simple feature layer based this Topology layer.

I'm happy to look at it myself, if you handle to produce a small enough version to transfer

I have not been able to produce a simpe test case here and it seems to vary from time to time and layer to layer.

strk commented 4 years ago

What you should find out is HOW those not-truly dangling edges were allowed in the database.

I belive this happens in this way .

1. A part of edge are inserted in phase one and then it's truly dangling egde.
2. The rest of edge are inserted and we a complete set og edges that can form a polygon.

Usually this goes ok and we get face with a valid geometry based in this edges, but sometimes this fail. I have not been able catch when happens.

One possible cause of it could be 2 transactions running at the same time and the last one committing work which overrides the work performed by the other. This is what you want to avoid and can surely happen. If this is the case you should NEVER be able to see such behaviour when ensuring a single client ever edits the Topology (I know you're trying to avoid this, but worth verifying).

One other possible cause (if running single-thread) could be that when you did split the original polygon into multiple edges, some edge gets threated (snapped, due to tolerance) in a way that it ends up NOT being connected anymore with its "other side", resulting in an open ring.

May I suggest to use the QGIS TopologyViewer plugin of DBManager to further inspect the topology ?

Yes I can use this plugin , but when I run I see logs faces that have this problem when I created for instance a simple feature layer based this Topology layer.

The TopologyViewer can be used to just inspect edge table, which gives more information here. Keep the "faces" layers off, just see edges and their attributes, as they give a visual indication of which edge has which face on which side, and which edges are supposed to be connected to each edge, via next_left, next_right edge attributes, these two classes of attributes can be inspected-with-your-eyes to verify their correctness (togheter with the direction of edges).

larsop commented 4 years ago

The TopologyViewer can be used to just inspect edge table, which gives more information here. Keep the "faces" layers off, just see edges and their attributes, as they give a visual indication of which edge has which face on which side, and which edges are supposed to be connected to each edge, via next_left, next_right edge attributes, these two classes of attributes can be inspected-with-your-eyes to verify their correctness (togheter with the direction of edges).

This sound very nice , I have some problems to use the way describe . I have upgraded qgis 3.10 and in the DB Manager I see this. Is it any user documentation for how to use this with Postgis Topology.

Screenshot 2020-04-25 at 07 35 21
larsop commented 4 years ago

One other possible cause (if running single-thread) could be that when you did split the original polygon into multiple edges, some edge gets threated (snapped, due to tolerance) in a way that it ends up NOT being connected anymore with its "other side", resulting in an open ring.

I did this test now and it does not seem to be a snap to problem . (And I have a temporary way to fix as it looks)

` select * from topo_sr16_mdata_02.edge_data where edge_id = 756267; edge_id | 756267 start_node | 690032 end_node | 555208 next_left_edge | 754795 abs_next_left_edge | 754795 next_right_edge | -931776 abs_next_right_edge | 931776 left_face | 320375 right_face | 320375 geom |

` select topology.TopoGeo_addLinestring('topo_sr16_mdata_02',geom,1) from (select ST_RemEdgeModFace('topo_sr16_mdata_02',edge_id), geom from topo_sr16_mdata_02.edge_data where edge_id = 756267) as r; topogeo_addlinestring

           1106439

` select * from topo_sr16_mdata_02.edge_data where edge_id = 1106439;

edge_id | 1106439 start_node | 690032 end_node | 555208 next_left_edge | 754795 abs_next_left_edge | 754795 next_right_edge | -931776 abs_next_right_edge | 931776 left_face | 324770 right_face | 320375 geom | 0102000020E96400001B00000000000000807C0E410000000025B15A4100000000C07B0E410000000025B15A4100000000C07B0E410000000028B15A4100000000807B0E410000000028B15A4100000000807B0E41000000002AB15A4100000000607B0E41000000002AB15A4100000000607B0E410000000030B15A4100000000C07A0E410000000030B15A4100000000C07A0E410000000031B15A4100000000607A0E410000000031B15A4100000000607A0E41000000002DB15A4100000000A0790E41000000002DB15A4100000000A0790E410000000032B15A4100000000607A0E410000000032B15A4100000000607A0E410000000036B15A4100000000007A0E410000000036B15A4100000000007A0E410000000037B15A4100000000E0790E410000000037B15A4100000000E0790E41233F2C373AB15A4100000000E0790E41000000003BB15A4100000000C0790E41000000003BB15A4100000000C0790E41233F2CB73BB15A4100000000C0790E41000000003CB15A410000000040790E41000000003CB15A410000000040790E41EF0BF9833CB15A410000000040790E41EF0BF9033EB15A410000000040790E410000000043B15A41 `

larsop commented 4 years ago

One possible cause of it could be 2 transactions running at the same time and the last one committing work which overrides the work performed by the other. This is what you want to avoid and can surely happen. If this is the case you should NEVER be able to see such behaviour when ensuring a single client ever edits the Topology (I know you're trying to avoid this, but worth verifying).

I did a test https://github.com/larsop/resolve-overlap-and-gap/tree/test_single_thread_issues_7 where I used a single thread for inserting the glue lines and that did not help. Here also added code where thread to thread remove and add the egde but that did not help , because probably need to healed the line first.

I will run another test where I do a union with exiting lines before I add the glue line.

larsop commented 4 years ago

I will run another test where I do a union with exiting lines before I add the glue line.

This failed also.

Tested at https://github.com/larsop/resolve-overlap-and-gap/tree/test_union_with_existing_edges_issues_7

It seems like splitting up the edges that crosser the cell borders are make too many error for some data sets.

I will no do another test where I do not splitt the lines that cross the cell borders, but keep them as they are.

strk commented 4 years ago

Is it any user documentation for how to use this with Postgis Topology.

I'm afraid not, but it what that plugin does is only creating a layer group containing various useful layers to inspect different things. It's not recommended to turn visibility of all layers ON. Rather you'd only enable one after the other, in turn, to see which information you find useful and which not.

What I see from the screenshot is that you have the "face MBR" layer visible, which you probably don't want. I would actually turn off all layers under the "face" group and only focus on the layers under the "Edge" group. I don't fully see those groups in your screenshot so I wonder if something is not working as expected. I'll build QGIS-3.10 and see what I get from it.

strk commented 4 years ago

On Fri, Apr 24, 2020 at 11:26:41PM -0700, Lars Aksel Opsahl wrote:

select topology.TopoGeo_addLinestring('topo_sr16_mdata_02',geom,1) from ( select ST_RemEdgeModFace('topo_sr16_mdata_02',edge_id), geom from topo_sr16_mdata_02.edge_data where edge_id = 756267) as r; topogeo_addlinestring

           1106439

Sorry I don't understand how the above query is a test for what I was saying about snapping. If I understand your query correctly what you are doing is:

  1. Remove one dangling edge from the topology
  2. Add the Linestring of that edge back in the topology
  3. Verifying that the insertion of given Linestring results again in a single edge, with same start/end node, same next left/right edges.

One thing I note in the result of that operation is that while the original edge was marked as "dangling" (same face on both sides) it is now instead reported to bind 2 different faces. I don't see as such behaviour could happen if not due to malformation of the original topology: that edge must have bound 2 different faces from the start, and that information must have gone lost somehow, maybe due to parallel update of its attributes.

strk commented 4 years ago

I did a test now where used a single thread for inserting the glue lines and that did not help.

Perfect, we have much higher chances of handling such case (easier to reproduce). Can you reduce the test to provide a start topology and operations that do break it ?

Please also when referring to code (ie: "inserting glue line") can you point me to the exact code snippet doing that ? I'd like to better understand the operations you are using to do this.

It seems like splitting up the edges that crosser the cell borders are make too many error for some data sets.

Splitting edges can always create problems, because unless edges are split at one of their existing vertices, single segments are bound to move slightly to be able to represent e new vertex, and such movement could result in an open ring. All these occurrences are supposedly handled by the splitting performed by the topology functions themselves, and for this reason I suggested to insert the edges of your defined grid to the topology, so that they are found by the topology routing and used to do "controlled" splitting of the edges you insert.

larsop commented 4 years ago

On Fri, Apr 24, 2020 at 11:26:41PM -0700, Lars Aksel Opsahl wrote: select topology.TopoGeo_addLinestring('topo_sr16_mdata_02',geom,1) from ( select ST_RemEdgeModFace('topo_sr16_mdata_02',edge_id), geom from topo_sr16_mdata_02.edge_data where edge_id = 756267) as r; topogeo_addlinestring ----------------------- 1106439 Sorry I don't understand how the above query is a test for what I was saying about snapping. If I understand your query correctly what you are doing is: 1. Remove one dangling edge from the topology 2. Add the Linestring of that edge back in the topology 3. Verifying that the insertion of given Linestring results again in a single edge, with same start/end node, same next left/right edges. One thing I note in the result of that operation is that while the original edge was marked as "dangling" (same face on both sides) it is now instead reported to bind 2 different faces. I don't see as such behaviour could happen if not due to malformation of the original topology: that edge must have bound 2 different faces from the start, and that information must have gone lost somehow, maybe due to parallel update of its attributes.

Yes you are reading it correctly and values for faces are fixed when deleting and the edge add again. I also did a test with where used tolerance like 0 and that worked. The problem also seems to be also when running i single thread.

Can the problem be something like this ? 1) A edge is added which is dangling at stage one and that is correct. The face may be very big. 2) More edges are added then at at certain stage we get a new face.

The problem seems be that dangling edge is not always been notified/updated about new face.

I will try make a test for this with a smaller dataset.

larsop commented 4 years ago

It seems like splitting up the edges that crosser the cell borders are make too many error for some data sets. Splitting edges can always create problems, because unless edges are split at one of their existing vertices, single segments are bound to move slightly to be able to represent e new vertex, and such movement could result in an open ring. All these occurrences are supposedly handled by the splitting performed by the topology functions themselves, and for this reason I suggested to insert the edges of your defined grid to the topology, so that they are found by the topology routing and used to do "controlled" splitting of the edges you insert.

I tried to do something like this here where I also added the cell borders https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology_use_cell_borders/src/main/extern_pgtopo_update_sql

But the performance was very poor and I also needed needed to removed a lot of edges in a post operation because they where only cell borders and not a line from the input layer, so went I stopped testing this branch.

I may try to see if it's possible to develop this further.

larsop commented 4 years ago

I have been able to make a small test that fails almost every second I run. By failing I mean we get dangling edges.

SELECT ST_length(geom), edge_id , start_node, next_left_edge , next_right_edge, abs_next_right_edge, right_face from test_topo_t3.edge_data where left_face = right_face ; st_length | edge_id | start_node | next_left_edge | next_right_edge | abs_next_right_edge | right_face --------------------+---------+------------+----------------+-----------------+---------------------+------------ 16638.511442709078 | 7151 | 5625 | -6957 | -4029 | 4029 | 0 15668.515454601618 | 7157 | 2706 | 3522 | -7651 | 7651 | 1465 9558.525838860027 | 7177 | 2809 | 7182 | 2815 | 2815 | 1471 46869.15086689703 | 7269 | 1471 | 3214 | 1482 | 1482 | 1484 15683.408435026411 | 7408 | 3856 | -3964 | -7410 | 7410 | 1546

If I try to check faces with st_getFaceGeometr select * from test_topo_t3.face where st_getFaceGeometry('test_topo_t3',face_id) is null; ERROR: XX000: SQL/MM Spatial exception - universal face has no geometry LOCATION: pg_error, lwgeom_pg.c:243

If I check with this cmd I find where catch errors select count(*) from test_topo_t3.face where topo_update.get_face_geo('test_topo_t3',face_id,0) is null; count

23

(1 row)

To make this state I used the branch https://github.com/larsop/resolve-overlap-and-gap/tree/test_faling_left_like_right_face_issues_7

and ran the tests './run_test.pl --verbose --spatial_ref_sys -topology resolve_overlap_and_gap;'

and then created a empty database

psql -h localhost -U postgres postgres -c "drop database resolve_og" psql -h localhost -U postgres postgres -c "create database resolve_og" psql -h localhost -U postgres resolve_og -c "CREATE EXTENSION dblink" psql -h localhost -U postgres resolve_og -c "CREATE EXTENSION postgis" psql -h localhost -U postgres resolve_og -c "CREATE EXTENSION postgis_topology" psql -h localhost -U postgres resolve_og -c "CREATE EXTENSION pg_stat_statements "

installed code, the what ever path psql -h localhost -U postgres resolve_og -f /Users/lop/dev/github/resolve-overlap-and-gap/src/test/sql/regress/resolve_overlap_and_gap-pre.sql psql -h localhost -U postgres resolve_og -f /Users/lop/dev/github/postgres_execute_parallel/src/main/sql/function_execute_parallel.sql

-- Create data test case meter psql -h localhost -U postgres resolve_og -c "CREATE table test_data.overlap_gap_input_t3 AS (SELECT distinct c1, c2, c3, ST_transform(geom,25833)::Geometry(Polygon,25833) as geom from test_data.overlap_gap_input_t1)" psql -h localhost -U postgres resolve_og -c "CREATE index on test_data.overlap_gap_input_t3 using gist(geom)"

-- Then running this cmd.

psql -h localhost -U postgres resolve_og -c "CALL resolve_overlap_gap_run( ('test_data.overlap_gap_input_t3','c1','geom',25833,true), -- TYPE resolve_overlap_data_input ('test_topo_t3',1.0), -- TYPE resolve_overlap_data_topology resolve_overlap_data_clean_type_func( -- TYPE resolve_overlap_data_clean 49, -- if this a polygon is below this limit it will merge into a neighbour polygon. The area is sqare meter. 10, -- is this is more than zero simply will called with null, -- _max_average_vertex_length, in meter both for utm and deegrees, this used to avoid running ST_simplifyPreserveTopology for long lines lines with few points 2, -- IF 0 NO CHAKINS WILL BE DONE A big value here make no sense because the number of points will increaes exponential ) 10000, --edge that are longer than this value will not be touched by _chaikins_min_degrees and _chaikins_max_degrees
120, -- The angle has to be less this given value, This is used to avoid to touch all angles. 240, -- OR the angle has to be greather than this given value, This is used to avoid to touch all angles 40, -- The angle has to be less this given value, This is used to avoid to touch all angles. 320 -- OR The angle has to be greather than this given value, This is used to avoid to touch all angles ), 4, -- number of threads 10 -- max numver of polygons in each cell )"

strk commented 4 years ago

Can the problem be something like this ? 1) A edge is added which is dangling at stage one and that is correct. The face may be very big. 2) More edges are added then at at certain stage we get a new face.

The problem seems be that dangling edge is not always been notified/updated about new face.

Edges to be notified about the new faces are the edges that formed a new ring. Ring is determined by following the "next_left_edge" and "next_right_edge" attributes of the "edge" table. This is what https://postgis.net/docs/GetRingEdges.html also does.

On the liblwgeom level, a callback is used to do the edge walking, which boils down to a RECURSIVE CTE. Such recursive CTE, in absence of a LIMIT paremeter (it's invoked with no LIMIT, from liblwgeom) may enter an infinite loop, when edge linking is incorrect (it may or may not be what is happening in my make check run, for instance).

The code of the plpgsql function: https://git.osgeo.org/gitea/postgis/postgis/src/tag/3.0.1/topology/sql/query/GetRingEdges.sql.in#L50-L61

The code of the C callback: https://git.osgeo.org/gitea/postgis/postgis/src/tag/3.0.1/topology/postgis_topology.c#L1182-L1192

They should be equivalent queries, and should probably be merged: https://trac.osgeo.org/postgis/ticket/4675

So, if "edge is not always been notified/updated", there are two options:

1) GetRingEdges does NOT find that edge (meaning edge linking is broken == already invalid topology)

2) UPDATE is ineffective (for another update in another process overriding the previous one?)

I've been thinking for some time now that we should implement some form of database level locks to avoid second scenario, but if you mention you can see this happening in a single thread my guess is that we're in presence of the first scenario instead.

strk commented 4 years ago

On Thu, Apr 30, 2020 at 09:49:30PM -0700, Lars Aksel Opsahl wrote:

I have been able to make a small test that fails almost every second I run. By failing I mean we get dangling edges.

Dangling edges from input without dangling edges may still not be a failure, as long as the resulting topology is valid (dangling edges are not invalid, but natural way to develop a ring in chunks).

SELECT ST_length(geom), edge_id , start_node, next_left_edge , next_right_edge, abs_next_right_edge, right_face from test_topo_t3.edge_data where left_face = right_face ; st_length | edge_id | start_node | next_left_edge | next_right_edge | abs_next_right_edge | right_face --------------------+---------+------------+----------------+-----------------+---------------------+------------ 16638.511442709078 | 7151 | 5625 | -6957 | -4029 | 4029 | 0 15668.515454601618 | 7157 | 2706 | 3522 | -7651 | 7651 | 1465 9558.525838860027 | 7177 | 2809 | 7182 | 2815 | 2815 | 1471 46869.15086689703 | 7269 | 1471 | 3214 | 1482 | 1482 | 1484 15683.408435026411 | 7408 | 3856 | -3964 | -7410 | 7410 | 1546

These records do not necessarely represent an invalid topology. Each of those edges could be the body of the hourglass in the following ASCII drawing:


     *-----*
      \   /
       \ /
        *
        |
        | <- this
        |
        *
       / \
      /   \
     *-----*

If I try to check faces with st_getFaceGeometr select * from test_topo_t3.face where st_getFaceGeometry('test_topo_t3',face_id) is null; ERROR: XX000: SQL/MM Spatial exception - universal face has no geometry

You have to explicitly skip universal face (0):

  SELECT id
  FROM test_topo_t3.face
  WHERE face_id != 0
  AND st_getFaceGeometry('test_topo_t3',face_id) is null;

If I check with this cmd I find where catch errors select count(*) from test_topo_t3.face where topo_update.get_face_geo('test_topo_t3',face_id,0) is null; count

23

(1 row)

Interesting, this does indeed sound like a bug.

To make this state I used the branch https://github.com/larsop/resolve-overlap-and-gap/tree/test_faling_left_like_right_face_issues_7

[...]

I will try to follow these steps, but I can't avoid to note it's still really too many steps, and high complexity (dblink involved). Attaching a dump of the resulting topology may also be a good idea.

strk commented 4 years ago

https://github.com/larsop/resolve-overlap-and-gap/tree/test_faling_left_like_right_face_issues_7 and ran the tests './run_test.pl --verbose --spatial_ref_sys -topology resolve_overlap_and_gap;'

I'm assuming you mean to run that command from the src/test/sql/regress directory, in order to find ./run_test.pl.

But running as you specify fails with:

  Unable to find /usr/src/nibio/resolve-overlap-and-gap/src/test/sql/regress/00-regress-install/share/contrib/postgis/postgis.sql

Adding the --extension switch makes the tester us PostGIS via EXTENSIN, fixing this problem, although the warning fixed in master and reported in https://github.com/larsop/resolve-overlap-and-gap/issues/6 is back, and the test still never completes for me, due to what I reported in https://github.com/larsop/resolve-overlap-and-gap/issues/12

I'm afraid I'll need to focus on #12 before being able to follow instructions implying running those tests.

larsop commented 4 years ago

[...] I will try to follow these steps, but I can't avoid to note it's still really too many steps, and high complexity (dblink involved). Attaching a dump of the resulting topology may also be a good idea.

Here is dump from postgres 12 og POSTGIS="3.0.0 r17983" [EXTENSION] PGSQL="120" GEOS="3.8.0-CAPI-1.13.1 " PROJ="6.3.0" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.2" WAGYU="0.4.3 (Internal)" TOPOLOGY (1 row)

The number null faces and faces where left_face= right_face is a kind of random from time to time.

'pg_dump -Z9 -f test_faling_left_like_right_face_issues_7.sql.gz resolve_og'

test_faling_left_like_right_face_issues_7.sql.gz

larsop commented 4 years ago

Here is a dump of the schema only.

pg_dump -Z9 -n test_topo_t3 -f test_topo_t3.sql.gz resolve_og

test_topo_t3.sql.gz

strk commented 4 years ago

Ok I've imported the dump, then registered the topology with:

insert into topology.topology(name,srid,precision) values 'test_topo_t3', 25833, 0);

Note I used 0 as the tolerance, because the dump does not contain any information about what's the actual tolerance was.

The imported topology, according to ValidateTopology, has many invalidities:

nibio-issue7=# select * from topology.validateTopology('test_topo_t3');
       error       | id1  | id2  
-------------------+------+------
 face has no rings | 1460 |     
 face has no rings | 1461 |     
 face has no rings | 1462 |     
 face has no rings | 1463 |     
 face has no rings | 1464 |     
 face has no rings | 1536 |     
 face has no rings | 1537 |     
 face has no rings | 1540 |     
 face has no rings | 1554 |     
 face has no rings | 1668 |     
 face has no rings | 1670 |     
 face has no rings | 1671 |     
 face has no rings | 1724 |     
 face has no rings | 1747 |     
 face has no rings | 1749 |     
 face has no rings | 1753 |     
 face has no rings | 1761 |     
 face has no rings | 1767 |     
 face has no rings | 1860 |     
 face has no rings | 1932 |     
 face has no rings | 1933 |     
 face has no rings | 1954 |     
 face has no rings | 1957 |     
 face has no rings | 1955 |     
 face within face  |  782 | 1959
 face within face  | 1959 |  782
 face within face  |  778 | 1921
 face within face  |  790 | 1921
 face within face  |  784 | 1921
(29 rows)

So next we shall find out HOW such invalidities were introduced.

Picking the first "face has no rings" record (face_id 1460), I see that there's a single edge marked as having that face on a side: edge_id 7166, having face 1460 on its left side and face 1462 on its right side. The face on the other side (1462) is ALSO found as being a "face has no ring" error, and is also only known by that edge.

So we have a single edge (edge_id 7166) marked as having 2 different faces on its sides, NOT closed (end_node != start_node), NOT isolated (abs_next_left_edge != 7166, abs_next_right_edge != 7166) BUT exclusively knowing about those faces (face_id 1460 and 1462).

Here's a picture of that situation, extracted from QGIS-3.12:

edge7166

The GetRingEdges function tells us that edge 7166 is part of a proper and closed ring:

nibio-issue7=# select * from GetRingEdges('test_topo_t3', '7166');
 sequence | edge 
----------+------
        1 | 7166
        2 | 3498
        3 | 3532
        4 | 6753
        5 | 1429
        6 | 1452
        7 | 2109
        8 | 2712
        9 | 2686
       10 | 3009
       11 | 6982
(11 rows)

You can see the next (3498) and previous (6982) edges in the map image above.

So what's happening is most likely that the face on the left of edge 7166 should be face_id=1724 (the face on that side registered on next and previous edges), and the face on the right of edge 7166 should be face_id=1671 (the face on that side registered on next and previous edges).

Updating the edge accordingly should (if it's the only issue) fix that specific problem, which results in GetFaceGeometry returning NULL also for faces 1724 and 1671 missing a portion of their ring (see https://trac.osgeo.org/postgis/ticket/4680#ticket for a possible enhancement to GetFaceGeometry that would make it work also in these cases).

I've also filed https://trac.osgeo.org/postgis/ticket/4681#ticket to enhance ST_GetFaceGeometry so it prints a WARNING when such cases are found.

So, to recap:

It remains to understand WHY the topology turned invalid in the first place, but I'd do this in a different ticket, because the title of this one is specifically addressing ST_GetFaceGeometry returning NULL. If you agree please file another ticket about "Topology construction routine results invalid topology" or similar.

larsop commented 4 years ago

I have filled new ticket here I have also been doing some more and it seems closely related how many threads and how complex the input are.

The best result seems to be when using branch. What I do here is that I do not cut the lines that cross the cell borders using the cell border, by using this code. When doing this we way don't need put together those lines connecting the cell at later stage.

Another change is at the stage 3 line 163 where where we merge all the cells together I also run this as one single thread.

We stills get some missing geometries but in one test case we reduced the number lost significant geometries from maybe around 400 hundred to less than 10

larsop commented 4 years ago

One update I check the results from last night using branch where I do not cut lines crossing the cell wall and it seems like the lines where left equal right face are that not connected to cell borders any more.

I used this SELECT ST_Length(geom), ST_ASText(geom) from topo_sr16_trl_09.edge_data where st_length(geom) > 10 and left_face != 0 and right_face = left_face order by ST_length(geom) desc; and I checked the 5 longest lines and they where not related to cell borders and that has not been the case before.

Here is SQL that show big differences. With the no cut line branch and a single thread we get 8 problems related to lines crossing cell borders. But using a branch where we cut the lines and merge them in a multithread environment we get almost 3000 error .

SELECT count(*) from topo_sr16_trl_09.edge_data e, (select ST_ExteriorRing(geo) as geo from topo_sr16_trl_09.trl_2019_test_segmenter_renska_grid) as r where ST_Intersects(r.geo,e.geom) and st_length(geom) > 10 and left_face != 0 and right_face = left_face; count

 8

(1 row)

vroom3.int.nibio.no postgres@sl=# vroom3.int.nibio.no postgres@sl=# SELECT count(*) from topo_sr16_trl_08.edge_data e, (select ST_ExteriorRing(geo) as geo from topo_sr16_trl_08.trl_2019_test_segmenter_renska_grid) as r where ST_Intersects(r.geo,e.geom) and st_length(geom) > 10 and left_face != 0 and right_face = left_face; count

2982 (1 row)

strk commented 4 years ago

I've filed PR #15 to run the topology validation step as part of the existing test. Are you able to include those less-than-10 records (possibly simplified) in the test dataset, and obtain a failure ? That would give me a way to try at reproducing the problem. You may see that as commented in https://trac.osgeo.org/postgis/ticket/4684#comment:1 the issue of having an invalid topology as the result of concurrent writing into the topology is expected behavior.

strk commented 4 years ago

Can we close this ticket as completed ?

larsop commented 4 years ago

Yes this solved as long as we run in a single thread in step 3 where we insert lines crossing cell borders. We will also do a test with row level lock to see if that can help to run in parallel.