bcgov / designatedlands

Python script to combine conservation related spatial data from many sources to create a single 'Designated Lands' layer for British Columbia
Apache License 2.0
9 stars 4 forks source link

topology exceptions #13

Closed smnorris closed 7 years ago

smnorris commented 7 years ago

The uwr_conditional_harvest layer probably needs some additional cleaning, adding it to output fails:

INFO:root:Inserting uwr_conditional_harvest into output
Traceback (most recent call last):
  File "conservationlands.py", line 223, in <module>
    cli()
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "conservationlands.py", line 175, in process
    db.execute(sql)
  File "/Volumes/Data/Projects/python/pgdb/pgdb/database.py", line 162, in execute
    return self._get_cursor().execute(sql, params)
  File "/usr/local/lib/python2.7/site-packages/psycopg2/extras.py", line 120, in execute
    return super(DictCursor, self).execute(query, vars)
psycopg2.InternalError: GEOSUnaryUnion: TopologyException: Input geom 0 is invalid: Ring Self-intersection at or near point 1227334.5330000001 1492096.138 at 1227334.5330000001 1492096.138
smnorris commented 7 years ago

Same thing happens (at a different spot) with great_bear_ebm_area (if uwr_conditional_harvest removed from list).

Traceback (most recent call last):
  File "conservationlands.py", line 223, in <module>
    cli()
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "conservationlands.py", line 175, in process
    db.execute(sql)
  File "/Volumes/Data/Projects/python/pgdb/pgdb/database.py", line 162, in execute
    return self._get_cursor().execute(sql, params)
  File "/usr/local/lib/python2.7/site-packages/psycopg2/extras.py", line 120, in execute
    return super(DictCursor, self).execute(query, vars)
psycopg2.InternalError: GEOSDifference: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 922787.56200000714 725268.95899998606 at 922787.56200000714 725268.95899998606

Problem is more likely that the ouput layer is getting complex rather than individual inputs.

ateucher commented 7 years ago

I'm guessing you know about the ST_Buffer(geom, 0) trick?

smnorris commented 7 years ago

Yep, that is being applied to the input table (which calmed those errors for the first few inserts) but maybe the output needs some tidying too. Many of the inputs have such similar edges.

On Wed, Oct 12, 2016, 9:42 AM Andy Teucher notifications@github.com wrote:

I'm guessing you know about the ST_Buffer(geom, 0) trick?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/smnorris/conservationlands/issues/13#issuecomment-253268568, or mute the thread https://github.com/notifications/unsubscribe-auth/ABQxHdVOl7PSJFNuIRYoVEDcBR0wSC_pks5qzQ3tgaJpZM4KU9Sp .

smnorris commented 7 years ago

Adding an ST_MakeValid into the mix fixed the uwr_conditional_harvest and one of the great_bear_ebm_area problems, but there are still more failures on great_bear_ebm_area

smnorris commented 7 years ago

at https://github.com/smnorris/conservationlands/blob/master/sql/populate_output.sql#L46 I can increase the ST_SnapToGrid tolerance to 1cm and resolve problem when processing the GBR layer, but then the UWR layer has problems at that tolerance, when it was ok at 1mm.

Ideally I could ST_Union the entire set of intersecting polygons already in the output layer - creating one poly to cut against.... but the input_id has to be retained for the join to work.

smnorris commented 7 years ago

The various ST_Buffer(geom,0) and ST_MakeValid(geom) in the query are fairly random, I've dropped all of the buffers - there was a comment on stackexchange (lost the link) noting that relying on these may just lead to more problems down the road when dealing with lots of similar or problematic geometries.

Instead, reducing precision of the geometries input to ST_Difference to 1m results in a (fast) successful run with all current inputs.

http://tsusiatsoftware.net/jts/jts-faq/jts-faq.html#D9

Currently testing 1cm, it really slows down on the great_bear_ebm_area layer, we'll see if it completes successfully.

smnorris commented 7 years ago

1/2hr later the overlay at with ST_SnapToGrid(geom, .1) is still processing the GBR layer. I think that is good, it isn't just bailing on exceptions right away.

I'll go back to 1m precision for now just to get a result processed. But making the ST_Difference calculation conditional on ST_Coveredby or similar should speed things right back up.

ateucher commented 7 years ago

Good news. Thanks for the update.

smnorris commented 7 years ago

Good news - all layers (minus the three not publicly available) download, clean and process successfully, no topology errors.

Bad news - visual review of output layer shows lots of stuff missing, hopefully just a small oversight in the query but I'm not sure yet.

ateucher commented 7 years ago

Ok thanks for the update. Glad to hear it all ran! No rush for today and I'm not working Monday either

smnorris commented 7 years ago

Should be fixed by https://github.com/smnorris/conservationlands/commit/ac008d2e145310ca774e381ff0c46249460c549d

smnorris commented 7 years ago

Cutting inputs by 20k tiles rather than using ST_Subdivide seems to make all problems go away. Processing is FAR faster too!

smnorris commented 7 years ago

I spoke too soon. I thought the huge speedup might be too good to be true - a typo in the select meant the overlay wasn't being processed. No wonder it was so fast!

However

ateucher commented 7 years ago

@smnorris it's back :( process works great... but overlay with both BEC and ecosections causes topology exceptions since you fixed #27.

smnorris commented 7 years ago

Interesting. I didn't test that function because I thought it didn't call the modified code. On first glance, I still don't think it does.

ateucher commented 7 years ago

Weird...

smnorris commented 7 years ago

Added some ST_MakeValid and ST_SnapToGrid calls to the Intersect function's sql, running overlay on ecosections is successful. https://github.com/smnorris/conservationlands/commit/c250ab5b46dbecf80a69e004aabbec6478412033

ateucher commented 7 years ago

Great! I'll give it a shot right now