Closed apjoseph closed 4 years ago
So are you running a noding process over a set of polygons, and then polygonizing that to compute a full overlay?
You can use OverlayNG for the noding phase in the same way as the original overlay - just UnaryUnion the linework all together (see code here).
Unfortunately there isn't (yet) a Polygonizer equivalent in the new overlay (although of course it does have all the needed machinery for its own use). OverlayNG uses a new(ish) graph structure, which is more memory efficient than the one Polygonizer uses. So I have thought that it would be nice to change Polygonizer to use this new graph model. Maybe I can work on that sooner rather than later.
I'm surprised there is that much memory overhead for polygonization. Hopefully using the new graph structure can help with this. There is also another approach I've worked on in the past using a sweepline approach which has very low memory overhead. It's in my work queue to dust that off and get it released. Any chance of getting a copy of the problematic datasets?
I have also had some thoughts about using OverlayNG to provide a full polygonal coverage overlay (either to reduce an overlapping set of Polygons to a coverage, or to overlay two coverages). This will take a bit of work, but should be feasible.
Ah, so basically the idea is to use UnaryUnionNG.union
on the dumped polygon linework and pass it to the existing Polygonizer which shouldn't itself introduce any new topology exceptions, correct?
This is somewhat similar to the approach I was already using in 1.61. I didn't use the existing polygonization directly on source polygons -but used the old snaprounding functionality that was already present :
SegmentStringUtil.extractSegmentStrings()
org.locationtech.jts.noding.snapround.MCIndexSnapRounder
to node/snapround segement stringsSegmentStringUtil.toGeometry()
UnaryUnionOp.union()
snaprounded linestrings (where mem issue occurred)So the mem issue I was having wasn't actually on the polygonization portion, but in the unioning of all linestrings together into a single geometry for polygonization.
That coverage overlay functionality, sounds very exciting and helpful by the way -and I am greatly looking forward to it.
Ah, so basically the idea is to use
UnaryUnionNG.union
on the dumped polygon linework and pass it to the existing Polygonizer which shouldn't itself introduce any new topology exceptions, correct?
Correct. And should be a lot simpler than the process you are currently using. So now the process is:
getBoundary
) and collect into a MultiLineString
.UnaryUnionNG
on that geometry to produce a fully noded set of edges, using an appropriate precision model (note there is no need to round the linework coordinates - OverlayNG takes care of that itself)
Let me know if it works! And particularly about memory usage.
That coverage overlay functionality, sounds very exciting and helpful by the way -and I am greatly looking forward to it.
Roger that. It's on the list.
@apjoseph Do you have stats for the number of vertices in your 100 MB file? By estimating from a shapefile (which is binary of course) I'm thinking that's about 50M vertices?
That is quite a large dataset, and it doesn't surprise me that it blows out memory with the old overlay. And I'm thinking it's likely to be an issue with the new overlay as well. However, as I mentioned I do have a experimental project which could potentially process this size of dataset in very little memory. I will see if I can get that into a state to release.
I did a test this morning with overlayNG. The linestring union threw a topology exception:
Exception in thread "main" org.locationtech.jts.geom.TopologyException: found non-noded intersection between LINESTRING ( -118.299085 33.821959, -118.299085 33.821818 ) and LINESTRING ( -118.299085 33.821959, -118.299085 33.821887 ) [ (-118.299085, 33.821959, N
aN) ]
at org.locationtech.jts.noding.FastNodingValidator.checkValid(FastNodingValidator.java:140)
at org.locationtech.jts.geomgraph.EdgeNodingValidator.checkValid(EdgeNodingValidator.java:81)
at org.locationtech.jts.geomgraph.EdgeNodingValidator.checkValid(EdgeNodingValidator.java:46)
at org.locationtech.jts.operation.overlay.OverlayOp.computeOverlay(OverlayOp.java:231)
at org.locationtech.jts.operation.overlay.OverlayOp.getResultGeometry(OverlayOp.java:183)
at org.locationtech.jts.operation.overlay.OverlayOp.overlayOp(OverlayOp.java:86)
at org.locationtech.jts.operation.overlay.snap.SnapIfNeededOverlayOp.getResultGeometry(SnapIfNeededOverlayOp.java:75)
at org.locationtech.jts.operation.overlay.snap.SnapIfNeededOverlayOp.overlayOp(SnapIfNeededOverlayOp.java:37)
at org.locationtech.jts.operation.union.UnaryUnionOp.unionNoOpt(UnaryUnionOp.java:280)
at org.locationtech.jts.operation.union.UnaryUnionOp.union(UnaryUnionOp.java:216)
at org.locationtech.jts.operation.overlayng.UnaryUnionNG.union(UnaryUnionNG.java:49)
rawGF = GeometryFactory(PrecisionModel(10.0.pow(6)),4326) fixedGF = GeometryFactory(PrecisionModel(PrecisionModel.FLOATING))
Dataset is about 10 million vertices. The way I handle in 1.16 is by creating a grid and recursively subdividing it in parallel based on an index of all individual line segments until I get down to a manageable amount and then doing parallel union/polygonization on each portion of the grid,throwing out any polygons outside of it, and then recombining any split pieces. This is not ideal since the grid itself can introduce errors when recombining, but this does generally work well in speeding things up and reducing memory.
That experimental project sounds helpful!
That's disappointing. Are you able to provide a sample dataset which causes the issue? (Perhaps put a breakpoint and capture the two geometries which cause the exception?)
Clever approach with the subdividing, but yeah, sounds tricky to get it perfect.
I'm going to push my other project to Github soon, and will put a link on this issue.
Sorry for the delay. Here are the raw multipolygons whose boundaries cause the exception when unioned.
Thanks. What precision model was being used?
Actually I see now that the unary union is not using the OverlayNG code. So that's a bug.
Although I am not able to reproduce the failure using either overlay method.
Thanks. What precision model was being used?
val gf4326 = GeometryFactory(PrecisionModel(10.0.pow(6)),4326)
val gc = geoms.map {it -> it.boundary }.let{ gf4326.createGeometryCollection(it.toTypedArray())}
UnaryUnionNG.union(gc,gf4326.precisionModel)
I did fix some bugs over the past couple of days. Can you try again with the most recent code?
Although as noted above, this isn't actually using OverlayNG inside the unary union (which is a bug I will work on).
The other way to run the boundary linework through OverlayNG is to make a MultiLineString from it, and union it with an empty LineString. This is actually what UnaryUnion does internally (since there is no benefit to using a cascaded union approach with lines).
It also seems like there is quite a bit of precision in that data. To me it looked like a scale factor of 10^8 is more appropriate.
I was able to run UnaryUnion using both original overlay and OverlayNG (with scale-1e8). Interestingly OverlayNG was about 4 times faster (perhaps because the old overlay needed to use some snapping, which is quite slow).
Can confirm I was able to get it to go through after reducing precision to 10^6 using the new OverlayNG.overlay
method -and using 10^8 on UnaryUnion
. That's a relief!
Also verified that the reducePreicsion
algorithm also doesn't create any empty geometries when reducing precision on any of the polygons -which was another major problem I had in the old JTS. Very excited about that as well.
Can confirm I was able to get it to go through after reducing precision to 10^6 using the new
OverlayNG.overlay
method -and using 10^8 onUnaryUnion
. That's a relief!
Good to hear.
Also verified that the
reducePreicsion
algorithm also doesn't create any empty geometries when reducing precision on any of the polygons -which was another major problem I had in the old JTS. Very excited about that as well.
The old JTS GeometryPrecisionReducer
is a bit of a hack which has all kinds of potential problems, and isn't very efficient. So it will be replaced by the new code as soon as its had a bit more testing. (I'll keep the old API, just replace it with a call to the new OverlayNG PrecisionReducer.)
@apjoseph I pushed my alternative FORSE overlay code to a repo here. It's a WIP, but it should work well enough for you to test its characterstics for memory efficiency. I can provide more details if interested.
Excellent, I'll likely get around to running this on large dataset likely in the next week or two. Looking like this is the most relevant. I noticed this lib doesn't have a pom.xml or build.gradle -is forsee meant to be added as a module in JTS or standalone?
Yes, I haven't put any work into a build chain for FORSE. It's more likely to end up as a JTS API.
You might also be interested in the recent addition to the OverlayNG branch of the SnappingNoder
. It is a noder which uses full-precision snapping rather than snap-rounding. It's very new, but so far has been very robust in my testing. I'm optimistic that it will replace the floatng-point noding strategy used currently, and that it will reduce the need for using snap-rounding.
Usage examples here. As usual I'm keen to get some beta-testing!
I just read your blog post and am excited to see if this does indeed eliminate dreaded TopologyExceptions!
I'm looking to test polygonization (unique areas of overlap) on a large Multipolygon postgis-derived dataset.
I took a look at the new logic in the overlay-sr and I'm not seeing an an obvious equivalent to Polygonizer in the branch and wondering if/where such functionality exists.
In the past I've had to split up datasets that have complex polygons with many thousands of vertices to avoid mem overflow on a dataset of only 100MB with 20+GB of available memory for polygonization. So I'm also curious if overlayng has any functionality that might help avoid overflow on polygonizing operations (like st_subdivide in postgis for example) while maximizing robustness.