Open DGuidi opened 2 years ago
from Shapefile specs, page 8-9
The order of vertices or orientation for a ring indicates which side of the ring is the interior of the polygon... Vertices of rings defining holes in polygons are in a counterclockwise direction. Vertices for a single, ringed polygon are, therefore, always in clockwise order. The order of rings in the points array is not significant.
so basically testing for rings CW and CCW orientation is the correct way of doing things. Maybe is this code that try to clean data that is inherently dirty?
The shell_bad_ccw.shp
contains a single polygon, with:
The idea of this PR is to let the ShapefileReader
reads this kind of data as two separate shells, so a MultiPolygon
is returned.
The idea of this PR is to let the
ShapefileReader
reads this kind of data as two separate shells, so aMultiPolygon
is returned.
Does one polygon contain the other? In that case the result would be invalid.
Does one polygon contain the other? In that case the result would be invalid.
Yep, exactly (made explicit in the unit test).
I see few different strategies:
ShapefileReader.Read()
returns false
, because of this code and the reason must be inspected checking the windows messages (not good, but removing the catch
is a breaking change).PUT YOUR IDEA HERE
I'm open to suggestions :)
@DGuidi , I think we should
Polygon
s without holes)MultiPolygon
).[Multi]Polygon
.I think the PR I pointed you to yesterday does it in a similar way.
I think the PR I pointed you to yesterday does it in a similar way
I will do a check, thanks
The code in the WIP PR
looks similar to the code actually in the standard lib to me
I'm working on resolving an issue with this geom:
MULTIPOLYGON (((-124.134 -79.199, -124.141 -79.316, -124.164 -79.431, -124.202 -79.542, -124.254 -79.647, -124.319 -79.745, -124.396 -79.833, -124.484 -79.91, -124.582 -79.975, -124.687 -80.027, -124.798 -80.065, -124.913 -80.088, -125.03 -80.095, -125.147 -80.088, -125.262 -80.065, -125.373 -80.027, -125.478 -79.975, -125.576 -79.91, -125.664 -79.833, -125.741 -79.745, -125.806 -79.647, -125.858 -79.542, -125.896 -79.431, -125.919 -79.316, -125.926 -79.199, -125.919 -79.082, -125.896 -78.967, -125.858 -78.856, -125.806 -78.751, -125.741 -78.653, -125.664 -78.565, -125.576 -78.488, -125.478 -78.423, -125.373 -78.371, -125.262 -78.333, -125.147 -78.31, -125.03 -78.303, -124.913 -78.31, -124.798 -78.333, -124.687 -78.371, -124.582 -78.423, -124.484 -78.488, -124.396 -78.565, -124.319 -78.653, -124.254 -78.751, -124.202 -78.856, -124.164 -78.967, -124.141 -79.082, -124.134 -79.199),(-124.438 -79.199, -124.443 -79.122, -124.459 -79.046, -124.483 -78.973, -124.518 -78.903, -124.561 -78.839, -124.612 -78.781, -124.67 -78.73, -124.734 -78.687, -124.804 -78.652, -124.877 -78.628, -124.953 -78.612, -125.03 -78.607, -125.107 -78.612, -125.183 -78.628, -125.256 -78.652, -125.326 -78.687, -125.39 -78.73, -125.448 -78.781, -125.499 -78.839, -125.542 -78.903, -125.577 -78.973, -125.601 -79.046, -125.617 -79.122, -125.622 -79.199, -125.617 -79.276, -125.601 -79.352, -125.577 -79.425, -125.542 -79.495, -125.499 -79.559, -125.448 -79.617, -125.39 -79.668, -125.326 -79.711, -125.256 -79.746, -125.183 -79.77, -125.107 -79.786, -125.03 -79.791, -124.953 -79.786, -124.877 -79.77, -124.804 -79.746, -124.734 -79.711, -124.67 -79.668, -124.612 -79.617, -124.561 -79.559, -124.518 -79.495, -124.483 -79.425, -124.459 -79.352, -124.443 -79.276, -124.438 -79.199)),((-124.582 -79.199, -124.586 -79.257, -124.597 -79.315, -124.616 -79.371, -124.642 -79.423, -124.674 -79.472, -124.713 -79.516, -124.757 -79.555, -124.806 -79.587, -124.858 -79.613, -124.914 -79.632, -124.972 -79.643, -125.03 -79.647, -125.088 -79.643, -125.146 -79.632, -125.202 -79.613, -125.254 -79.587, -125.303 -79.555, -125.347 -79.516, -125.386 -79.472, -125.418 -79.423, -125.444 -79.371, -125.463 -79.315, -125.474 -79.257, -125.478 -79.199, -125.474 -79.141, -125.463 -79.083, -125.444 -79.027, -125.418 -78.975, -125.386 -78.926, -125.347 -78.882, -125.303 -78.843, -125.254 -78.811, -125.202 -78.785, -125.146 -78.766, -125.088 -78.755, -125.03 -78.751, -124.972 -78.755, -124.914 -78.766, -124.858 -78.785, -124.806 -78.811, -124.757 -78.843, -124.713 -78.882, -124.674 -78.926, -124.642 -78.975, -124.616 -79.027, -124.597 -79.083, -124.586 -79.141, -124.582 -79.199),(-124.896 -79.199, -124.897 -79.181, -124.9 -79.164, -124.906 -79.148, -124.914 -79.132, -124.923 -79.117, -124.935 -79.104, -124.948 -79.092, -124.963 -79.083, -124.979 -79.075, -124.995 -79.069, -125.012 -79.066, -125.03 -79.065, -125.048 -79.066, -125.065 -79.069, -125.081 -79.075, -125.097 -79.083, -125.112 -79.092, -125.125 -79.104, -125.137 -79.117, -125.146 -79.132, -125.154 -79.148, -125.16 -79.164, -125.163 -79.181, -125.164 -79.199, -125.163 -79.217, -125.16 -79.234, -125.154 -79.25, -125.146 -79.266, -125.137 -79.281, -125.125 -79.294, -125.112 -79.306, -125.097 -79.315, -125.081 -79.323, -125.065 -79.329, -125.048 -79.332, -125.03 -79.333, -125.012 -79.332, -124.995 -79.329, -124.979 -79.323, -124.963 -79.315, -124.948 -79.306, -124.935 -79.294, -124.923 -79.281, -124.914 -79.266, -124.906 -79.25, -124.9 -79.234, -124.897 -79.217, -124.896 -79.199)))
This multipolygon is valid (two separate polygons each with a hole) but using the suggested logic the data from the shapefile is considered a polygon. I'm working on it.
@FObermaier I slightly changed the logic. Now when a potential hole of a shell is found, we check also the shell holes, to see if any hole contains the potential hole. If so, the potential hole can not be considered a hole for the shell, but a separate geometry. It works but I think I can miss few other edge cases.
@FObermaier any advice on this?
I actually don't quite know how to proceed. There are complaints that current ShapeFile implementation is (dreadfully) slow. That is why I pointed you to the WIP/redo everyting PR. If we do thorough topology test on the rings we read, I assume it gets even worse. Maybe we can enable the thorough test conditionally, if the user of the library really wants it?
I actually don't quite know how to proceed.
Actually, I have the same feeling: with the "new" polygon builder logic all the tests are green, but I fear that some behavior not covered by some test can be broken.
That is why I pointed you to the WIP/redo everyting PR
that shows the same problem, actually
If we do thorough topology test on the rings we read
the "new" polygon builder logic I think is similar to the older one, in the sense that I didn't expected to be too much slower. But actually this is just a feeling, I haven't tried no comparison tests
we can enable the thorough test conditionally
I can work on it: a flag can be helpful also to build some performance test
my2cents
@FObermaier added a "temporary" static property as a flag, I need to think a bit better about how to manage the entire "flag" thing... suggestions are welcome!
after a brief inspection: adding a flag property to the shapefiledatareader
, means that the same flag shall be copied to the shapefilereader
and then to the polygonhandler
, that is created in a static factory method of the shapefile
class, so we must ensure that any call to the factory method should also handle the flag... additionally, the Shapefile.CreateDataReader
needs to accept another parameter, just to enable the new behavior.
viable, not so complicated at all, maybe acceptable for an "experiment", but I hope to find a better alternative
I didn't expected to be too much slower
but actually - based on some poor-man's performance tests - it's much (~3X) slower
WRITE => elapsed: '00:00:00.4138680'
ShapeFileDataReader
flag DISABLED => elapsed: '00:00:00.4381579'
flag ENABLED => elapsed: '00:00:01.3507803'
ShapeReader
flag DISABLED => elapsed: '00:00:00.4276830'
flag ENABLED => elapsed: '00:00:01.2875299'
just a small note: tests marked as Explicit
are always executed in my machine (using VS2019). this is why I used Ignore
instead: now I understood that probably it's something related to my setup, and not a general problem.
AFAIK Explicit
will always run from within the UI (VisualStudio).
In GDAL/OGR shapefile driver, all rings are converted to Polygon
s and afterwards sent to organizePolygons
:
https://github.com/OSGeo/gdal/blob/363178015ae009fa1187d8c1e20b3587db43aa58/ogr/ogrgeometryfactory.cpp#L1455-L2144
I have not bisected the algorithm altogether, but it seems to be a bit of everything...
AFAIK
Explicit
will always run from within the UI (VisualStudio).
exactly what happens to me
maybe we can refactor this check to speed up a bit the computation:
bool IsHoleContainedInShell(LinearRing shell, LinearRing hole) =>
//shell.EnvelopeInternal.Contains(hole.EnvelopeInternal) &&
PointLocation.IsInRing(hole.GetCoordinateN(0), shell.Coordinates);
I'd assume that without the cheap envelope precondition test we'll see a performance degeneration.
I'd assume that without the cheap envelope precondition test we'll see a performance degeneration.
I just mean that IsHoleContainedInShell
check should just use PointLocation.IsInRing
I got that, but PointLocation.IsInRing
is the expensive test (compared to shell.EnvelopeInternal.Contains(hole.EnvelopeInternal)
). If we don't do the envelope test, the PointLocation.IsInRing
test is performed more often which will lead to performance degeneration.
If we don't do the envelope test, the
PointLocation.IsInRing
test is performed more often which will lead to performance degeneration.
yep now I see your point. curiously, from performance tests I see performance improvements
yep now I see your point. curiously, from performance tests I see performance improvements
If you only have polygons with holes that might be true, but for multipolygons whose shell ring envelopes don't intersect, I doubt that.
can be this PR cancelled, considering the discussion in #81, and in particular this comment ? Maybe part of this work can be added to #69 , if worthy
To investigate #70