NetTopologySuite / NetTopologySuite.IO.Esri

BSD 3-Clause "New" or "Revised" License
31 stars 16 forks source link

Add Test project from NetTopologySuite.IO.ShapeFile repository #7

Closed KubaSzostak closed 1 year ago

KubaSzostak commented 1 year ago

Add Test project from NetTopologySuite.IO.ShapeFile repository

KubaSzostak commented 1 year ago

NetTopologySuite.IO.Esri currently read all polygon shapefiles always as MultiPolygon what is consistent with ESRI Shapefile Technical Description. The same for lines - they are always read as MultiLineString.

DGuidi commented 1 year ago

I have some compilation issues dbf.Keys() should be record.GetNames() I think

in Test.cs, PrintFieldValues should be something like

        protected void PrintFieldValues(IAttributesTable values)
        {
            Console.WriteLine();
            foreach (var nameVal in values.GetNames())
            {
                PrintFieldValue(nameVal, values[nameVal]);
            }
        }

or am I missing something?

There is also something in ShapeFileDataReaderTest.TestReadingShapeFileWithNulls that prevents subsequent tests to be executed, at least in my machine. There is a Debug.Assert in ShpPolygonReader that fails when this test is executed. Debug.Assert(!shell.IsCCW, "Invalid Shapefile polygon - shell coordinates are not in in clockwise order.");

Beside this, all tests are basically green. In my opinion, the code can be integrated and marked as experimental in the docs, so everyone out there can start using it, but the clear disclaimer that the code written against the old shapefile library can be rewritten, the behavour is different and there will be some bugs.

KubaSzostak commented 1 year ago

Thanks for the feedback, @DGuidi . I've fixed compilation errors in the TestConsole app. By the way I also upgraded it from .NET Framework into .NET 6.

Assert statement was set for this piece of code

var shell = Factory.CreateLinearRing(partsBuilder[0]);
if (shell.IsCCW)
{
    if (partsBuilder.Count > 1)
    {
        ThrowInvalidRecordException("Invalid Shapefile polygon - shell coordinates are not in in clockwise order.");
    }
    Debug.Assert(!shell.IsCCW, "Invalid Shapefile polygon - shell coordinates are not in in clockwise order.");
    shell = Factory.CreateLinearRing(partsBuilder[0].Reversed());
}

According to SHP specification vertices for a polygon shell should have clockwise order. I could throw an error if that's not the case, but in order to pass other tests forgiving approach was used. If, contrary to documentation, shell IsCCW and there is only one polygon part (there are no other polygons and there are no holes) then it is recreated using Reversed() vertices order. In theory it shouldn't happen so I decided to add an Assert statement to make developer aware of the issue. In this specific test case we have test shapefile that contains incorrect data. That is intentionally alerted through Assert statement. I started thinking that after passing all other test maybe this is no longer needed? Should I remove it or there is another way to make the subsequent tests pass even if there is failure in an Assert statement?

DGuidi commented 1 year ago

I think that this can be related to IO.ShapeFile issue 70 and related PR 80?

KubaSzostak commented 1 year ago

Yes, it's related to issue 70 and PR 80. Currently it works like this:

  1. Read first LinearRing
    • If IsCCW == false it's fine, go further.
    • If IsCCW == true and there is only one polygon part (no other polygons, no holes) then reversed order of vertices is used.
    • If IsCCW == true and there are more polygon parts then Invalid Shapefile polygon exception is thrown .
  2. Rest of polygon parts are read assuming that:
    • IsCCW == false starts new polygon
    • IsCCW == true are holes in previously read polygon.
  3. If polygon parts are invalid (for example hole is outside the shell) then no exception is thrown. In that case geometry has IsValid property set to fasle.

Above approach was chosen in order to make the reading fast. Personally I don't like the idea of checking every geometry in the world because there can be someone who provided us invalid Shapefile. But if you like to sacrifice performance for convenience I can change the code to follow @FObermaier proposal:

  1. Order the polygon rings by their area (as if they were Polygons without holes)
  2. Pick the largest as 1st shell
  3. Check the others if they are contained by any of the shells and if so, make them a hole of that shell, otherwise create a new shell (in that case we have a MultiPolygon).
  4. Repeat 3 until there are no more rings left
  5. Build the valid [Multi]Polygon.

What is the decision?

KubaSzostak commented 1 year ago

In the meantime I've moved all ported test into Depraced folder/namespace. I've also added a new test checking if MultiPolygons that have a polygon inside the hole are handled properly.

image

DGuidi commented 1 year ago

What is the decision?

My2cents: as in old library (if I remember correctly), enable the "slower code that checks every geometry" only if a specific flag is enabled. Otherwise assume the data provided is correct and use the faster.code.

KubaSzostak commented 1 year ago

I like this idea. I will add a flag to ShapefileReaderOptions and create two ReadGeometry() variants. How to name the flag?

  1. FixInvalidShapes
  2. FixInvalidGeometries
  3. FixStructure (as in WktRreadder)
DGuidi commented 1 year ago

There should be similar flags in the codebase of the io.shapefile library. You can take a look for inspiration, but feel free to choose whatever you prefer. EDIT: @kubaszostak actually the code I mentioned is not in the main branch, see PolygonBuilder here

KubaSzostak commented 1 year ago

I'm going to handle it using following flags (common for all geometry types):

DGuidi commented 1 year ago

In my opinion, the code can be integrated and marked as experimental in the docs, so everyone out there can start using it, but the clear disclaimer that the code written against the old shapefile library can be rewritten, the behavour is different and there will be some bugs.

@FObermaier @airbreather sorry to bother you. any advice on this?