locationtech / jts

The JTS Topology Suite is a Java library for creating and manipulating vector geometry.
Other
1.97k stars 442 forks source link

OverlayNG area check heuristic causing issues when snapping noder is used #951

Closed grimsa closed 1 year ago

grimsa commented 1 year ago

After upgrading JTS from 1.18.2 to 1.19.0 ran into a number of tests that use difference operation failing with TopologyException: Result area inconsistent with overlay operation.

I tracked it down to OverlayUtil#isResultAreaConsistent (which was added in https://github.com/locationtech/jts/pull/812):

case OverlayNG.DIFFERENCE:
  isConsistent = isLess(areaResult, areaA, AREA_HEURISTIC_TOLERANCE)   // true; args are: 0, 6.512
              && isGreater(areaResult, areaA - areaB, AREA_HEURISTIC_TOLERANCE); // false; args are 0; 8.8818E-16
  break;

In our case we are performing a difference operation on two almost-exactly-the-same geometries and expect an empty result.

Note: we are using snapping noder like in OverlayNGSnappingFunctions code.

Test case 1

This case fails with TopologyException:

@Test
void test2() throws ParseException {
    Geometry g1 = new WKTReader().read("MULTIPOLYGON (((0.37676311 2.57570853, 7.28652472 0.00028375, 7.60034931 0.81686059, 0.50229292 3.4551325, 0.37676311 2.57570853)))");
    Geometry g2 = new WKTReader().read("MULTIPOLYGON (((0.50229292 3.4551325, 7.60034931 0.81686059, 7.28652472 0.00028375, 0.37676311 2.57570853, 0.50229292 3.4551325)))");

    // Note: in our code we have an utility function that wraps this logic
    Geometry overlayResult = OverlayNG.overlay( 
            g1,
            g2,
            OverlayNG.DIFFERENCE,
            g1.getPrecisionModel(),
            new ValidatingNoder(new SnappingNoder(1e-4))
    );

    assertTrue(overlayResult.isEmpty());
}

In this case OverlayUtil#isResultAreaConsistent is reached once, and the expression in isGreater check works out to be:

Test case 2

This case passes with the same data:

@Test       // Passes
void test() throws ParseException {
    Geometry p1 = new WKTReader().read("MULTIPOLYGON (((0.37676311 2.57570853, 7.28652472 0.00028375, 7.60034931 0.81686059, 0.50229292 3.4551325, 0.37676311 2.57570853)))");
    Geometry p2 = new WKTReader().read("MULTIPOLYGON (((0.50229292 3.4551325, 7.60034931 0.81686059, 7.28652472 0.00028375, 0.37676311 2.57570853, 0.50229292 3.4551325)))");

    Geometry result = p1.difference(p2);

    assertEquals(result.getArea(), 0);
}

In this case it seems OverlayNGRobust is used instead, which results in multiple calls to OverlayUtil#isResultAreaConsistent (with slightly different values) and eventually it succeeds in passing the heuristic check.

Question

To me this feels like a bug in the heuristic check, but maybe our utility function for the difference operation is implemented in a suboptimal way?

If it's indeed a bug, and the fix is relatively simple (for someone who does not know JTS internals well), we'd be happy to contribute a PR with the fix.

dr-jts commented 1 year ago

Thanks for the detailed bug report. Sorry for the delay in looking into this. It looks like a design flaw in the heuristic area check.

In this case the result is an empty geometry with zero area. The input geometries are essentially identical, but have rotated vertex sequences. Due to numeric precision issues (the eternal bugbear) the areas evaluate to be slightly different. This causes the area heuristic check to fail, incorrectly.

A : MULTIPOLYGON [ 1 ] - 5 pts
    Len: 16.709769332012375  Area: 6.511982813837381
B : MULTIPOLYGON [ 1 ] - 5 pts
    Len: 16.70976933201237  Area: 6.51198281383738

The heuristic needs to be modified to be less stringent when evaluating very small differences in area. I'm not sure what a suitable fix is, but will think about it. Suggestions welcome.

grimsa commented 1 year ago

In case of DIFFERENCE operation this is the current code:

  public static boolean isResultAreaConsistent(Geometry geom0, Geometry geom1, int opCode, Geometry result) {
    // ...
    double areaResult = result.getArea();
    double areaA = geom0.getArea();
    double areaB = geom1.getArea();
    // ...
      isConsistent = isLess(areaResult, areaA, AREA_HEURISTIC_TOLERANCE)
                  && isGreater(areaResult, areaA - areaB, AREA_HEURISTIC_TOLERANCE);

AREA_HEURISTIC_TOLERANCE is 0.1 and the comparisons are defined as

  private static boolean isLess(double v1, double v2, double tol) {
    return v1 <= v2 * (1 + tol);
  }

  private static boolean isGreater(double v1, double v2, double tol) {
    return v1 >= v2 * (1 - tol);
  }

If I understand correctly, the intention is to check that:

  1. Area of result of the difference operation (areaOfResult) is smaller or equal to the 110% of the area of the first input geometry (as when we remove something from the input geometry, it cannot become bigger)
    1. I think this should be OK as-is, as under no circumstances the area difference should be 10+% greater than area of one of the input geometries.
  2. Area of result of the difference operation (areaOfResult) is >= to 90% of the difference of areas between two input geometries.
    1. If we have no overlap between A and B, the area of the result will be full area of A, and difference of areas will be smaller (as B is non-empty). So the existing check works.
    2. If we have almost no overlap (I imagine two identical squares where square B is a translation of square A on X axis by 0.000...0001), the area of the result will be near 0 (but >0), and the difference of input geometry areas should be 0 (as squares are identical). Then the check would be isGreater(positiveAlmostZero, 0, tolerance), which means positiveAlmostZero>= 0 * 90%, i.e. the check also works.
    3. If we have two almost identical geometries (two squares where one on A one of the corner is translated on X axis by a tiny distance, and on another - the same corner is translated by a slightly different tiny distance on Y axis), then area of result will be 0, and difference between areas can be almost zero (either positive or negative, depending on A or B is larger).
      1. isGreater(0, negativeAlmostZero, tolerance) would result in return 0 >= negativeAlmostZero * 90%, which is true, so the check works.
      2. isGreater(0, positiveAlmostZero, tolerance) would result in return 0 >= positiveAlmostZero * 90%, which is false (like in the reported case), so the check does not work.

I tried to think of possible solutions, but did not come up with any definitive answer quickly (but I can think more of it if needed).

One line of thinking was to add special handling for 0 or almost-zero values. But I did not come up with a way how to come up with "what is almost zero" without hardcoding a constant, e.g. 0.000001. Maybe that could be expressed as some % of the area of A or B (whichever is smaller and non-empty)? But that would likely require passing an extra arg to isGreater.

Another idea was to base it on relative difference between sizes (one such example function is provided at the bottom of this page: https://c-faq.com/fp/fpequal.html), but then I think it does not work for non-near-zero cases as if there is no overlap between two similarly-sized polygons, the relative difference between area of the result (area A) and difference between areas (close to 0) can be very large.

dr-jts commented 1 year ago

See #1005 for a fix for this issue.

@grimsa it would be great if you can confirm the fix works.

grimsa commented 1 year ago

@dr-jts #1005 seems to fix the issue we had - thank you for the fix! It will let us benefit from JTS 1.20.0 once it is released.

However, while testing I noticed an unrelated minor change in the output of negative buffer operation since 1.19.0 - reported it in https://github.com/locationtech/jts/issues/1007