apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0
2.69k stars 1.04k forks source link

Add support for polygon holes to Geo3D [LUCENE-7154] #8209

Closed asfimport closed 8 years ago

asfimport commented 8 years ago

Real-world polygons (e.g. from ESRI) have holes in them. We need polygon support in geo3D that works in the same way.

The proposal would be to change the GeoConvexPolygon constructor to include a number of GeoPolygon inputs, each of which would specify a hole. Then, the GeoPolygonFactory.makeGeoPolygon() method would also need to accept a similar list of GeoPolygon hole descriptions.

This change is likely to be fairly complex because of the already tricky algorithm used to create convex polygons from non-convex ones, implemented in GeoPolygonFactory.


Migrated from LUCENE-7154 by Karl Wright (@DaddyWri), resolved May 16 2016 Attachments: LUCENE-7154.diff

asfimport commented 8 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@rmuir: Would you have an estimate of the number of holes typically seen for a single country in the ESRI data?

asfimport commented 8 years ago

Robert Muir (@rmuir) (migrated from JIRA)

I'm not sure, I was just playing with these geoname shapes:

http://download.geonames.org/export/dump/shapes_simplified_low.zip

asfimport commented 8 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Here is an indented version of the South Africa shape there. I imagine it should have a hole?

http://home.apache.org/\~rmuir/za.txt

asfimport commented 8 years ago

Robert Muir (@rmuir) (migrated from JIRA)

That looks to have a \~ 46-vertex hole for lesotho to me. I think its the last array at the end of the file

asfimport commented 8 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Yup, @rmuir, thanks, I see it.

Geo3D needs to split non-convex polygons into multiple convex polygons, and I'm basically trying to figure out how much work I should put into determining whether or not a given provided hole has any intersection with a given convex polygon. That's question #1. I could use all holes for each convex polygon, of course, and that's what I'll probably do for starters.

The other issue I've got is this business of clockwise vs counterclockwise ordering being meaningful. For Geo3D I just need a point that I know is inside or outside. The internal API asks the user to specify the index of a point that is convex (outside the two adjoining points if you connected them). I haven't yet found an algorithm that maps to clockwise/counterclockwise points, though. How are you handling this elsewhere?

asfimport commented 8 years ago

Robert Muir (@rmuir) (migrated from JIRA)

I haven't yet found an algorithm that maps to clockwise/counterclockwise points, though. How are you handling this elsewhere?

I think some algorithms such as area computation care about this? But we don't need that. We only have 3 operations:

Currently these work based on line crossings algorithm that is fine with things being either clockwise or counterclockwise.

asfimport commented 8 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

This is the initial patch, without tests. It's brute force when it comes to splitting up holes during breakup of non-convex polygons into convex polygons; it just includes all of them with each convex polygon created. That's not ideal because it misrepresents intersections etc., but it should generally work.

asfimport commented 8 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Cool, thanks for hacking this together! I apologize for leaving out Geo3d from my patch but I don't yet have a good understanding :)

asfimport commented 8 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Wouldn't "contains" have a different result if the sense of the polygon were different?

If you draw a polygon on a sphere (or an ellipsoid), there are two areas you might be specifying: one on one side of the shape you drew, and another on the other side.

asfimport commented 8 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Wouldn't "contains" have a different result if the sense of the polygon were different?

I don't think so. Note that our "contains" and "crosses" behave like java's Shape contains/intersects apis: where they can conservatively return false, and true respectively (its safe, just means things are potentially slower/less efficient). So e.g. for our crosses we don't try to optimize rects fully within holes (someone could add that later). For the contains, see the logic below. The only part i added was the stuff within the "game over" block, essentially i did the "minimal effort needed to correctly support holes":

  /**
   * Computes whether a rectangle is within a polygon (shared boundaries not allowed)
   */
  public boolean contains(double minLat, double maxLat, double minLon, double maxLon) {
    // check if rectangle crosses poly (to handle concave/pacman polys), then check that all 4 corners
    // are contained
    boolean contains = crosses(minLat, maxLat, minLon, maxLon) == false &&
                       contains(minLat, minLon) &&
                       contains(minLat, maxLon) &&
                       contains(maxLat, maxLon) && 
                       contains(maxLat, minLon);

    if (contains) {
      // if we cross any hole, game over
      for (Polygon hole : holes) {
        if (hole.crosses(minLat, maxLat, minLon, maxLon)) {
          return false;
        }
      }
      return true;
    } else {
      return false;
    }
  }
asfimport commented 8 years ago

Robert Muir (@rmuir) (migrated from JIRA)

And just pasting that code makes me think its buggy :)

Geo termsenum has:

    `@Override`
    protected boolean cellIntersectsShape(final double minLat, final double maxLat, final double minLon, final double maxLon) {
      return cellContains(minLat, maxLat, minLon, maxLon) || cellWithin(minLat, maxLat, minLon, maxLon)
        || cellCrosses(minLat, maxLat, minLon, maxLon);
    }

Just shows patch needs some more tests...

asfimport commented 8 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

I have an algorithm now for using ordering to determine the sense of the polygon. But I'll create another ticket for that.

asfimport commented 8 years ago

Nick Knize (@nknize) (migrated from JIRA)

If we're heading down the path of supporting OGC SFA and ISO 19107 for polygons I can add the ES code I use to do this. It's on my list to refactor from ES anyway but haven't put it here because of the "we're a search engine API". I was planning to refactor to S4J or JTS. Since that's going to slightly change with a refactor to core it makes sense to add/grow the necessary support here?

asfimport commented 8 years ago

Nick Knize (@nknize) (migrated from JIRA)

I should add: the code validates the proper ordering (right-hand rule for shell, left-hand for holes) and splits dateline-crossing polys into two shapes so dateline crossing polys can be rewritten as a Boolean.OR of two polys.

For XYZ coordinates I'm not so sure this will be necessary for Geo3d? But you tell me, Karl.

asfimport commented 8 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Geo3D has, of course, a whole different way of doing this, with its own problems. I'm currently working on the issues I've found here under the #8212 ticket. I'm making okay progress but not anywhere near done yet.

A general algorithm for detecting clockwise vs. counterclockwise is one area that is a pain for general polygons; I can detect this but ONLY if the proper "pole" point is calculable. I know when I've got a good one, but I currently can only pick one at random until I find one that qualifies. Anyway, please carry on the discussion in #8212.