mikemccand / stargazers-migration-test

Testing Lucene's Jira -> GitHub issues migration
0 stars 0 forks source link

LatLonShapeBoundingBoxQuery could make more decisions on inner nodes [LUCENE-8860] #857

Closed mikemccand closed 5 years ago

mikemccand commented 5 years ago

Currently LatLonShapeBoundingBoxQuery with the INTERSECTS relation only returns CELL_INSIDE_QUERY if the query contains ALL minimum bounding rectangles of the indexed triangles.

I think we could return CELL_INSIDE_QUERY if the box contains either of the edges of all MBRs of indexed triangles since triangles are guaranteed to touch all edges of their MBR by definition. In some cases this would help save decoding triangles and running costly point-in-triangle computations.


Legacy Jira details

LUCENE-8860 by Adrien Grand (@jpountz) on Jun 14 2019, resolved Oct 04 2019 Attachments: fig1.png, fig2.png, fig3.png

mikemccand commented 5 years ago

I made the issue about box queries, but that would actually work for polygons too.

[Legacy Jira: Adrien Grand (@jpountz) on Jul 05 2019]

mikemccand commented 5 years ago

I tried looking into this issue but I don't think we have enough information in the inner node to make such determination. For example, if I index two polygons: one L shaped polygon and another small triangle placed inside the L-shape (see blue and green tessellated versions on fig 1):        fig1.png

Then all I have on the inner node level are minPackedValues and maxPackedValue (depicted as purple rectangles on fig 2) this doesn't give me enough information to determine if my query bounding box (red rectangle on fig 2) intersects with blue triangle or not.         fig2.png

So, unless I misunderstood the proposal, I am not really sure how to achieve that on the inner node level.

[Legacy Jira: Igor Motov (@imotov) on Aug 19 2019]

mikemccand commented 5 years ago

I suspect that the confusion is that we don't use minPackedValue and maxPackedValue directly but a combination of them. For instance the current logic short circuits the query on Inner nodes when the query shape fully contains all indexed triangles, using the minimum (x,y) coordinates from minPackedValue and the maximum (x,y) coordinates from maxPackedValue to compute a rectangle that contains all MBRs. In your example this rectangle is [(0,0), (30,20)].

The idea of this issue is that there are weaker conditions we can check that allow us to make the same conclusion. For instance if we take the minimum (x,y) coordinates from minPackedValue, the maximum x from minPackedValue and the maximum y from maxPackedValue we get a smaller rectangle that contains the left edge of all MBRs on the leaf block. If the query fully contains this rectangle then it is guaranteed to intersect all triangles of the leaf given that triangles have at least one point on the left edge of their MBR. In your example this rectangle is [(0,0), (20,20)]. We can do the same test for the right, top and bottom edges.

With your red box query, we wouldn't be able to make a decision on the inner node since it doesn't fully contain any of the 4 rectangles that contains all left, right, top and bottom edges.

[Legacy Jira: Adrien Grand (@jpountz) on Aug 20 2019]

mikemccand commented 5 years ago

Thanks! It is clear now. I think I understood the part about the bounding box queries and I opened PR based on it.  Unfortunately, I don't see how to extend this to the polygons queries. If we take a look at the fig3, the bounding box for the red query completely encapsulates the green polygon's bounding box and yet, we cannot make any conclusion about their intersection based on this information. 

              fig3.png

[Legacy Jira: Igor Motov (@imotov) on Aug 23 2019]

mikemccand commented 5 years ago

Why is it a problem? If the query polygon fully contains all left edges of the MBRs on a given node then we would know it intersects all indexed triangles, just like when the query is a box?

[Legacy Jira: Adrien Grand (@jpountz) on Sep 02 2019]

mikemccand commented 5 years ago

Commit d4ab808a8ab8c58a9ddbc7d4f108df7f1f4c0b51 in lucene-solr's branch refs/heads/master from Igor Motov https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d4ab808

LUCENE-8860: add additional leaf node level optimizations in LatLonShapeBoundingBoxQuery. (#844)

[Legacy Jira: ASF subversion and git services on Oct 04 2019]

mikemccand commented 5 years ago

Commit 800971020aa2a35f9b2ba1b76f7bca244f005f7d in lucene-solr's branch refs/heads/branch_8x from Igor Motov https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8009710

LUCENE-8860: add additional leaf node level optimizations in LatLonShapeBoundingBoxQuery. (#844)

  1. Conflicts:

    lucene/sandbox/src/java/org/apache/lucene/geo/Rectangle2D.java

[Legacy Jira: ASF subversion and git services on Oct 04 2019]

mikemccand commented 5 years ago

Thanks @imotov!

[Legacy Jira: Ignacio Vera (@iverase) on Oct 04 2019]