apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0
2.67k stars 1.03k forks source link

Add DISJOINT and WITHIN support to LatLonShape queries [LUCENE-8447] #9493

Closed asfimport closed 6 years ago

asfimport commented 6 years ago

This feature will add support to LatLonShapeBoundingBoxQuery and LatLonShapePolygonQuery for searching all indexed LatLonShape types that are WITHIN, or DISJOINT to, the target query shape. INTERSECTS remains unchanged.


Migrated from LUCENE-8447 by Nick Knize (@nknize), resolved Aug 08 2018 Attachments: LUCENE-8447.patch (versions: 3)

asfimport commented 6 years ago

Nick Knize (@nknize) (migrated from JIRA)

Attached patch includes the following improvements/changes:

todo

asfimport commented 6 years ago

Adrien Grand (@jpountz) (migrated from JIRA)

Thanks Nick, it looks great. It looks like INTERSECTS now always creates dense bitsets as opposed to filling a sparse DocIdSetBuilder (the one created in LatLonShapeQuery is unused). I see how it's helping with code reuse but maybe we should try to improve it in follow-ups. Regarding your comment about whether the inverse optimization makes sense with DISJOINT and WITHIN, I think it has limited benefits indeed. I wonder whether we should drop this optimization entirely, as well as maybe the other one that checks whether all documents match, in order to make shapes easier to maintain since I don't think these optimizations would often kick in in practice. I suspect it will at least help to remove them temporarily while we are iterating, even if we eventually decide to add them back once the code stabilizes.

asfimport commented 6 years ago

Nick Knize (@nknize) (migrated from JIRA)

Thanks @jpountz I didn't like the fact that INTERSECT was changed to use a dense bitset so I fixed up the patch to only use dense bitsets for WITHIN and DISJOINT queries. I did keep the inverse optimization for INTERSECT queries since I think it adds value for the point case? And I left the matchAllDocs optimization because it does help optimize queries over large spatial areas. I also cleaned up the javadocs a bit. Let me know your thoughts and I can commit for further benchmarking and iterating.

asfimport commented 6 years ago

Lucene/Solr QA (migrated from JIRA)

-1 overall
Vote Subsystem Runtime Comment
Prechecks
+1 test4tests 0m 0s The patch appears to include 5 new or modified test files.
master Compile Tests
+1 compile 0m 28s master passed
Patch Compile Tests
+1 compile 0m 25s the patch passed
+1 javac 0m 25s the patch passed
+1 Release audit (RAT) 0m 23s the patch passed
+1 Check forbidden APIs 0m 19s the patch passed
+1 Validate source patterns 0m 19s the patch passed
Other Tests
+1 unit 10m 35s core in the patch passed.
-1 unit 1m 58s sandbox in the patch failed.
15m 12s
Reason Tests
Failed junit tests lucene.document.TestLatLonLineShapeQueries
Subsystem Report/Notes
JIRA Issue LUCENE-8447
JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12934724/LUCENE-8447.patch
Optional Tests compile javac unit ratsources checkforbiddenapis validatesourcepatterns
uname Linux lucene1-us-west 4.4.0-130-generic #156\~14.04.1-Ubuntu SMP Thu Jun 14 13:51:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool ant
Personality /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
git revision master / 38bf976
ant version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018
Default Java 1.8.0_172
unit https://builds.apache.org/job/PreCommit-LUCENE-Build/65/artifact/out/patch-unit-lucene_sandbox.txt
Test Results https://builds.apache.org/job/PreCommit-LUCENE-Build/65/testReport/
modules C: lucene lucene/core lucene/sandbox U: lucene
Console output https://builds.apache.org/job/PreCommit-LUCENE-Build/65/console
Powered by Apache Yetus 0.7.0 http://yetus.apache.org

This message was automatically generated.

asfimport commented 6 years ago

Nick Knize (@nknize) (migrated from JIRA)

Fixed up the patch. WITHIN validation was missing from the bounding box test in LineValidator.

asfimport commented 6 years ago

Adrien Grand (@jpountz) (migrated from JIRA)

+1 I'd just keep the transpose() method as an impl detail rather than add it to the points API.

asfimport commented 6 years ago

Lucene/Solr QA (migrated from JIRA)

+1 overall
Vote Subsystem Runtime Comment
Prechecks
+1 test4tests 0m 0s The patch appears to include 5 new or modified test files.
master Compile Tests
+1 compile 0m 26s master passed
Patch Compile Tests
+1 compile 1m 34s the patch passed
+1 javac 1m 34s the patch passed
+1 Release audit (RAT) 1m 13s the patch passed
+1 Check forbidden APIs 1m 2s the patch passed
+1 Validate source patterns 1m 2s the patch passed
Other Tests
+1 unit 33m 7s core in the patch passed.
+1 unit 3m 54s sandbox in the patch passed.
42m 50s
Subsystem Report/Notes
JIRA Issue LUCENE-8447
JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12934824/LUCENE-8447.patch
Optional Tests compile javac unit ratsources checkforbiddenapis validatesourcepatterns
uname Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool ant
Personality /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
git revision master / 9b418a4
ant version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018
Default Java 1.8.0_172
Test Results https://builds.apache.org/job/PreCommit-LUCENE-Build/68/testReport/
modules C: lucene lucene/core lucene/sandbox U: lucene
Console output https://builds.apache.org/job/PreCommit-LUCENE-Build/68/console
Powered by Apache Yetus 0.7.0 http://yetus.apache.org

This message was automatically generated.

asfimport commented 6 years ago

ASF subversion and git services (migrated from JIRA)

Commit cbaedb470c8ddc7b21e8ff0b2729c6cf97fbd3d0 in lucene-solr's branch refs/heads/master from @nknize https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cbaedb4

LUCENE-8447: Add DISJOINT and WITHIN support to LatLonShape queries

asfimport commented 6 years ago

ASF subversion and git services (migrated from JIRA)

Commit 3102c21e004d7384beff245f61bf8cb79639ebae in lucene-solr's branch refs/heads/branch_7x from @nknize https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3102c21

LUCENE-8447: Add DISJOINT and WITHIN support to LatLonShape queries

asfimport commented 6 years ago

Nick Knize (@nknize) (migrated from JIRA)

All feedback incorporated. Thank you for all of your help @jpountz.