QueryRescorer sort optimization [LUCENE-8956]

mikemccand commented 5 years ago

This patch addresses a TODO in QueryRescorer: We should not sort the full array of the results returned from rescoring, but rather only topN, when topN is less than total hits.

Made this optimization with some suggestions from @jpountz and @jimczi, this is my first lucene patch submission.

Legacy Jira details

LUCENE-8956 by Paul Sanwald (@pcsanwald) on Aug 22 2019, resolved Sep 05 2019 Attachments: LUCENE-8956.patch

mikemccand commented 5 years ago

The changes in src/java look good to me. Maybe we could make the test a bit better:

The name is a bit vague, as there could be lots of different optimizations in the rescoring logic. Maybe call it "testRescoreSubsetOfHits" or something along those lines?
Currently the test only checks that the correct number of hits is returned, we should also check that we got the right hits?

Some minor comments:

No need for a BooleanQuery in the test since you are adding a single clause. Or did you plan to add wordTwo in another clause?
I can't check right now but I think randomizedtesting already has a utility class called RandomPicks to help pick a random element from a list.

[Legacy Jira: Adrien Grand (@jpountz) on Aug 22 2019]

mikemccand commented 5 years ago

✔ +1 overall

Vote	Subsystem	Runtime	Comment
Prechecks
+1	test4tests	0m 0s	The patch appears to include 1 new or modified test files.
master Compile Tests
+1	compile	0m 50s	master passed
Patch Compile Tests
+1	compile	0m 48s	the patch passed
+1	javac	0m 48s	the patch passed
+1	Release audit (RAT)	0m 48s	the patch passed
+1	Check forbidden APIs	0m 48s	the patch passed
+1	Validate source patterns	0m 48s	the patch passed
Other Tests
+1	unit	16m 50s	core in the patch passed.
	21m 21s

Subsystem	Report/Notes
JIRA Issue	LUCENE-8956
JIRA Patch URL	https://issues.apache.org/jira/secure/attachment/12978313/LUCENE-8956.patch
Optional Tests	compile javac unit ratsources checkforbiddenapis validatesourcepatterns
uname	Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	ant
Personality	/home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
git revision	master / 1cbc5eaf51
ant	version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019
Default Java	LTS
Test Results	https://builds.apache.org/job/PreCommit-LUCENE-Build/201/testReport/
modules	C: lucene/core U: lucene/core
Console output	https://builds.apache.org/job/PreCommit-LUCENE-Build/201/console
Powered by	Apache Yetus 0.7.0 http://yetus.apache.org

This message was automatically generated.

[Legacy Jira: Lucene/Solr QA on Aug 23 2019]

mikemccand commented 5 years ago

Thanks for your review @jpountz, I've updated the patch. The one area I'm struggling to think of a good way is the assertion you mentioned: usually, asserting on length doesn't feel like a good check to me, but, for this case, I did want to make sure the logic is exercised, so the first query is a term query that will match most of the documents in the set, and the second is a phrase that should match less and thus return different scores. So, I've added a loop that asserts the topN scores are in fact ordered.

Do you recommend to remove some of the randomization and assert on specific scores? Or is there another strategy I'm not thinking of?

[Legacy Jira: Paul Sanwald (@pcsanwald) on Aug 23 2019]

mikemccand commented 5 years ago

✔ +1 overall

Vote	Subsystem	Runtime	Comment
Prechecks
+1	test4tests	0m 0s	The patch appears to include 1 new or modified test files.
master Compile Tests
+1	compile	0m 26s	master passed
Patch Compile Tests
+1	compile	0m 25s	the patch passed
+1	javac	0m 25s	the patch passed
+1	Release audit (RAT)	0m 25s	the patch passed
+1	Check forbidden APIs	0m 25s	the patch passed
+1	Validate source patterns	0m 25s	the patch passed
Other Tests
+1	unit	28m 21s	core in the patch passed.
	31m 3s

Subsystem	Report/Notes
JIRA Issue	LUCENE-8956
JIRA Patch URL	https://issues.apache.org/jira/secure/attachment/12978446/LUCENE-8956.patch
Optional Tests	compile javac unit ratsources checkforbiddenapis validatesourcepatterns
uname	Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool	ant
Personality	/home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
git revision	master / f335ac9
ant	version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018
Default Java	LTS
Test Results	https://builds.apache.org/job/PreCommit-LUCENE-Build/202/testReport/
modules	C: lucene/core U: lucene/core
Console output	https://builds.apache.org/job/PreCommit-LUCENE-Build/202/console
Powered by	Apache Yetus 0.7.0 http://yetus.apache.org

This message was automatically generated.

[Legacy Jira: Lucene/Solr QA on Aug 23 2019]

mikemccand commented 5 years ago

I was thinking that we could verify that we have the right hits by rescoring twice, once with topN=random().nextInt(numDocs) like in your patch, and another time with topN=numDocs, then make sure that the first topN hits are the same in both cases (CheckHits#checkEquals might help).

[Legacy Jira: Adrien Grand (@jpountz) on Sep 02 2019]

mikemccand commented 5 years ago

@jpountz I've updated the patch with a new test :)

[Legacy Jira: Paul Sanwald (@pcsanwald) on Sep 04 2019]

mikemccand commented 5 years ago

✔ +1 overall

Vote	Subsystem	Runtime	Comment
Prechecks
+1	test4tests	0m 0s	The patch appears to include 1 new or modified test files.
master Compile Tests
+1	compile	0m 42s	master passed
Patch Compile Tests
+1	compile	0m 41s	the patch passed
+1	javac	0m 41s	the patch passed
+1	Release audit (RAT)	0m 41s	the patch passed
+1	Check forbidden APIs	0m 41s	the patch passed
+1	Validate source patterns	0m 41s	the patch passed
Other Tests
+1	unit	16m 2s	core in the patch passed.
	19m 50s

Subsystem	Report/Notes
JIRA Issue	LUCENE-8956
JIRA Patch URL	https://issues.apache.org/jira/secure/attachment/12979474/LUCENE-8956.patch
Optional Tests	compile javac unit ratsources checkforbiddenapis validatesourcepatterns
uname	Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	ant
Personality	/home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
git revision	master / 5204d0f963c
ant	version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019
Default Java	LTS
Test Results	https://builds.apache.org/job/PreCommit-LUCENE-Build/204/testReport/
modules	C: lucene/core U: lucene/core
Console output	https://builds.apache.org/job/PreCommit-LUCENE-Build/204/console
Powered by	Apache Yetus 0.7.0 http://yetus.apache.org

This message was automatically generated.

[Legacy Jira: Lucene/Solr QA on Sep 04 2019]

mikemccand commented 5 years ago

Commit 42fc9690bfd545d2a147f306ed08f175a0293fc0 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=42fc969

LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.

[Legacy Jira: ASF subversion and git services on Sep 05 2019]

mikemccand commented 5 years ago

Commit fd3ae878051ab4854a3739f18e1b982fc9bb47fa in lucene-solr's branch refs/heads/master from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd3ae87

LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.

[Legacy Jira: ASF subversion and git services on Sep 05 2019]

mikemccand commented 5 years ago

This looks great, thanks @pcsanwald.

[Legacy Jira: Adrien Grand (@jpountz) on Sep 05 2019]

mikemccand commented 5 years ago

Commit bc4a84e913549144b5bf4907c7dc7f8f1d988efe in lucene-solr's branch refs/heads/branch_8x from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bc4a84e

Revert "LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits."

This reverts commit 42fc9690bfd545d2a147f306ed08f175a0293fc0.

[Legacy Jira: ASF subversion and git services on Sep 05 2019]

mikemccand commented 5 years ago

Commit e1c4742abfb406fedc1e21fa17c68677687311e5 in lucene-solr's branch refs/heads/master from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e1c4742

Revert "LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits."

This reverts commit fd3ae878051ab4854a3739f18e1b982fc9bb47fa.

[Legacy Jira: ASF subversion and git services on Sep 05 2019]

mikemccand commented 5 years ago

I had to revert this change due to test failures such as

13:40:13    [junit4] Suite: org.apache.lucene.search.TestQueryRescorer
13:40:13    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestQueryRescorer -Dtests.method=testRescoreIsIdempotent -Dtests.seed=2E252C4ACD6D510E -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=nb -Dtests.timezone=America/Indiana/Winamac -Dtests.asserts=true -Dtests.file.encoding=UTF8
13:40:13    [junit4] FAILURE 0.04s J3 | TestQueryRescorer.testRescoreIsIdempotent <<<
13:40:13    [junit4]    > Throwable #1: junit.framework.AssertionFailedError: Hit 3 docnumbers don't match
13:40:13    [junit4]    > Hits length1=30   length2=30
13:40:13    [junit4]    > hit=0: doc11=0.027509231 shardIndex=-1,    doc11=0.027509231 shardIndex=-1
13:40:13    [junit4]    > hit=1: doc59=0.026626626 shardIndex=-1,    doc59=0.026626626 shardIndex=-1
13:40:13    [junit4]    > hit=2: doc94=0.025820786 shardIndex=-1,    doc94=0.025820786 shardIndex=-1
13:40:13    [junit4]    > hit=3: doc14=0.025403785 shardIndex=-1,    doc13=0.025586303 shardIndex=-1
13:40:13    [junit4]    > hit=4: doc13=0.025586303 shardIndex=-1,    doc14=0.025403785 shardIndex=-1
13:40:13    [junit4]    > hit=5: doc69=0.02535518 shardIndex=-1,     doc69=0.02535518 shardIndex=-1
13:40:13    [junit4]    > hit=6: doc37=0.024901403 shardIndex=-1,    doc37=0.024901403 shardIndex=-1
13:40:13    [junit4]    > hit=7: doc73=0.023868036 shardIndex=-1,    doc73=0.023868036 shardIndex=-1
13:40:13    [junit4]    > hit=8: doc63=0.021619176 shardIndex=-1,    doc63=0.021619176 shardIndex=-1
13:40:13    [junit4]    > hit=9: doc72=0.019876242 shardIndex=-1,    doc72=0.019876242 shardIndex=-1
13:40:13    [junit4]    > hit=10: doc50=0.01923588 shardIndex=-1,    doc50=0.01923588 shardIndex=-1
13:40:13    [junit4]    > hit=11: doc10=0.018147592 shardIndex=-1,   doc10=0.018147592 shardIndex=-1
13:40:13    [junit4]    > hit=12: doc0=0.018087288 shardIndex=-1,    doc0=0.018087288 shardIndex=-1
13:40:13    [junit4]    > hit=13: doc98=0.017422743 shardIndex=-1,   doc98=0.017422743 shardIndex=-1
13:40:13    [junit4]    > hit=14: doc47=0.016934035 shardIndex=-1,   doc47=0.016934035 shardIndex=-1
13:40:13    [junit4]    > hit=15: doc6=0.016177062 shardIndex=-1,    doc79=0.016415473 shardIndex=-1
13:40:13    [junit4]    > hit=16: doc96=0.016177062 shardIndex=-1,   doc6=0.016177062 shardIndex=-1
13:40:13    [junit4]    > hit=17: doc79=0.016415473 shardIndex=-1,   doc96=0.016177062 shardIndex=-1
13:40:13    [junit4]    > hit=18: doc55=0.015521079 shardIndex=-1,   doc55=0.015521079 shardIndex=-1
13:40:13    [junit4]    > hit=19: doc34=0.011353219 shardIndex=-1,   doc34=0.011353219 shardIndex=-1
13:40:13    [junit4]    > hit=20: doc9=0.010401427 shardIndex=-1,    doc9=0.010401427 shardIndex=-1
13:40:13    [junit4]    > hit=21: doc12=0.0033432373 shardIndex=-1,  doc12=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=22: doc15=0.0033432373 shardIndex=-1,  doc15=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=23: doc23=0.0033432373 shardIndex=-1,  doc23=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=24: doc26=0.0033432373 shardIndex=-1,  doc26=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=25: doc31=0.0033432373 shardIndex=-1,  doc31=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=26: doc33=0.0033432373 shardIndex=-1,  doc33=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=27: doc41=0.0033432373 shardIndex=-1,  doc41=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=28: doc43=0.0033432373 shardIndex=-1,  doc43=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > hit=29: doc45=0.0033432373 shardIndex=-1,  doc45=0.0033432373 shardIndex=-1
13:40:13    [junit4]    > for query:field:"river river"~1
13:40:13    [junit4]    >   at __randomizedtesting.SeedInfo.seed([2E252C4ACD6D510E:E4C37C59CD79ADE6]:0)
13:40:13    [junit4]    >   at junit.framework.Assert.fail(Assert.java:57)
13:40:13    [junit4]    >   at org.apache.lucene.search.CheckHits.checkEqual(CheckHits.java:205)
13:40:13    [junit4]    >   at org.apache.lucene.search.TestQueryRescorer.testRescoreIsIdempotent(TestQueryRescorer.java:142)
13:40:13    [junit4]    >   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
13:40:13    [junit4]    >   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
13:40:13    [junit4]    >   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
13:40:13    [junit4]    >   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
13:40:13    [junit4]    >   at java.base/java.lang.Thread.run(Thread.java:834)

@pcsanwald told me he is looking into it.

[Legacy Jira: Adrien Grand (@jpountz) on Sep 05 2019]

mikemccand commented 5 years ago

Sorry for the noise here: The bug here was that the test that exercises rescore twice was re-using the hits object, which contains an array of ScoreDocs that are mutated by rescore. I've uploaded a new patch that I believe addresses the bug.

[Legacy Jira: Paul Sanwald (@pcsanwald) on Sep 05 2019]

mikemccand commented 5 years ago

Commit 13addb4d5e9125d15ab1c30ab1ae4a17fa9a3ff7 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=13addb4

LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.

[Legacy Jira: ASF subversion and git services on Sep 05 2019]

mikemccand commented 5 years ago

Commit 3ad6e4f0bcd0a9ec1ef3034d2c20fa666b18d2b3 in lucene-solr's branch refs/heads/master from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3ad6e4f

LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.

[Legacy Jira: ASF subversion and git services on Sep 05 2019]

mikemccand commented 5 years ago

Merged, thanks.

[Legacy Jira: Adrien Grand (@jpountz) on Sep 05 2019]

mikemccand commented 5 years ago

Commit e1c4742abfb406fedc1e21fa17c68677687311e5 in lucene-solr's branch refs/heads/jira/SOLR-13677_3 from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e1c4742

Revert "LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits."

This reverts commit fd3ae878051ab4854a3739f18e1b982fc9bb47fa.

[Legacy Jira: ASF subversion and git services on Sep 08 2019]

mikemccand commented 5 years ago

Commit 3ad6e4f0bcd0a9ec1ef3034d2c20fa666b18d2b3 in lucene-solr's branch refs/heads/jira/SOLR-13677_3 from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3ad6e4f

LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.

[Legacy Jira: ASF subversion and git services on Sep 08 2019]

mikemccand / stargazers-migration-test

QueryRescorer sort optimization [LUCENE-8956] #953

Legacy Jira details