Closed mikemccand closed 5 years ago
The changes in src/java look good to me. Maybe we could make the test a bit better:
Some minor comments:
[Legacy Jira: Adrien Grand (@jpountz) on Aug 22 2019]
✔ +1 overall |
---|
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
Prechecks | |||
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
master Compile Tests | |||
+1 | compile | 0m 50s | master passed |
Patch Compile Tests | |||
+1 | compile | 0m 48s | the patch passed |
+1 | javac | 0m 48s | the patch passed |
+1 | Release audit (RAT) | 0m 48s | the patch passed |
+1 | Check forbidden APIs | 0m 48s | the patch passed |
+1 | Validate source patterns | 0m 48s | the patch passed |
Other Tests | |||
+1 | unit | 16m 50s | core in the patch passed. |
21m 21s |
Subsystem | Report/Notes |
---|---|
JIRA Issue | LUCENE-8956 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12978313/LUCENE-8956.patch |
Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | ant |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
git revision | master / 1cbc5eaf51 |
ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
Default Java | LTS |
Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/201/testReport/ |
modules | C: lucene/core U: lucene/core |
Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/201/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
[Legacy Jira: Lucene/Solr QA on Aug 23 2019]
Thanks for your review @jpountz, I've updated the patch. The one area I'm struggling to think of a good way is the assertion you mentioned: usually, asserting on length doesn't feel like a good check to me, but, for this case, I did want to make sure the logic is exercised, so the first query is a term query that will match most of the documents in the set, and the second is a phrase that should match less and thus return different scores. So, I've added a loop that asserts the topN scores are in fact ordered.
Do you recommend to remove some of the randomization and assert on specific scores? Or is there another strategy I'm not thinking of?
[Legacy Jira: Paul Sanwald (@pcsanwald) on Aug 23 2019]
✔ +1 overall |
---|
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
Prechecks | |||
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
master Compile Tests | |||
+1 | compile | 0m 26s | master passed |
Patch Compile Tests | |||
+1 | compile | 0m 25s | the patch passed |
+1 | javac | 0m 25s | the patch passed |
+1 | Release audit (RAT) | 0m 25s | the patch passed |
+1 | Check forbidden APIs | 0m 25s | the patch passed |
+1 | Validate source patterns | 0m 25s | the patch passed |
Other Tests | |||
+1 | unit | 28m 21s | core in the patch passed. |
31m 3s |
Subsystem | Report/Notes |
---|---|
JIRA Issue | LUCENE-8956 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12978446/LUCENE-8956.patch |
Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | ant |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
git revision | master / f335ac9 |
ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
Default Java | LTS |
Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/202/testReport/ |
modules | C: lucene/core U: lucene/core |
Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/202/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
[Legacy Jira: Lucene/Solr QA on Aug 23 2019]
I was thinking that we could verify that we have the right hits by rescoring twice, once with topN=random().nextInt(numDocs) like in your patch, and another time with topN=numDocs, then make sure that the first topN hits are the same in both cases (CheckHits#checkEquals might help).
[Legacy Jira: Adrien Grand (@jpountz) on Sep 02 2019]
@jpountz I've updated the patch with a new test :)
[Legacy Jira: Paul Sanwald (@pcsanwald) on Sep 04 2019]
✔ +1 overall |
---|
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
Prechecks | |||
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
master Compile Tests | |||
+1 | compile | 0m 42s | master passed |
Patch Compile Tests | |||
+1 | compile | 0m 41s | the patch passed |
+1 | javac | 0m 41s | the patch passed |
+1 | Release audit (RAT) | 0m 41s | the patch passed |
+1 | Check forbidden APIs | 0m 41s | the patch passed |
+1 | Validate source patterns | 0m 41s | the patch passed |
Other Tests | |||
+1 | unit | 16m 2s | core in the patch passed. |
19m 50s |
Subsystem | Report/Notes |
---|---|
JIRA Issue | LUCENE-8956 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979474/LUCENE-8956.patch |
Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | ant |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
git revision | master / 5204d0f963c |
ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
Default Java | LTS |
Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/204/testReport/ |
modules | C: lucene/core U: lucene/core |
Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/204/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
[Legacy Jira: Lucene/Solr QA on Sep 04 2019]
Commit 42fc9690bfd545d2a147f306ed08f175a0293fc0 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=42fc969
LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.
[Legacy Jira: ASF subversion and git services on Sep 05 2019]
Commit fd3ae878051ab4854a3739f18e1b982fc9bb47fa in lucene-solr's branch refs/heads/master from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd3ae87
LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.
[Legacy Jira: ASF subversion and git services on Sep 05 2019]
This looks great, thanks @pcsanwald.
[Legacy Jira: Adrien Grand (@jpountz) on Sep 05 2019]
Commit bc4a84e913549144b5bf4907c7dc7f8f1d988efe in lucene-solr's branch refs/heads/branch_8x from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bc4a84e
Revert "LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits."
This reverts commit 42fc9690bfd545d2a147f306ed08f175a0293fc0.
[Legacy Jira: ASF subversion and git services on Sep 05 2019]
Commit e1c4742abfb406fedc1e21fa17c68677687311e5 in lucene-solr's branch refs/heads/master from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e1c4742
Revert "LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits."
This reverts commit fd3ae878051ab4854a3739f18e1b982fc9bb47fa.
[Legacy Jira: ASF subversion and git services on Sep 05 2019]
I had to revert this change due to test failures such as
13:40:13 [junit4] Suite: org.apache.lucene.search.TestQueryRescorer
13:40:13 [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestQueryRescorer -Dtests.method=testRescoreIsIdempotent -Dtests.seed=2E252C4ACD6D510E -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=nb -Dtests.timezone=America/Indiana/Winamac -Dtests.asserts=true -Dtests.file.encoding=UTF8
13:40:13 [junit4] FAILURE 0.04s J3 | TestQueryRescorer.testRescoreIsIdempotent <<<
13:40:13 [junit4] > Throwable #1: junit.framework.AssertionFailedError: Hit 3 docnumbers don't match
13:40:13 [junit4] > Hits length1=30 length2=30
13:40:13 [junit4] > hit=0: doc11=0.027509231 shardIndex=-1, doc11=0.027509231 shardIndex=-1
13:40:13 [junit4] > hit=1: doc59=0.026626626 shardIndex=-1, doc59=0.026626626 shardIndex=-1
13:40:13 [junit4] > hit=2: doc94=0.025820786 shardIndex=-1, doc94=0.025820786 shardIndex=-1
13:40:13 [junit4] > hit=3: doc14=0.025403785 shardIndex=-1, doc13=0.025586303 shardIndex=-1
13:40:13 [junit4] > hit=4: doc13=0.025586303 shardIndex=-1, doc14=0.025403785 shardIndex=-1
13:40:13 [junit4] > hit=5: doc69=0.02535518 shardIndex=-1, doc69=0.02535518 shardIndex=-1
13:40:13 [junit4] > hit=6: doc37=0.024901403 shardIndex=-1, doc37=0.024901403 shardIndex=-1
13:40:13 [junit4] > hit=7: doc73=0.023868036 shardIndex=-1, doc73=0.023868036 shardIndex=-1
13:40:13 [junit4] > hit=8: doc63=0.021619176 shardIndex=-1, doc63=0.021619176 shardIndex=-1
13:40:13 [junit4] > hit=9: doc72=0.019876242 shardIndex=-1, doc72=0.019876242 shardIndex=-1
13:40:13 [junit4] > hit=10: doc50=0.01923588 shardIndex=-1, doc50=0.01923588 shardIndex=-1
13:40:13 [junit4] > hit=11: doc10=0.018147592 shardIndex=-1, doc10=0.018147592 shardIndex=-1
13:40:13 [junit4] > hit=12: doc0=0.018087288 shardIndex=-1, doc0=0.018087288 shardIndex=-1
13:40:13 [junit4] > hit=13: doc98=0.017422743 shardIndex=-1, doc98=0.017422743 shardIndex=-1
13:40:13 [junit4] > hit=14: doc47=0.016934035 shardIndex=-1, doc47=0.016934035 shardIndex=-1
13:40:13 [junit4] > hit=15: doc6=0.016177062 shardIndex=-1, doc79=0.016415473 shardIndex=-1
13:40:13 [junit4] > hit=16: doc96=0.016177062 shardIndex=-1, doc6=0.016177062 shardIndex=-1
13:40:13 [junit4] > hit=17: doc79=0.016415473 shardIndex=-1, doc96=0.016177062 shardIndex=-1
13:40:13 [junit4] > hit=18: doc55=0.015521079 shardIndex=-1, doc55=0.015521079 shardIndex=-1
13:40:13 [junit4] > hit=19: doc34=0.011353219 shardIndex=-1, doc34=0.011353219 shardIndex=-1
13:40:13 [junit4] > hit=20: doc9=0.010401427 shardIndex=-1, doc9=0.010401427 shardIndex=-1
13:40:13 [junit4] > hit=21: doc12=0.0033432373 shardIndex=-1, doc12=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=22: doc15=0.0033432373 shardIndex=-1, doc15=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=23: doc23=0.0033432373 shardIndex=-1, doc23=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=24: doc26=0.0033432373 shardIndex=-1, doc26=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=25: doc31=0.0033432373 shardIndex=-1, doc31=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=26: doc33=0.0033432373 shardIndex=-1, doc33=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=27: doc41=0.0033432373 shardIndex=-1, doc41=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=28: doc43=0.0033432373 shardIndex=-1, doc43=0.0033432373 shardIndex=-1
13:40:13 [junit4] > hit=29: doc45=0.0033432373 shardIndex=-1, doc45=0.0033432373 shardIndex=-1
13:40:13 [junit4] > for query:field:"river river"~1
13:40:13 [junit4] > at __randomizedtesting.SeedInfo.seed([2E252C4ACD6D510E:E4C37C59CD79ADE6]:0)
13:40:13 [junit4] > at junit.framework.Assert.fail(Assert.java:57)
13:40:13 [junit4] > at org.apache.lucene.search.CheckHits.checkEqual(CheckHits.java:205)
13:40:13 [junit4] > at org.apache.lucene.search.TestQueryRescorer.testRescoreIsIdempotent(TestQueryRescorer.java:142)
13:40:13 [junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
13:40:13 [junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
13:40:13 [junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
13:40:13 [junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:566)
13:40:13 [junit4] > at java.base/java.lang.Thread.run(Thread.java:834)
@pcsanwald told me he is looking into it.
[Legacy Jira: Adrien Grand (@jpountz) on Sep 05 2019]
Sorry for the noise here: The bug here was that the test that exercises rescore twice was re-using the hits object, which contains an array of ScoreDocs that are mutated by rescore. I've uploaded a new patch that I believe addresses the bug.
[Legacy Jira: Paul Sanwald (@pcsanwald) on Sep 05 2019]
Commit 13addb4d5e9125d15ab1c30ab1ae4a17fa9a3ff7 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=13addb4
LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.
[Legacy Jira: ASF subversion and git services on Sep 05 2019]
Commit 3ad6e4f0bcd0a9ec1ef3034d2c20fa666b18d2b3 in lucene-solr's branch refs/heads/master from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3ad6e4f
LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.
[Legacy Jira: ASF subversion and git services on Sep 05 2019]
Merged, thanks.
[Legacy Jira: Adrien Grand (@jpountz) on Sep 05 2019]
Commit e1c4742abfb406fedc1e21fa17c68677687311e5 in lucene-solr's branch refs/heads/jira/SOLR-13677_3 from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e1c4742
Revert "LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits."
This reverts commit fd3ae878051ab4854a3739f18e1b982fc9bb47fa.
[Legacy Jira: ASF subversion and git services on Sep 08 2019]
Commit 3ad6e4f0bcd0a9ec1ef3034d2c20fa666b18d2b3 in lucene-solr's branch refs/heads/jira/SOLR-13677_3 from Adrien Grand https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3ad6e4f
LUCENE-8956: QueryRescorer now only sorts the first topN hits instead of all initial hits.
[Legacy Jira: ASF subversion and git services on Sep 08 2019]
This patch addresses a TODO in QueryRescorer: We should not sort the full array of the results returned from rescoring, but rather only topN, when topN is less than total hits.
Made this optimization with some suggestions from @jpountz and @jimczi, this is my first lucene patch submission.
Legacy Jira details
LUCENE-8956 by Paul Sanwald (@pcsanwald) on Aug 22 2019, resolved Sep 05 2019 Attachments: LUCENE-8956.patch