Closed mikemccand closed 5 years ago
Here is a patch that wraps the boolean scorer in a constant score scorer when there is no scoring clause and the score mode is TOP_SCORES.
[Legacy Jira: Jim Ferenczi (@jimczi) on Jul 26 2019]
The approach works for me. I'm wondering whether putting this logic at the very bottom of Boolean2ScorerSupplier#get instead would also cover the case where there is a SHOULD clause in addition to the FILTER clauses, but it produces a null scorer.
[Legacy Jira: Adrien Grand (@jpountz) on Jul 26 2019]
The logic is already at the bottom of Boolean2ScorerSupplier#get but good call on the SHOULD clause that can produce a null scorer.
We can check the number of scoring clauses after the build instead of checking the number of scorer suppliers. I'll work on a fix.
[Legacy Jira: Jim Ferenczi (@jimczi) on Jul 26 2019]
Sorry, I misunderstood the logic, but the number of scoring clauses is already computed from the pruned list of scorers, so the current patch works. It's the scorer supplier that can be null, but in that case it would not appear in Boolean2ScorerSupplier.
[Legacy Jira: Jim Ferenczi (@jimczi) on Jul 26 2019]
Woops indeed you are right. +1 to the attached patch!
[Legacy Jira: Adrien Grand (@jpountz) on Jul 26 2019]
Commit b8289abeebb23b10ea02b8a27d6b6c07deaa9e50 in lucene-solr's branch refs/heads/master from jimczi https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b8289ab
LUCENE-8935: BooleanQuery with no scoring clause can now early terminate the query when the total hits is not requested.
[Legacy Jira: ASF subversion and git services on Jul 29 2019]
Commit c557e4323daaff43d041d0599b254d94f1b8d792 in lucene-solr's branch refs/heads/branch_8x from jimczi https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c557e43
LUCENE-8935: BooleanQuery with no scoring clause can now early terminate the query when the total hits is not requested.
[Legacy Jira: ASF subversion and git services on Jul 29 2019]
Closing after the 9.0.0 release
[Legacy Jira: Adrien Grand (@jpountz) on Dec 08 2021]
Today a boolean query that is composed only of filtering clauses (more than one) cannot skip documents when the search is executed in TOP_SCORES mode. However, since all documents have a score of 0, it should be possible to early terminate the query as soon as we have collected enough top hits. Wrapping the resulting boolean scorer in a constant score scorer should allow early termination in this case and would speed up the retrieval of top hits considerably if the total hit count is not requested.
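To illustrate the idea outside of Lucene, here is a minimal, self-contained sketch. It is not the attached patch and does not use any Lucene API: the class `ConstantScoreTopHitsSketch`, its `collectTopK` method and the `scoresKnownConstant` flag are all hypothetical names made up for this example. It only shows why knowing that every hit has the same score lets a top-k collection stop after k hits, whereas a collector that cannot rule out a better-scoring document later has to visit every match.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, no Lucene classes involved.
class ConstantScoreTopHitsSketch {

    /** Collected doc IDs plus the number of matching docs that were visited. */
    static final class Result {
        final List<Integer> topDocs;
        final int visited;

        Result(List<Integer> topDocs, int visited) {
            this.topDocs = topDocs;
            this.visited = visited;
        }
    }

    /**
     * Collects the top k hits from matchingDocs (doc IDs in increasing order).
     * Every hit is assumed to score 0, so ties are broken by doc ID and the
     * first k matches are always the top k.
     *
     * If scoresKnownConstant is true, the loop stops as soon as k hits are
     * collected, because no later doc can be more competitive. If it is false,
     * the loop models a collector that cannot assume this and therefore
     * visits every matching doc.
     */
    static Result collectTopK(int[] matchingDocs, int k, boolean scoresKnownConstant) {
        List<Integer> top = new ArrayList<>();
        int visited = 0;
        for (int doc : matchingDocs) {
            if (scoresKnownConstant && top.size() >= k) {
                break; // early termination: enough hits, all scores are equal
            }
            visited++;
            if (top.size() < k) {
                top.add(doc);
            }
        }
        return new Result(top, visited);
    }
}
```

With five matching documents and k = 2, the constant-score path visits 2 documents while the other path visits all 5, and both return the same two hits. The saving only applies when the total hit count is not needed, since counting all hits requires visiting every match anyway.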
Legacy Jira details
LUCENE-8935 by Jim Ferenczi (@jimczi) on Jul 26 2019, resolved Jul 29 2019
Attachments: LUCENE-8935.patch