Open jpountz opened 2 years ago
Pinging @elastic/es-search (Team:Search)
I take
The PR in Lucene that add Weight#count : https://github.com/apache/lucene/pull/242
@OlivierCavadenti you may be interested in https://github.com/elastic/elasticsearch/pull/81322 - some of the plumbing I had to to there is the same as you'd have to do for this.
@OlivierCavadenti you may be interested in #81322 - some of the plumbing I had to to there is the same as you'd have to do for this.
Thanks ! I think I am near something but I have two tests that fail (testCountWithoutDeletions => "should not collect more than 0 doc per segment, got 1", it seems we have a special case with hasDeletions == false ? ) and testNumericSortOptimization (java.lang.AssertionError: Expected :GREATER_THAN_OR_EQUAL_TO Actual : EQUAL_TO).
I will investigate this week-end.
I have reverted #89047, which means that we still have our own shortcutTotalHitCount which is independent from Weight#count
in Lucene. This is something we will want to address. One way I considered was creating a throw-away Weight
, but that requires duplicating the Weight
which may have a cost. Ideally, the mechanism is brought to Lucene entirely, which is something that I will look into.
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Elasticsearch has a special
shortcutTotalHitCount
hack inTopDocsCollectorContext.java
that it uses to avoid collecting all matches only to get the hit count when the hit count can otherwise be inferred from index statistics.Lucene introduced a new API that does exactly that and that should support more queries in the near future. We should cut over usage of
shortcutTotalHitCount
toWeight#count
instead.