Open fulmicoton opened 3 months ago
Yes, we don't scan all splits in some cases:
// if client wants full count, or we are doing an aggregation, we want to run every splits.
// However if the aggregation is the tracing aggregation, we don't actually need all splits.
let run_all_splits = request.count_hits() == CountHits::CountAll
    || (request.aggregation_request.is_some()
        && !matches!(split_filter, CanSplitDoBetter::FindTraceIdsAggregation(_)));
This does not currently happen across indexes, so N indexes with one split each won't benefit from it.
Beyond that, I think the enum is missing some information needed for optimization. We may want it to carry the number up to which we underestimate, so that we can identify cases where a search can be skipped completely. Currently we may keep counting even though we have already reached the threshold.
pub enum CountHits {
    /// Count all hits, querying all splits.
    CountAll = 0,
    /// Give an underestimate of the number of hits, possibly skipping entire
    /// splits if they are otherwise not needed to fulfill a query.
    Underestimate = 1,
}
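To make the threshold idea concrete, here is a minimal sketch, not the actual Quickwit code. `CountHitsSketch`, `UnderestimateUpTo`, `can_skip_split`, and the `split_can_improve_top_hits` flag are all hypothetical names; the flag stands in for whatever the split filter would decide, and aggregations are ignored for simplicity.

```rust
/// Hypothetical variant of `CountHits` that carries the threshold.
/// This is an illustration only, not the enum used in the codebase.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CountHitsSketch {
    /// Count all hits, querying all splits.
    CountAll,
    /// The count only needs to be accurate up to `up_to` hits.
    /// `up_to == 0` models the case where no count is requested at all.
    UnderestimateUpTo { up_to: u64 },
}

/// Returns true if a remaining split can be skipped entirely.
///
/// `split_can_improve_top_hits` is an assumed input: true when the split
/// could still contribute documents to the requested top hits.
pub fn can_skip_split(
    count_hits: CountHitsSketch,
    hits_counted_so_far: u64,
    split_can_improve_top_hits: bool,
) -> bool {
    // A split that may change the top hits always has to be searched.
    if split_can_improve_top_hits {
        return false;
    }
    match count_hits {
        // An exact count needs every split.
        CountHitsSketch::CountAll => false,
        // Once the threshold is reached, further counting adds nothing.
        CountHitsSketch::UnderestimateUpTo { up_to } => hits_counted_so_far >= up_to,
    }
}
```

With `up_to: 0` this behaves like today's `Underestimate`: any split that cannot improve the top hits is skipped. With a non-zero threshold, splits keep being searched only until the running count reaches it.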
This ticket is about optimizing the case where no count at all is requested.
See if we have optimizations for the case where count is not requested. For reference, see `track_total_hits` on Elasticsearch:
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/search-your-data.html#track-total-hits
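As a rough illustration of how the Elasticsearch parameter could map onto a count mode, here is a sketch under stated assumptions. `TrackTotalHitsSketch`, `CountModeSketch`, and `count_mode` are hypothetical names, not existing types. The mapping follows the documented Elasticsearch 7.x semantics: `true` asks for an exact count, `false` disables counting, an integer sets a threshold, and the default threshold is 10,000.

```rust
/// Hypothetical model of the `track_total_hits` request parameter.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TrackTotalHitsSketch {
    /// `track_total_hits: true` or `track_total_hits: false`.
    Track(bool),
    /// `track_total_hits: <integer>`.
    UpTo(u64),
}

/// Hypothetical count mode a search could be run with.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CountModeSketch {
    /// Every split must be searched to get an exact count.
    Exact,
    /// The count only needs to be accurate up to this many hits.
    AtLeast(u64),
    /// No count requested: the case this ticket is about.
    NoCount,
}

/// Maps the request parameter to a count mode.
pub fn count_mode(track_total_hits: Option<TrackTotalHitsSketch>) -> CountModeSketch {
    match track_total_hits {
        // Elasticsearch 7.x default when the parameter is absent.
        None => CountModeSketch::AtLeast(10_000),
        Some(TrackTotalHitsSketch::Track(true)) => CountModeSketch::Exact,
        Some(TrackTotalHitsSketch::Track(false)) => CountModeSketch::NoCount,
        Some(TrackTotalHitsSketch::UpTo(threshold)) => CountModeSketch::AtLeast(threshold),
    }
}
```

The `NoCount` branch is where the optimization discussed here would apply, since no split has to be searched for the sake of the count alone.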