elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Enable Circuit Breaker tracking in more parts of the aggregations framework #89437

Open not-napoleon opened 2 years ago

not-napoleon commented 2 years ago

Description

This meta issue tracks the effort to extend circuit breaker memory tracking beyond the collect phase of aggregations. Several existing issues (linked below) describe the problem well, but we need one place to track the tasks for fixing it. That's what this issue is for.

Plan

Currently (8.4), aggregations create one object per collected bucket, called an InternalAggregation. These objects are stored in the QuerySearchResult, which is responsible for serializing them from the data nodes back to the coordinating node and for deserializing them on the coordinating node. Managing these objects is quite tricky, and the current design does not provide good places to inject the circuit breaker logic.

Instead, we want to move to a dense representation, which would create one object per aggregator. These objects would be Releasable, and responsible for tracking both the post-collection memory usage on the data node and the reduce-time memory usage on the coordinating node. This involves a (big) change to the wire format used for QuerySearchResult, and doing it in a backwards-compatible way is nontrivial.
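
To make the idea concrete, here is a minimal, self-contained sketch of what "one releasable object per aggregator that accounts for its own memory" could look like. It is not the Elasticsearch implementation: `SimpleBreaker` and `DenseBucketCounts` are hypothetical names invented for illustration, `AutoCloseable` stands in for Releasable, and the byte estimate only covers the backing array.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified stand-in for a memory circuit breaker. */
final class SimpleBreaker {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    SimpleBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserves bytes, or throws if the reservation would exceed the limit. */
    void reserve(long bytes, String label) {
        long newUsed = usedBytes.addAndGet(bytes);
        if (newUsed > limitBytes) {
            usedBytes.addAndGet(-bytes); // roll back the failed reservation
            throw new IllegalStateException(
                "[" + label + "] would use " + newUsed + " bytes, limit is " + limitBytes);
        }
    }

    /** Returns previously reserved bytes to the breaker. */
    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}

/**
 * Hypothetical dense result: a single object per aggregator that keeps the
 * doc counts of all its buckets in one array, instead of one object per bucket.
 */
final class DenseBucketCounts implements AutoCloseable {
    private final SimpleBreaker breaker;
    private final long reservedBytes;
    private final long[] docCounts;
    private boolean closed;

    DenseBucketCounts(SimpleBreaker breaker, int bucketCount) {
        this.breaker = breaker;
        this.reservedBytes = (long) bucketCount * Long.BYTES;
        // Account for the memory before allocating it.
        breaker.reserve(reservedBytes, "dense_bucket_counts");
        this.docCounts = new long[bucketCount];
    }

    void increment(int bucket) {
        docCounts[bucket]++;
    }

    long docCount(int bucket) {
        return docCounts[bucket];
    }

    /** Releases the reservation exactly once, even if close() is called again. */
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            breaker.release(reservedBytes);
        }
    }
}
```

Used in a try-with-resources block, such an object holds its reservation for as long as the result is alive and returns it on close, which is the lifecycle hook that per-bucket objects stored in the QuerySearchResult do not offer today.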

Tasks


Vague Tasks

Related Issues

elasticsearchmachine commented 2 years ago

Pinging @elastic/es-analytics-geo (Team:Analytics)