elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Dense representation for aggs #77449

Open nik9000 opened 3 years ago

nik9000 commented 3 years ago

Right now the reduction phase of aggregations is the most memory-intensive part of running aggs. This is odd because we shrink the data by a good bit before it leaves the data node. But the "result" representation the data nodes build is so big and bloated that it takes up a ton of space, and that is what we serialize over the wire and reduce across all shards. The wire size can get quite large, as can the memory used to hold the results before reduction. I'd love to transition us to a denser over-the-wire representation - something like a single named-writeable per aggregation rather than one per bucket.
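To make that contrast concrete, here is a minimal, self-contained Java sketch - not the actual Elasticsearch NamedWriteable API; all class and method names here are hypothetical - of a per-bucket object tree versus a single dense result backed by parallel primitive arrays:

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

// Roughly today: one heap object (and one writeable) per bucket. Object
// headers, references, and per-bucket framing dominate heap and wire size.
record SparseBucket(long key, long docCount, double sum) {}
record SparseTermsResult(List<SparseBucket> buckets) {}

// Roughly the proposal: a single writeable per aggregation holding all
// buckets in parallel primitive arrays - no per-bucket objects at all.
record DenseTermsResult(long[] keys, long[] docCounts, double[] sums) {
    // Hypothetical serialization: one length prefix, then three flat columns.
    void writeTo(DataOutputStream out) throws IOException {
        out.writeInt(keys.length);
        for (long k : keys) out.writeLong(k);
        for (long d : docCounts) out.writeLong(d);
        for (double s : sums) out.writeDouble(s);
    }
}
```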

This could:

  1. Reduce the over-the-wire representation of dense aggregation results
  2. Reduce memory usage of aggregation results
  3. Allow us to bring our standard circuit breaker stuff to bear on the aggs reduction process - potentially allowing us to raise the max bucket limit. OTOH, we still have to turn the result into JSON, which will be big.
  4. Allow us to raise the batch reduce size, making the incremental reductions more efficient.
  5. Allow us to pull back many many many more buckets to the coordinating node so long as we don't return them over JSON. JSON/CBOR/SMILE/YAML is always going to be a very expensive way to send these results back.
     5a. If we had a non-JSON/CBOR/SMILE/YAML way to send the response back then we could send back many many buckets, especially if the response was selective in what needed to be sent back. Maybe some kind of columnar format?
     5b. If we had some way to throw out "intermediate" aggregation results that the caller isn't interested in. Say you want to get the hour of the day in which each server sent the most outgoing bytes. In that case you'd do term(server)->date_histogram->rate(bytes_out) and then select the best hour from each date_histogram, throwing out the other buckets. In a case like this the JSON response could be small but the answer could still require creating many buckets (see the sketch after this list).
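For 5b, here is a tiny sketch of that "reduce, then select" idea under simplified, assumed inputs (24 hourly rates per server; all names hypothetical). The full set of date_histogram buckets exists only transiently, and only the argmax hour per server survives into the response:

```java
import java.util.HashMap;
import java.util.Map;

class BestHourSelector {
    // For each server, scan its 24 hourly rate buckets and keep only the
    // hour with the highest rate - every other bucket is discarded before
    // anything is serialized back to the client.
    static Map<String, Integer> bestHourPerServer(Map<String, double[]> hourlyRates) {
        Map<String, Integer> best = new HashMap<>();
        hourlyRates.forEach((server, rates) -> {
            int bestHour = 0;
            for (int h = 1; h < rates.length; h++) {
                if (rates[h] > rates[bestHour]) bestHour = h;
            }
            best.put(server, bestHour);
        });
        return best;
    }
}
```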
elasticmachine commented 3 years ago

Pinging @elastic/es-analytics-geo (Team:Analytics)

DaveCTurner commented 1 year ago

On the topic of response size, we can now stream pretty-much-arbitrarily-large responses back to clients using HTTP chunked transfer encoding; see ChunkedToXContent and friends. Of course that might cause problems for the client, but at least it stops Elasticsearch from blowing up in this situation.
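As an illustration of the chunked idea - this is a self-contained sketch, not the real ChunkedToXContent interface - the response is exposed as an iterator of small serializable fragments, so only one fragment needs to be in memory at a time while the response streams out:

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.stream.IntStream;

class ChunkedResponseSketch {
    // Expose a huge response as an iterator of small chunks instead of one
    // giant buffer; chunks are generated lazily, on demand.
    static Iterator<String> chunks(int bucketCount) {
        return IntStream.range(0, bucketCount)
            .mapToObj(i -> "{\"key\":" + i + ",\"doc_count\":" + (i * 10) + "}")
            .iterator();
    }

    // Each chunk is written and released before the next one is built, so
    // the full response never materializes on the heap at once.
    static void stream(Writer out, Iterator<String> chunks) throws IOException {
        while (chunks.hasNext()) {
            out.write(chunks.next());
            out.write('\n');
        }
    }
}
```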

I also encountered a user who was interested in understanding the (peak?) memory usage of their individual searches. I thought this would be a good place to record that suggestion, since it seems it would need exactly the kind of accurate heap-usage tracking to which this issue pertains.
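A sketch of what per-search peak-heap accounting could look like (hypothetical; the real circuit-breaker plumbing is more involved than this): track a live byte count per search and fold every increase into a running high-water mark:

```java
import java.util.concurrent.atomic.AtomicLong;

class SearchMemoryTracker {
    private final AtomicLong current = new AtomicLong(); // bytes live right now
    private final AtomicLong peak = new AtomicLong();    // high-water mark

    void add(long bytes) {
        long now = current.addAndGet(bytes);
        peak.accumulateAndGet(now, Math::max); // record a new peak if reached
    }

    void release(long bytes) {
        current.addAndGet(-bytes);
    }

    long peakBytes() {
        return peak.get(); // the number the user asked for
    }
}
```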

elasticsearchmachine commented 3 months ago

Pinging @elastic/es-analytical-engine (Team:Analytics)