nik9000 opened this issue 3 years ago
On the topic of response size, we can now stream pretty-much-arbitrarily-large responses back to clients using HTTP chunked encoding; see ChunkedToXContent
and friends. Of course that might cause problems for the client, but at least it stops Elasticsearch from blowing up in this situation.
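To make the idea concrete, here's a minimal sketch of the chunked-response approach in spirit only (the `ChunkedBody` interface and `BucketListResponse` class below are illustrative names, not the real ChunkedToXContent API): the server never materializes the whole body, it hands the HTTP layer an iterator of small fragments that can be written out with chunked transfer encoding.

```java
// Hypothetical sketch of the chunked-response idea; these names are
// illustrative, not the actual ChunkedToXContent interface.
import java.util.Iterator;
import java.util.List;

interface ChunkedBody {
    /** Lazily produce the next fragment of the response body. */
    Iterator<String> chunks();
}

class BucketListResponse implements ChunkedBody {
    private final List<String> bucketKeys;
    private final List<Long> bucketCounts;

    BucketListResponse(List<String> bucketKeys, List<Long> bucketCounts) {
        this.bucketKeys = bucketKeys;
        this.bucketCounts = bucketCounts;
    }

    @Override
    public Iterator<String> chunks() {
        // Each bucket becomes its own small chunk, so server-side heap usage
        // stays proportional to one bucket rather than the whole response.
        return new Iterator<String>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < bucketKeys.size();
            }

            @Override
            public String next() {
                String chunk = "{\"key\":\"" + bucketKeys.get(i)
                    + "\",\"doc_count\":" + bucketCounts.get(i) + "}";
                i++;
                return chunk;
            }
        };
    }
}
```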
I also encountered a user who was interested in understanding the (peak?) memory usage of their individual searches; I thought this'd be a good place to record the suggestion, since it seems it'd need exactly the kind of accurate heap-usage tracking this issue is about.
Right now the reduction phase of aggregations is the most memory intensive part of running aggs. This is weird because we reduce the amount of data by a good bit before leaving the data node. But the "result" representation that the data nodes build is so big and bloated that it takes up a ton of space. That is what we serialize over the wire and reduce across all shards. The wire size can get quite large, as can the memory cost of holding those results before reduction. I'd love to transition us to a more dense representation over the wire - something like a single named-writeable per aggregation rather than one per bucket.
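As a rough sketch of what "dense" could mean here (the class below is hypothetical, not an actual Elasticsearch internal): one object per aggregation holding parallel arrays across all of its buckets, serialized in a single pass, instead of a separate writeable object per bucket that each drags along its own framing overhead.

```java
// Hypothetical illustration of a dense per-aggregation result; not real
// Elasticsearch internals.
import java.io.DataOutputStream;
import java.io.IOException;

class DenseTermsResult {
    final String[] keys;      // one entry per bucket
    final long[] docCounts;   // parallel to keys

    DenseTermsResult(String[] keys, long[] docCounts) {
        this.keys = keys;
        this.docCounts = docCounts;
    }

    // Serialize the whole aggregation in one pass: a single header, then
    // tightly packed columns, instead of repeating per-bucket wrappers.
    void writeTo(DataOutputStream out) throws IOException {
        out.writeInt(keys.length);
        for (String key : keys) {
            out.writeUTF(key);
        }
        for (long count : docCounts) {
            out.writeLong(count);
        }
    }
}
```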
Moving to a denser wire representation like that could: