Make use of chunked REST response infrastructure in more APIs

original-brownbear commented 2 years ago

Now that #88311 has landed and we have the infrastructure for serializing chunked REST responses, we should make use of it to fix the massive memory usage of APIs that are known to return large responses:

[x] snapshot status API (https://github.com/elastic/elasticsearch/pull/90801)
[x] cluster state API (https://github.com/elastic/elasticsearch/pull/92285)
[x] get mappings API (#89906)
[x] indices stats API (#91760)
[x] node stats API (https://github.com/elastic/elasticsearch/issues/90097, https://github.com/elastic/elasticsearch/issues/93985)
[x] field data stats (https://github.com/elastic/elasticsearch/pull/91942)
[x] get indices API (#92034)
[x] index settings API (https://github.com/elastic/elasticsearch/pull/90326)
[ ] search API (#95661)
[x] _cat APIs (https://github.com/elastic/elasticsearch/pull/92022)
[x] field caps API (#89996)
[x] tasks API (https://github.com/elastic/elasticsearch/pull/91935)
[x] pending cluster tasks API (https://github.com/elastic/elasticsearch/pull/91929)
[x] recoveries API (#89999)
[x] segments API (https://github.com/elastic/elasticsearch/pull/90136)
[x] POST _cluster/reroute (includes the cluster state) (https://github.com/elastic/elasticsearch/pull/92615)
[ ] shard stores API (#94507)
[ ] cluster health API
[ ] mget API
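
For readers new to the idea, here is a minimal, self-contained sketch of why chunking helps with memory. It is not Elasticsearch's actual code: `ChunkedBody`, `indexDocCounts` and `encode` are hypothetical names invented for illustration. The response exposes an iterator of small fragments, and the encoder only ever buffers roughly one chunk's worth of output before handing it off, rather than materializing the whole body first.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for the chunked-response idea.
// None of these types are Elasticsearch's real interfaces.
final class ChunkedResponseSketch {

    /** A response body that can be emitted as a sequence of small fragments. */
    interface ChunkedBody {
        Iterator<String> fragments();
    }

    /** Example body: one JSON field per index, produced lazily on demand. */
    static ChunkedBody indexDocCounts(Map<String, Long> docCounts) {
        return () -> new Iterator<String>() {
            private final Iterator<Map.Entry<String, Long>> entries = docCounts.entrySet().iterator();
            private int state = 0; // 0 = opening brace, 1 = entries, 2 = closing brace, 3 = done
            private boolean first = true;

            @Override
            public boolean hasNext() {
                return state < 3;
            }

            @Override
            public String next() {
                if (state == 0) {
                    state = entries.hasNext() ? 1 : 2;
                    return "{";
                }
                if (state == 1) {
                    Map.Entry<String, Long> e = entries.next();
                    String prefix = first ? "" : ",";
                    first = false;
                    if (!entries.hasNext()) {
                        state = 2;
                    }
                    return prefix + "\"" + e.getKey() + "\":" + e.getValue();
                }
                if (state == 2) {
                    state = 3;
                    return "}";
                }
                throw new NoSuchElementException();
            }
        };
    }

    /**
     * Pulls fragments until the buffer holds at least chunkSize characters,
     * then hands the buffer off as one chunk and starts over.
     */
    static List<String> encode(ChunkedBody body, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        Iterator<String> it = body.fragments();
        while (it.hasNext()) {
            buffer.append(it.next());
            if (buffer.length() >= chunkSize) {
                chunks.add(buffer.toString()); // a real server would write to the network here
                buffer.setLength(0);
            }
        }
        if (buffer.length() > 0) {
            chunks.add(buffer.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new TreeMap<>();
        counts.put("logs-1", 10L);
        counts.put("logs-2", 25L);
        for (String chunk : encode(indexDocCounts(counts), 16)) {
            System.out.println(chunk);
        }
    }
}
```

In this sketch the chunks are collected into a list only so they can be inspected. The point is that the buffer never grows much beyond `chunkSize`, however many entries the body has.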

elasticsearchmachine commented 2 years ago

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner commented 1 year ago

I took a look at adding chunking to the cluster state API, and it is not a small task. Still, I think it's important: we have clients that monitor things by sometimes requesting the whole routing table, and they're not going away any time soon. I don't think we can just add chunking to the routing table part and make everything else a single chunk, since you pointed out elsewhere that this moves all the serialization work back onto the transport threads. So we have to do it properly. I think we can do it in stages though, starting at the bottom and using wrapAsXContentObject as needed to keep the work off the transport threads until we get to the top:

[x] Make Metadata.Custom implement ChunkedToXContent instead of ToXContentFragment. (25 implementations of toXContent as I write this, some of them pretty large).
[x] Make Metadata implement ChunkedToXContent (likely impacts PersistedClusterStateService).
[x] Make ClusterState.Custom implement ChunkedToXContent instead of ToXContentFragment. (https://github.com/elastic/elasticsearch/pull/91963)
[x] Finally make ClusterState fully chunked.
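
The bottom-up staging above can be sketched roughly as follows. This is a simplified illustration, not the real Elasticsearch types: `Chunked`, `Whole`, `wrapAsWhole`, `unchunkedParent` and `chunkedParent` are hypothetical names. It assumes the wrapping helper adapts a chunked object so that a parent which is not yet chunked can still serialize it in a single call. In the intermediate stage the leaves are chunked but the parent wraps them, and in the final stage the parent is itself chunked and simply concatenates its children's chunk iterators.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the staged migration; the fragments are opaque
// strings rather than real XContent.
final class StagedChunkingSketch {

    /** New style: the body is produced as a sequence of chunks. */
    interface Chunked {
        Iterator<String> chunks();
    }

    /** Old style: the body is written out in one call. */
    interface Whole {
        void writeTo(StringBuilder out);
    }

    /** Hypothetical analogue of the wrapping helper: drains every chunk in one go. */
    static Whole wrapAsWhole(Chunked chunked) {
        return out -> chunked.chunks().forEachRemaining(out::append);
    }

    /** A leaf that has already been migrated to the chunked style. */
    static Chunked leaf(List<String> parts) {
        return parts::iterator;
    }

    /** Intermediate stage: the parent is not chunked yet, so it wraps each chunked child. */
    static Whole unchunkedParent(List<Chunked> children) {
        return out -> {
            out.append("[");
            for (Chunked child : children) {
                wrapAsWhole(child).writeTo(out);
            }
            out.append("]");
        };
    }

    /** Final stage: the parent is chunked and lazily concatenates its children's chunks. */
    static Chunked chunkedParent(List<Chunked> children) {
        return () -> new Iterator<String>() {
            private final Iterator<Chunked> remaining = children.iterator();
            private Iterator<String> current = List.of("[").iterator();
            private boolean closed = false;

            // Move to the next non-empty source: the next child, then the closing bracket.
            private void advance() {
                while (!current.hasNext() && remaining.hasNext()) {
                    current = remaining.next().chunks();
                }
                if (!current.hasNext() && !closed) {
                    current = List.of("]").iterator();
                    closed = true;
                }
            }

            @Override
            public boolean hasNext() {
                advance();
                return current.hasNext();
            }

            @Override
            public String next() {
                advance();
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        List<Chunked> children = List.of(leaf(List.of("a", "b")), leaf(List.of("c")));

        StringBuilder whole = new StringBuilder();
        unchunkedParent(children).writeTo(whole);
        System.out.println(whole);

        List<String> pieces = new ArrayList<>();
        chunkedParent(children).chunks().forEachRemaining(pieces::add);
        System.out.println(pieces);
    }
}
```

Both stages produce the same bytes. What changes is whether the caller can pause between chunks, which is why the wrapper is only a stepping stone until the top-level object is chunked as well.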

DaveCTurner commented 1 year ago

I think that's everything but the various search APIs, for which I think we should ask for help from the search team. Should we open a separate issue for the search team about that, and then close this one?

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

javanna commented 1 year ago

Heya, I added the Search label so that this is on our radar, given that the only remaining task is ours.

javanna commented 1 year ago

We now have a Search meta issue (#95661), so I am removing the search area label from this one.

Make use of chunked REST response infrastructure in more APIs #89838