opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[Enhancement] Cluster health API uses 36% of memory allocations during index creation #11684

Open amkhar opened 8 months ago

amkhar commented 8 months ago

Describe the bug

On a large cluster with more than 300K shards, creating an index takes more than 15 seconds. To optimize this flow, it should consume less CPU and JVM memory.

The _cluster/health API flow accounts for more than 36% of memory allocations (see the attached async-profiler image for reference): 20% goes to calculating the health and 16% to just constructing the response.

Related component

Cluster Manager

To Reproduce

  1. Create a 300-node cluster with 300K shards.
  2. Start creating a new index.
  3. Take an async-profiler alloc profile on the active cluster manager node.
  4. Observe that the _cluster/health flow accounts for more than 36% of memory allocations.

Expected behavior

Ideally, it should be optimized to use fewer resources. If possible, we can pre-compute the health for a particular cluster state version, so that whenever it is needed we do not recompute it from scratch if the version is unchanged.
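The pre-computation idea can be sketched roughly as follows. This is only a minimal illustration, not OpenSearch code: `ClusterStateSnapshot`, `HealthSummary`, `VersionedHealthCache`, and the green/yellow rule in `simpleHealth` are hypothetical stand-ins, and the sketch assumes the health result depends only on the cluster state version (a real implementation would also have to account for request parameters such as index filters or `level`).

```java
import java.util.function.Function;

/** Hypothetical, simplified stand-in for a cluster state; not an OpenSearch class. */
final class ClusterStateSnapshot {
    final long version;
    final int unassignedShards;

    ClusterStateSnapshot(long version, int unassignedShards) {
        this.version = version;
        this.unassignedShards = unassignedShards;
    }
}

/** Hypothetical health result, tagged with the state version it was computed from. */
final class HealthSummary {
    final long stateVersion;
    final String status;

    HealthSummary(long stateVersion, String status) {
        this.stateVersion = stateVersion;
        this.status = status;
    }
}

/** Keeps the most recent health result and reuses it while the version is unchanged. */
final class VersionedHealthCache {
    private final Function<ClusterStateSnapshot, HealthSummary> compute;
    private HealthSummary cached;   // guarded by "this"
    private int computations;       // counts how often compute actually ran

    VersionedHealthCache(Function<ClusterStateSnapshot, HealthSummary> compute) {
        this.compute = compute;
    }

    synchronized HealthSummary get(ClusterStateSnapshot state) {
        // Recompute only when nothing is cached or the cluster state version changed.
        if (cached == null || cached.stateVersion != state.version) {
            cached = compute.apply(state);
            computations++;
        }
        return cached;
    }

    synchronized int computations() {
        return computations;
    }

    /** Deliberately simplified health rule (assumption): green if nothing is unassigned. */
    static HealthSummary simpleHealth(ClusterStateSnapshot state) {
        String status = state.unassignedShards == 0 ? "green" : "yellow";
        return new HealthSummary(state.version, status);
    }
}
```

With this shape, repeated health requests against the same cluster state version return the cached object, and the expensive computation runs once per version. Caching only the latest version keeps memory bounded, at the cost of recomputing if an older version is ever requested again.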

Additional Details

Screenshots

[Image: github-upload-cluster-health-takes-36%-allocations-during-index-creation-1]

SwethaGuptha commented 1 week ago

Addressed in https://github.com/opensearch-project/OpenSearch/pull/15492