opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[BUG] Make AsyncShardFetch cache in GatewayAllocator bounded with eviction policy #10316

Open sam-herman opened 1 year ago

sam-herman commented 1 year ago

Describe the bug
Currently, AsyncShardFetch requests triggered during reroute in the GatewayAllocator use a cache that is not bounded. It can eventually cause the ClusterManager to run out of heap memory when clusters with many nodes and shards start up after a ClusterManager restart.

To Reproduce
When provisioning large clusters with many nodes and shards, you can take a heap dump and observe that the GatewayAllocator retains most of the heap.


Expected behavior

  1. The cache will have limits and will start evicting older entries when those limits are reached.
  2. The limits should be configurable, with a default heuristic that calculates the desired cache size and eviction policy based on heap size.
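The expected behavior above could be sketched as a heap-proportional LRU bound. This is an illustrative sketch only, not OpenSearch's actual cache API; the class name `BoundedFetchCache`, the helper `maxEntriesForHeap`, and the per-entry size constant are all assumptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a bounded LRU cache for async shard fetch results.
// Names and constants are illustrative, not part of OpenSearch.
public class BoundedFetchCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedFetchCache(int maxEntries) {
        super(16, 0.75f, true); // access-order = true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once the limit is exceeded.
        return size() > maxEntries;
    }

    // Default heuristic sketch: cap the entry count proportionally to heap,
    // assuming a rough per-entry footprint and a fixed heap budget share.
    public static int maxEntriesForHeap(long maxHeapBytes) {
        long assumedBytesPerEntry = 4_096; // placeholder estimate
        long budget = maxHeapBytes / 100;  // spend ~1% of heap on this cache
        return (int) Math.max(1, budget / assumedBytesPerEntry);
    }
}
```

A production version would more likely bound by weight (bytes) rather than entry count, since ShardEntry sizes vary with cluster topology.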
Bukhtawar commented 1 year ago

Tagging the meta issue https://github.com/opensearch-project/OpenSearch/issues/8098 for better visibility.

gtahhan commented 1 year ago

I can work on this

amkhar commented 1 year ago

@samuel-oci

Thanks for putting up this thought; it's interesting, as it will help the overall resiliency of this particular flow. We're also trying to reduce the size of the cache itself by going one level down and removing unnecessary data present in the cache, such as the number of DiscoveryNode objects, as @Bukhtawar pointed out via the meta issue for improving this area.

Quick question

Cache will have limits and will start evict older entries when reaching the limits

We're still clearing the cache after the shard is started, and it makes sense to keep the ShardEntry in the cache to avoid re-fetching the metadata from data nodes. Evicting an entry for which shard initialization has not yet been called will require us to do the async fetch again for that shard. I want to understand your thoughts on this.

sam-herman commented 1 year ago

Hey @amkhar, that's a correct understanding of the proposal. You are right to point out that without sufficient memory for the cache there will be repeated asyncFetch requests to retrieve the same entry. That is the tradeoff proposed for the case where the cache would otherwise overfill the heap. I believe @gtahhan has done some experimentation and found that the eviction penalty of re-fetching is manageable; perhaps he can provide more details.
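The tradeoff discussed here can be illustrated with a small sketch: on a miss (including a miss caused by eviction), the cache simply re-runs the fetch. The class `RefetchingCache` and the stand-in `fetchFromDataNodes` are hypothetical, not OpenSearch code:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the eviction tradeoff: an evicted entry is not an
// error, it just costs one extra fetch on the next lookup.
public class RefetchingCache {
    private final int maxEntries;
    private final Map<String, String> lru;
    public final AtomicInteger fetchCount = new AtomicInteger();

    public RefetchingCache(int maxEntries) {
        this.maxEntries = maxEntries;
        this.lru = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> e) {
                return size() > RefetchingCache.this.maxEntries;
            }
        };
    }

    // Stand-in for the async metadata fetch from data nodes.
    private String fetchFromDataNodes(String shardId) {
        fetchCount.incrementAndGet();
        return "metadata-for-" + shardId;
    }

    // Cache hit returns the stored entry; a miss (first access or
    // post-eviction) triggers a re-fetch.
    public String get(String shardId) {
        return lru.computeIfAbsent(shardId, this::fetchFromDataNodes);
    }
}
```

The experiment mentioned above would amount to measuring how often `fetchCount` grows beyond the number of distinct shards under a given cache bound.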

amkhar commented 10 months ago

Hi @gtahhan are you still working on this ?

peternied commented 9 months ago

[Triage] @samuel-oci Thanks for opening this issue.

rwali-aws commented 6 months ago

@gtahhan Please let us know in case you are still working on this.

gtahhan commented 6 months ago

Hi @rwali-aws @amkhar, sorry, I didn't get the chance to work on it as of now. If it's urgent, please feel free to unassign it from me. Thank you, and sorry again.