opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[META] In-place Shard Splitting #13254

Open vikasvb90 opened 4 months ago

vikasvb90 commented 4 months ago

Please describe the end goal of this project

In the RFC https://github.com/opensearch-project/OpenSearch/issues/12918, we proposed building shard-level splitting that does not require stopping write traffic on the cluster. This feature aims to solve the hot-shard or large-shard problem, which arises mostly in search workloads where data is never rolled over and is expected to remain available in the same index at all times. Large shards in such workloads are typically caused by users' custom doc IDs, which lead either to uneven shard sizes or to evenly sized but large shards on nodes. In such cases the shards of the index continue to grow and eventually become too hot or too large to be hosted on the same node. This leads to scaling bottlenecks, so this meta issue defines an overall project plan that breaks the solution to these problems down into high-level tasks.

Supporting References

RFC: In-place Shard Splitting

Issues

The following is the high-level breakdown of the project's tasks. I will keep linking GitHub issues and PRs as they are published.

Related component

Indexing

peternied commented 4 months ago

[Triage - attendees 1 2 3 4 5 6 7] @vikasvb90 Thanks for creating this RFC