Combine ILM shrink and force merge

dakrone commented 3 years ago

It's a common use case for an ILM policy to have a shrink action as well as a forcemerge action in the warm phase. However, in order to reduce DTS costs, we should investigate combining these actions.
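For reference, a minimal sketch of the kind of policy described above, built as a Python dict and printed as the JSON body for `PUT _ilm/policy/<policy-name>`. The `min_age` value and the shard and segment counts are illustrative assumptions, not values taken from this issue.

```python
import json

# Hypothetical warm phase with both actions; "7d", the shard count and the
# segment count are illustrative values only.
policy = {
    "policy": {
        "phases": {
            "warm": {
                "min_age": "7d",
                "actions": {
                    "shrink": {"number_of_shards": 1},
                    "forcemerge": {"max_num_segments": 1},
                },
            }
        }
    }
}

# JSON body that would be sent with PUT _ilm/policy/<policy-name>
print(json.dumps(policy, indent=2))
```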

Currently when performing a shrink, the following actions are taken by ILM (this is a subset):

Select a node that will perform the shrink
Relocate a single copy of each shard to the node
Perform the shrink
Shrink creates a new index on the same node, with the same number of replica shards
As the new index initializes, it is then replicated to a different node (assuming number_of_replicas=1)
Add the new index to the data stream or alias while removing the old index

The forcemerge action then performs a simple force merge of the index. Because the index already has replicas at that point, the merge is duplicated on every shard copy, and because merging is non-deterministic the segments will likely differ between the nodes, leading to replication of segments.

There are at least two things we can do to help reduce DTS costs related to this:

Shrink into an index with zero replicas

When we shrink, currently ILM creates the shrunken index with the same replica count, but since this is going on transparently in the background, there is no need to create a shrunken index with a single replica. Instead, we can create the index with zero replicas, and increase the number of replicas to the original index's count prior to deletion of the original index.

Since the shrink action in ILM is now resilient, no data loss occurs if something goes wrong, and ILM can retry.

By itself, this doesn't reduce DTS, because the data still has to be replicated across the zone boundary either way. It does help, however, when combined with the next enhancement:

Perform forcemerge prior to increasing the replica count

Forcemerge also ends up leading to replication across zone boundaries. However, if we perform the forcemerge at a point where the index has no replicas, then it only needs to be performed once, and the data will be replicated to a different zone only a single time.

If we combine both of these behaviors, the new behavior looks like:

Select a node that will perform the shrink
Relocate a single copy of each shard to the node
Perform the shrink
Shrink creates a new index on the chosen node with 0 replicas
The new index is initialized
Force merge the shrunken index
Increase the number of replicas of the force-merged, shrunken index back to the original index's count (likely 1 replica)
Add the new index to the data stream or alias while removing the old index
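The steps above can be sketched as an ordered list of REST requests. This is only an illustration using the public index APIs (update settings, `_shrink`, `_forcemerge`, cluster health, `_aliases`), not how ILM implements its internal steps. The function name, the index, node and alias names, and the use of `remove_index` for the final swap are assumptions made for the example. Nothing is sent to a cluster; the function only builds the plan.

```python
from typing import Any, Dict, List, Optional, Tuple

# (HTTP method, path, JSON body or None)
Request = Tuple[str, str, Optional[Dict[str, Any]]]


def combined_shrink_plan(
    source: str,
    target: str,
    node: str,
    alias: str,
    original_replicas: int,
    target_shards: int = 1,
) -> List[Request]:
    """Hypothetical helper: build the request sequence for the proposed flow."""
    return [
        # Relocate a copy of each shard to the chosen node and block writes.
        ("PUT", f"/{source}/_settings", {
            "settings": {
                "index.routing.allocation.require._name": node,
                "index.blocks.write": True,
            }
        }),
        # Shrink into a new index that starts with zero replicas.
        ("POST", f"/{source}/_shrink/{target}", {
            "settings": {
                "index.number_of_shards": target_shards,
                "index.number_of_replicas": 0,
            }
        }),
        # Force merge while only a single copy of each shard exists.
        ("POST", f"/{target}/_forcemerge?max_num_segments=1", None),
        # Only now restore the original replica count, so the merged
        # segments are copied to the other node a single time.
        ("PUT", f"/{target}/_settings", {
            "index": {"number_of_replicas": original_replicas}
        }),
        # Wait for the replicas to be allocated before swapping.
        ("GET", f"/_cluster/health/{target}?wait_for_status=green", None),
        # Swap the new index in for the old one in a single aliases call.
        ("POST", "/_aliases", {
            "actions": [
                {"add": {"index": target, "alias": alias}},
                {"remove_index": {"index": source}},
            ]
        }),
    ]


plan = combined_shrink_plan("logs-000001", "shrink-logs-000001", "node-1", "logs", 1)
for method, path, body in plan:
    print(method, path, body)
```

The point of the ordering is that the `_forcemerge` request comes before the replica count is raised, which is the behavior change proposed here.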

Here is a before picture: 71903033-F5B5-48DA-AD30-2DB01F26D696

And here is an after picture: 4D545A4E-3157-4B00-A796-AD4F6709E755

In both examples I treated the single node allocation rule (where ILM has to get a copy of each shard on the same node) as "smart" and not sending any data across zones. Still, this step is tedious, and it would be nice if we could skip it.

elasticmachine commented 3 years ago

Pinging @elastic/es-core-features (Team:Core/Features)

gaobinlong commented 3 years ago

@dakrone, can I work on this issue? I'm a heavy user of ILM and want to contribute more to the feature.

dakrone commented 3 years ago

@gaobinlong I appreciate the interest! For this one though, I think we should hold off. I'm not sure yet the best way to implement this, whether we want to put something solely in the shrink action, or whether we want to introduce the concept of a "logical plan" into ILM that can re-order or combine steps to be optimized.

gaobinlong commented 3 years ago

@dakrone thanks for your reply, I will keep track of this issue and follow the development of ILM.

jpountz commented 2 years ago

whether we want to put something solely in the shrink action, or whether we want to introduce the concept of a "logical plan" into ILM that can re-order or combine steps to be optimized

Maybe one argument for the latter is that we would likely want to also optimize the forcemerge + shrink + searchable_snapshot workflow to replace the step that increases the number of replicas of the shrunken index with taking a snapshot and doing a snapshot recovery?

dakrone commented 2 years ago

@jpountz yes, with a logical plan we could re-order, elide, or enhance actions to make more combinations of actions efficient.

jpountz commented 2 years ago

In addition to the DTS costs, there is another aspect of this proposal that I like a lot, which is the fact that we would reduce the CPU cost of the forcemerge operation by 2x since it would run on a single shard copy.
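As a back-of-the-envelope check of the 2x figure, here is a small sketch that counts how many shard copies run the force merge. It assumes the merge costs roughly the same CPU on every copy; the function name is hypothetical.

```python
def forcemerge_copies(number_of_replicas: int) -> int:
    """Shard copies that each run the merge: one primary plus its replicas."""
    return 1 + number_of_replicas


before = forcemerge_copies(1)  # today: merge runs on primary and replica
after = forcemerge_copies(0)   # proposal: merge runs before replicas exist
print(before / after)  # 2.0
```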

This would be a win on its own, plus we could then have more discussions about shifting some of the CPU cost from natural merges to forced merges, e.g.

Maybe our built-in index templates / ILM policies should index with index.codec: best_speed and we'd only move to index.codec: best_compression for forcemerge.
Maybe data streams and time-based indices could have a merge policy that is a bit lighter on natural merges, e.g. by decreasing the max merged segment size from 5GB to 2GB (which would need to be evaluated properly due to the potential impact on search performance) and we'd then do more merging in the forcemerge.
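A rough sketch of what the two ideas could look like as settings, written as Python dicts. The `index.merge.policy.max_merged_segment` setting and the `index_codec` option of the ILM forcemerge action exist today. The `2gb` value is the unevaluated figure floated above, and leaving the write-time codec at its default is an assumption of this example.

```python
import json

# Hypothetical index settings for a time-based index: cap naturally merged
# segments at 2gb (an unevaluated value) instead of the 5gb mentioned above.
template_settings = {
    "index.merge.policy.max_merged_segment": "2gb",
}

# Force merge action that rewrites segments with best_compression, so the
# heavier codec is only paid for once, at force merge time.
forcemerge_action = {
    "forcemerge": {
        "max_num_segments": 1,
        "index_codec": "best_compression",
    }
}

print(json.dumps(template_settings, indent=2))
print(json.dumps(forcemerge_action, indent=2))
```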

VimCommando commented 2 years ago

There is related discussion in Can we avoid force-merging all shard copies?

elastic / elasticsearch

Combine ILM shrink and force merge #73499
