[ILM] Allow ILM and CCR to work well together

colings86 commented 5 years ago

If an index is a CCR leader or follower index then the delete and shrink actions should wait before proceeding with any operations. This is to avoid the problems described under the original problem description.

Leader indices

ILM needs to query the indices stats api and check the shard history retention leases in order to determine whether an index is a leader index.

If an index is a leader index then the delete and shrink actions first need to execute the following steps:

Set the index.lifecycle.indexing_complete index setting to true.
Periodically query the indices stats api and check whether there are no shard history retention leases for the leader index.

After this it is safe to proceed with any steps that are part of the ILM delete and shrink actions.
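
The polling check in the second step could look roughly like the sketch below. It is only an illustration: it assumes the shard-level indices stats response has already been fetched and parsed into a dict, and that each shard copy reports its leases under `retention_leases.leases`. The function name `has_retention_leases` is hypothetical.

```python
from typing import Any, Dict


def has_retention_leases(stats: Dict[str, Any], index: str) -> bool:
    """Return True while any shard copy of `index` still reports a retention lease.

    `stats` is assumed to be a parsed shard-level indices stats response:
    stats["indices"][index]["shards"][shard_id] -> list of shard copies.
    """
    shards = stats.get("indices", {}).get(index, {}).get("shards", {})
    for copies in shards.values():
        for copy in copies:
            # Assumed shape: {"retention_leases": {"leases": [...]}}
            leases = copy.get("retention_leases", {}).get("leases", [])
            if leases:
                return True
    return False
```

A delete or shrink step would keep waiting while this returns True for the leader index and proceed once it returns False.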

Follower indices

ILM needs to check an index's custom index metadata to check whether an index is a follower index. If an index is a follower index then the shrink action first needs to execute the following steps:

Wait for the index.lifecycle.indexing_complete index setting to be replicated from the leader index.
Then wait for the follower index's global checkpoint to be equal to the leader index's global checkpoint.
Pause index following for the follower index. (This will release any shard history retention leases the follower index has on its leader index.)
Close the follower index.
Unfollow the follower index. (Only closed indices can be unfollowed, because it changes the internal engine for all shards.)
Open the unfollowed index.

After this it is safe to proceed with any steps that are part of the ILM shrink action.
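
The ordering of the follower-side steps above can be sketched as follows. This is not how ILM implements it; it is a minimal illustration that assumes the two wait conditions have already been evaluated by the caller and that requests are sent through a caller-supplied `send` callback. The names `UNFOLLOW_SEQUENCE`, `caught_up` and `unfollow_when_ready` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Ordered requests for: pause following, close, unfollow, open.
UNFOLLOW_SEQUENCE: List[Tuple[str, str]] = [
    ("POST", "/{index}/_ccr/pause_follow"),
    ("POST", "/{index}/_close"),
    ("POST", "/{index}/_ccr/unfollow"),
    ("POST", "/{index}/_open"),
]


def caught_up(leader: Dict[int, int], follower: Dict[int, int]) -> bool:
    """True when every shard's global checkpoint matches on leader and follower."""
    return leader == follower


def unfollow_when_ready(
    index: str,
    indexing_complete: bool,
    leader: Dict[int, int],
    follower: Dict[int, int],
    send: Callable[[str, str], None],
) -> bool:
    """Run the unfollow sequence only once both wait conditions hold."""
    if not indexing_complete or not caught_up(leader, follower):
        return False  # keep waiting, nothing is sent
    for method, template in UNFOLLOW_SEQUENCE:
        send(method, template.format(index=index))
    return True
```

Keeping the sequence as data makes the required order (pause, close, unfollow, open) explicit and easy to check.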

Tasks

[x] Change the delete and shrink actions to safely handle CCR leader indices. https://github.com/elastic/elasticsearch/pull/38140
[x] Implement an Unfollow action https://github.com/elastic/elasticsearch/pull/36970
[x] Inject Unfollow action/steps before Shrink and Rollover actions https://github.com/elastic/elasticsearch/pull/37625

Original problem description

Currently if a user wishes to use CCR and ILM together on the same index they can run into problems. To help describe these problems imagine we have two clusters (for this discussion I'm going to call them leader and follower) and we are using CCR's auto-follow on the follower cluster to follow any indices on the leader cluster matching test-*.

Now, because our scenario is a time series use case, it would also be good to have ILM manage the indices, so we set up a policy on the leader cluster which uses rollover, warm allocation, forcemerge, and shrink. Then we add the policy name to the index template for test-*, bootstrap ILM by creating the first index, and now we have ILM working on our leader cluster and managing the test-* indices.
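
For illustration, the leader-side setup might look something like the request bodies below, written as Python dicts. All concrete values here are assumptions made up for the example: the policy name `test-policy`, the alias `test-alias`, the `box_type` attribute and the thresholds do not come from this issue.

```python
# Hypothetical ILM policy body: rollover in hot, then allocate,
# forcemerge and shrink in warm.
LEADER_POLICY = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {"rollover": {"max_size": "50gb", "max_age": "30d"}}
            },
            "warm": {
                "actions": {
                    "allocate": {"require": {"box_type": "warm"}},
                    "forcemerge": {"max_num_segments": 1},
                    "shrink": {"number_of_shards": 1},
                }
            },
        }
    }
}

# Hypothetical index template body that attaches the policy to test-* indices.
INDEX_TEMPLATE = {
    "index_patterns": ["test-*"],
    "settings": {
        "index.lifecycle.name": "test-policy",
        "index.lifecycle.rollover_alias": "test-alias",
    },
}

# Hypothetical body for creating the first index, which bootstraps rollover.
BOOTSTRAP_INDEX = {"aliases": {"test-alias": {"is_write_index": True}}}
```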

Problem 1 - Setting up a policy for the following indices

Having the test-* indices managed by ILM on the leader cluster is great, but equally we would like ILM to manage the following indices on the follower cluster too. However, we can't use the exact same policy on the follower cluster because the following index will not have the write alias, and even if it did we don't want the following index to roll over on its own criteria; we want it to mirror the leader index. This means the following index needs an indication that the leader has rolled over and moved to the warm phase so the following index also knows it can move to the warm phase.

Problem 2 - The leader index and the shrink action

In ILM the shrink action allocates one copy of each shard to a single node, then performs the shrink operation and then deletes the original (un-shrunk) index and sets an alias on the new (shrunken) index with the same name as the original index. This allows the naive user to search the index as if it was still the same index but under the covers the index is a different index.

The problem when combining this with CCR is that the following index may not be completely up to date with the leader index at the point the shrink action is performed. It may suddenly discover that the leader index no longer exists and be unable to progress, since there is no way for it to know that the index is equivalent to the shrunken index on the leader. This means that the follower and leader clusters are indefinitely out of sync.

One solution to this would be for the un-shrunken following index to delete itself and for there to be a separate auto-follow rule to sync the shrunken indices from the leader. The problem with this is that it requires the follower's shrunken index to be synced from scratch, copying all the same data it already had in the un-shrunken index. That is a waste of resources, but more importantly it means there is a period where the follower cluster will actually be getting further out of sync with the leader, since it has thrown away the un-shrunken index and is waiting to fully sync the shrunken index from the leader.

elasticmachine commented 5 years ago

Pinging @elastic/es-core-infra

elasticmachine commented 5 years ago

Pinging @elastic/es-distributed

martijnvg commented 5 years ago

@gwbrown and I started to think about a high level plan for how to make ILM work with CCR. It is quite abstract since it is unknown yet how certain primitives will be implemented in CCR.

The high level plan is built on the assumption that ILM is currently targeted at time-series use cases, in which there is a period where documents are indexed (the write period), followed by a period where no documents are indexed or updated ever again (the read-only period). The other assumption is that each cluster has its own ILM policies. A leader cluster shouldn't propagate actions / steps to a follower cluster and vice versa.

CCR will add a mechanism to indicate in the leader cluster that a leader index is being followed by one or more indices from another cluster. A follower cluster should indicate this prior to following a leader index in the leader cluster.

CCR will also add a mechanism for ILM to specify that a leader index is “done indexing” and will be read-only thereafter - that is, it has left the write period and entered the read-only period. The fact that a leader index is read only is also replicated to the follower index. Once a leader index has been marked as “done indexing”, follower clusters will continue following until they have replicated all updates, then automatically unfollow (1). CCR in the following cluster will then indicate to the leader cluster that this follower index no longer follows the leader index.

On the leader cluster:

ILM will be able to use these mechanisms to mark an index as “done indexing” once the index has been rolled over.
ILM will wait to execute actions / steps that are destructive for CCR leader indices. These ILM steps wait until the indication that an index has followers has been removed. (0 indices are following a leader index)

On the follower cluster:

ILM will not execute actions / steps that are destructive for CCR on follower indices. These ILM actions / steps will wait until follower indices have been unfollowed and the read only attribute has been set.

1: This would be an operation that: pauses index following, closes, unfollows and then opens the index.

martijnvg commented 5 years ago

We discussed yesterday in more detail what the read only attribute is, in order to let ILM safely perform destructive operations on the follower side.

ILM can use the readonly action to set the index.blocks.write index setting. If that index is being followed then CCR will automatically replicate that to the corresponding follower index in a follower cluster. ILM in a following cluster can then automatically unfollow a follower index if the index.blocks.write index setting has been set and the global checkpoints of the shards in the follower index are equal to the global checkpoints in the corresponding leader index. When the follower index has been unfollowed then ILM is allowed to execute destructive operations on this index (shrink or delete action), since it has become a regular index. ILM should not perform destructive operations on follower indices (this can be checked by checking custom metadata in IndexMetaData).

This part of the CCR ILM integration can already be implemented, so we should start soon.

colings86 commented 5 years ago

We discussed using the index.blocks.write index setting before and said that it's not a good idea to use this setting and that we should have a different setting for this. This is because users may set the write block themselves for other reasons and then unset it later intending to keep indexing into that index. Therefore we cannot be sure this setting means the indexing is complete for that index. I think we should do this on a separate setting. I opened https://github.com/elastic/elasticsearch/issues/35944 this morning, which is basically the same mechanism but using a different setting; it would solve this part of the CCR ILM issue and also be good for other use cases too. We can also make sure that once this new setting is set it cannot be unset, which I think is important for these cases.

colings86 commented 5 years ago

ILM should not perform destructive operations on follower indices (this can be checked by checking custom metadata in IndexMetaData).

Is this saying that follower indexes should never run shrink or delete? If so I don't think this restriction is necessary since the index will no longer be following by the time it gets to the warm phase

EDIT: actually I see what this means now. We should not perform destructive operations if the follower index is still following. In this case we should go to the ERROR step. In the case that things have worked as expected the index would not be a follower index anymore but a regular index so would be fine to run destructive operations

martijnvg commented 5 years ago

This is because users may set the write block themselves for other reasons and then unset it later intending to keep indexing into that index.

This was discussed yesterday and the conclusion was that this is a problem in other scenarios as well. Something like a readonly api should be built that turns an index into a read only index forever. But that should be tackled outside ccr / ilm.

Is this saying that follower indexes should never run shrink or delete?

Well not when it is actively following a leader index. The index would first need to be unfollowed before that could be done.

martijnvg commented 5 years ago

actually I see what this means now. We should not perform destructive operations if the follower index is still following. In this case we should go to the ERROR step. In the case that things have worked as expected the index would not be a follower index anymore but a regular index so would be fine to run destructive operations

:+1:

colings86 commented 5 years ago

This was discussed yesterday and the conclusion was that this is a problem in other scenarios as well. Something like a readonly api should be built that turns an index into a read only index forever. But that should be tackled outside ccr / ilm.

I agree that we should build a readonly API that makes the index readonly forever, but I don't think using the write block is a good idea in the interim. Users may already be used to using the write block as part of maintenance workflows. If the index reacts to that setting at a time that the user does not expect then the results are pretty bad, since the index will un-follow the leader when it's up to date, meaning it is not going to be a copy of the leader. I think we should use something other than the write block.

martijnvg commented 5 years ago

I see, that makes sense. So instead of relying on index.blocks.write, ILM on a follower cluster would rely on something like index.lifecycle.index_complete. Should this new index setting also control whether a write block is set? Because otherwise documents can still be indexed while this setting is set.

colings86 commented 5 years ago

Should this new index setting also control whether a write block is set? Because otherwise documents can still be indexed while this setting is set.

This is definitely up for discussion but I think we need something that has a clear intention and meaning that the index is not intended to be written to again

gwbrown commented 5 years ago

In the ILM sync, we decided that we're going to move ahead with a solution as proposed in #35944 (index_complete setting or similar) as it not only addresses the issue with following indices, but also other issues we're facing around Beats integration with ILM. More detail will be added to that issue shortly.

bleskes commented 5 years ago

actually I see what this means now. We should not perform destructive operations if the follower index is still following. In this case we should go to the ERROR step. In the case that things have worked as expected the index would not be a follower index anymore but a regular index so would be fine to run destructive operations

Small comment here - I was thinking that we add a post rollover/pre-shrink wait condition that waits for the index to have no followers. It may take a bit for all the followers to catch up with the leader index and unsubscribe.

colings86 commented 5 years ago

++ I was thinking the same thanks for commenting on it explicitly

gwbrown commented 5 years ago

@martijnvg and I discussed this briefly on Slack, and while the high-level concepts for handling ILM and CCR are sound, there are a few options for how we can handle the details of doing so, particularly in the interface that will be used for writing ILM policies for indices which interact with CCR.

Option 1: Add an "unfollow_when_ready" action

This is the simplest option in that it keeps the logic for unfollowing an index contained to one new explicit action. This would likely take the form of a new Hot phase action which could be used in place of Rollover, which would move to the Error step if applied to an index that is not a follower. This would require checks in certain actions (Shrink, possibly Readonly and ForceMerge) to verify that the index is not currently following as we would not be guaranteed to be able to safely perform those operations otherwise, and may involve the Rollover action verifying that it is not being used on a following index

Pros:

Requires different policies to be used on the leader and on the follower
Aligns with the intention of the Warm phase as being for after all indexing has completed (and therefore following is no longer necessary)
Fails relatively fast
Explicit about what operations are being performed
The timing of the gap in readability of the index required for unfollowing is relatively predictable

Cons:

There are two different types of policies - ones can be used on "regular" indices, and ones which can be used on "follower" indices. This may be confusing or frustrating to users.
Requires users to know about the "Unfollow" action

Option 2: Unfollow automatically in the Hot phase or as part of Rollover

This option would either add steps to the Rollover action or inject an action into the Hot phase to automatically unfollow an index once the leader has signaled that indexing is complete. This would allow policies to be more easily reused between leader and follower clusters, and may be more intuitive for the user: Once indexing is complete, the follower would automatically decouple itself from the leader and each would proceed with their policies completely independently.

Pros:

With careful design of the execution model, may allow certain policies to be copied verbatim from leader to follower
Aligns with the intention of the Warm phase as being for after all indexing has completed (and therefore following is no longer necessary)
Easy path to doing "the correct thing" for the user
The timing of the gap in readability of the index required for unfollowing is relatively predictable

Cons:

Not explicit about what operations are being performed
Requires extra complexity as each step in the unfollowing process must have a check to see if it is being used on a following index, and if not, skip execution. (due to being unable to choose between branches of steps)
Requires extra complexity either in injecting an action or in adding steps to the Rollover action
Doing this as part of Rollover means overloading Rollover to mean "Rollover or Unfollow", which is confusing, but doing it automatically means any Rollover action in a policy used on a follower action is meaningless, which is also confusing

Option 3: Unfollow automatically before dangerous actions

This option is similar to Option 2, but would perform unfollowing if and only if an action which is unsafe to perform on a following index is specified in the policy, immediately before performing the operation. This would be implemented by adding several steps to each action which cannot be safely performed on a following index (Shrink, possibly Readonly and ForceMerge) to automatically unfollow the index before performing the action.

Pros:

The gap in readability of the index required for unfollowing is only required for the indices which actually need it
Easy-ish path to doing "the correct thing" for the user
Not mutually exclusive with Option 1 - we could still have unfollowing be available as an explicit step
Putting off unfollowing until the last minute may allow e.g. replication to occur simultaneously with allocation in the Warm phase

Cons:

Requires adding lots of extra steps to actions which are not directly related to the action itself
Not explicit about what operations are being performed
It is not guaranteed that we ever unfollow the index, which may lead to the leader getting unexpectedly "stuck" on actions such as Shrink or Delete if followers retain history leases
The timing of the gap in readability of the index required for unfollowing is less predictable

Currently, my personal preference is for Option 2, though I think I could easily be swayed to Option 1. I think Option 3 is "too magical" - it's difficult to explain, less predictable, and adds a lot of complexity to the code.

colings86 commented 5 years ago

My preferences largely follow @gwbrown's.

Option 3 is quite hard to explain to a user and would add a lot of complexity to the code. Also I think for most use cases the user will not want to move to the warm phase until replication has finished, meaning that every action would need to first unfollow, and that's essentially option 2 anyway.

I think Option 1 will lead to frustration for users. I'm not sure users will see "Requires different policies to be used on the leader and on the follower" as a pro and will find it hard to understand why the policies need to be different between the leader and the follower.

My preference is therefore for option 2. Additionally I think it might be better to have an implicit action rather than having the logic built into the rollover action. My reasoning here is:

If a user is not using the rollover action but it is a following index we will still do the right thing. Explaining to a user that they need to use a rollover action on the follower index even though they don't use it on the leader index is problematic IMO
Implicit actions are easier to explain than overloading an existing action
A user can omit the rollover action on the follower index and things will work fine, which helps to mitigate "any Rollover action in a policy used on a follower action is meaningless"

I think the unfollowing process should consist of 3 steps:

1. Cluster state wait step - Checks if this is a follower index, if so checks for indexing_complete setting (this is necessary in case there is no rollover action used)
2. Async wait step - Calls CCR API to check if index is caught up (if this check can be done in cluster state we can combine this step with the previous step)
3. Async action step - Calls CCR API to unfollow leader

The second and third step above may not need to check if the index is a following index if the CCR APIs return an error or response that indicates the index is not a follower index (since we can detect that response and move to the next step)
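
As a rough sketch of how the three steps could chain together, the Python below models each step as a function that either asks to wait or lets execution move on. It is a simplified model, not ILM's step framework: `IndexState`, `Outcome`, `run_steps` and the step function names are all hypothetical, and each step simply passes through when the index is not a follower.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Outcome(Enum):
    WAIT = "wait"  # condition not met yet, retry this step later
    NEXT = "next"  # move on to the following step


@dataclass
class IndexState:
    is_follower: bool
    indexing_complete: bool
    caught_up: bool
    unfollowed: bool = False


def wait_for_indexing_complete(state: IndexState) -> Outcome:
    # Step 1: non-follower indices pass straight through.
    if not state.is_follower or state.indexing_complete:
        return Outcome.NEXT
    return Outcome.WAIT


def wait_for_caught_up(state: IndexState) -> Outcome:
    # Step 2: wait until the follower has replicated everything.
    if not state.is_follower or state.caught_up:
        return Outcome.NEXT
    return Outcome.WAIT


def unfollow_leader(state: IndexState) -> Outcome:
    # Step 3: turn the follower into a regular index.
    if state.is_follower:
        state.is_follower = False
        state.unfollowed = True
    return Outcome.NEXT


STEPS: List[Callable[[IndexState], Outcome]] = [
    wait_for_indexing_complete,
    wait_for_caught_up,
    unfollow_leader,
]


def run_steps(state: IndexState, position: int = 0) -> int:
    """Advance from `position` until a step asks to wait; return the new position."""
    while position < len(STEPS) and STEPS[position](state) is Outcome.NEXT:
        position += 1
    return position
```

A returned position equal to `len(STEPS)` means the unfollow sequence has finished and the next action in the policy can run.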

talevy commented 5 years ago

I agree, option 2 seems to be the clearest. We have a history of injecting implicit actions, and that caused some confusion. If we go with injecting an unfollow action after the rollover action, then it would be nice if we showed it in the stored policy, at least in the explain API's phase info. I am concerned this will potentially cause other confusion if we disallow explicitly defining this action when PUTing a policy, but I think it should show itself in the Explain API.

gwbrown commented 5 years ago

After talking with @talevy and @martijnvg a bit, I'd like to have a real-time discussion about which option we should go with when people are available to do so.

Additionally, @martijnvg uncovered another thing we'll have to make a decision about: CCR does not respect index templates when creating follower indices, so that approach for setting policies on new indices won't work for follower indices. There are a couple ways we could handle this, which aren't necessarily mutually exclusive:

Simply keep index.lifecycle.name as a setting that is managed by CCR and copied from the leader to the follower. This requires the policies on the leader and the follower to have the same name, although the policies themselves could be different.
Add a parameter to the CCR APIs/auto-follow patterns that allows specifying a policy name for the follower index.

This is also something I think we should discuss as a team when everyone is available again. I don't think this blocks any work, as we can test for the moment assuming Option 1 very easily, but we do need to make a final decision before shipping.

gwbrown commented 5 years ago

Following a Zoom discussion with @jakelandis, @martijnvg, @dakrone, and @talevy, we have come to a decision on the above questions.

Regarding how to specify the Unfollow action: In order to give the user maximum flexibility while also maintaining ease of use, the Unfollow action will be available as an explicit action in the Hot, Warm, and Cold phases, and will automatically run before the Shrink action (and in the future, any other actions which require it) [edit: and the Rollover action, see below]. If the index is not a follower index, Unfollow is a no-op, so we do not have concerns about this impacting non-follower indices or problems with policies that specify the Unfollow action multiple times.
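
A minimal sketch of the implicit injection described above, assuming a phase's actions can be treated as an ordered list of names. The function `with_implicit_unfollow` and the set `NEEDS_UNFOLLOW_FIRST` are hypothetical and only illustrate the rule that Unfollow runs before Shrink and Rollover unless the policy already places an explicit Unfollow ahead of them.

```python
from typing import List

# Actions that must not run while the index is still following a leader.
NEEDS_UNFOLLOW_FIRST = {"shrink", "rollover"}


def with_implicit_unfollow(actions: List[str]) -> List[str]:
    """Insert "unfollow" ahead of the first action that requires it.

    An explicit "unfollow" earlier in the list is respected, so the
    implicit one is only added when it is actually missing.
    """
    result: List[str] = []
    unfollow_seen = False
    for action in actions:
        if action == "unfollow":
            unfollow_seen = True
        elif action in NEEDS_UNFOLLOW_FIRST and not unfollow_seen:
            result.append("unfollow")
            unfollow_seen = True
        result.append(action)
    return result
```

Because Unfollow is a no-op on non-follower indices, applying this to every policy would not change behavior for regular indices.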

Regarding policy names on follower indices: We are going to require follower indices to have the same policy name as their leader index for now, while keeping in mind the possibility to add the ability to specify a different policy name to the CCR APIs if and when we determine that this is a needed feature. This would be a non-breaking change and could be easily added at that point.

jakelandis commented 5 years ago

and will automatically run before the Shrink action (and in the future, any other actions which require it)

IIRC we also discussed having it automatically run before the Rollover action too. If this is not the case we should revisit this part of the discussion. (also +1 on explicit + implicit unfollow)

gwbrown commented 5 years ago

Ah, yes, you are correct, I just forgot to write it. Doing that on rollover as well allows for reuse of policies between leader and follower, but not doing it automatically in the Hot phase gives the flexibility to control when in the lifecycle the follower is decoupled.

gwbrown commented 5 years ago

All of the tasks listed above are complete & backported to 6.x, which means the outstanding concerns we have around ILM and CCR operating on the same indices have been addressed.

gwbrown commented 5 years ago

Given a recent discovery, I'm reopening this until the following item in https://github.com/elastic/elasticsearch/issues/37165 is complete: integrate shard history retention leases with cross-cluster replication

Until then, the work already done on ILM to utilize shard history retention leases is effectively a no-op.
