elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

[ILM] Allow ILM and CCR to work well together #34648

Closed colings86 closed 5 years ago

colings86 commented 5 years ago

If an index is a CCR leader or follower index, then the delete and shrink actions should wait before proceeding with any operations. This avoids the problems described under "Original problem description".

Leader indices

ILM needs to query the indices stats api and check the shard history retention leases in order to determine whether an index is a leader index.
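A minimal sketch of that leader check, written against an already-fetched stats fragment rather than a live cluster. The dictionary layout (`shards` -> shard id -> list of copies -> `retention_leases.leases`) and the `"ccr"` source label are assumptions made for illustration, and the function names are hypothetical; this is not ILM's actual implementation.

```python
from typing import Any, Dict, List

# Assumption: leases taken on behalf of a follower carry this source label.
CCR_LEASE_SOURCE = "ccr"


def ccr_leases(shard_copy_stats: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the retention leases of one shard copy that look CCR-owned."""
    leases = shard_copy_stats.get("retention_leases", {}).get("leases", [])
    return [lease for lease in leases if lease.get("source") == CCR_LEASE_SOURCE]


def is_leader_index(index_stats: Dict[str, Any]) -> bool:
    """True if any shard copy of the index holds a CCR-owned retention lease."""
    for copies in index_stats.get("shards", {}).values():
        for copy in copies:
            if ccr_leases(copy):
                return True
    return False


# Hypothetical stats fragment: one primary shard with two leases,
# only one of which was taken by a follower.
example_stats = {
    "shards": {
        "0": [
            {
                "retention_leases": {
                    "leases": [
                        {"id": "follower-lease-1", "retaining_seq_no": 42, "source": "ccr"},
                        {"id": "recovery-lease-1", "retaining_seq_no": 40, "source": "peer recovery"},
                    ]
                }
            }
        ]
    }
}

leader = is_leader_index(example_stats)
```

Filtering on the lease source matters because a shard can hold retention leases for reasons unrelated to CCR, so simply checking that the lease list is non-empty would misclassify ordinary indices.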

If an index is a leader index then the delete and shrink actions first need to execute the following steps:

After this it is safe to proceed with any steps that are part of the ILM delete and shrink actions.

Follower indices

ILM needs to inspect an index's custom index metadata to determine whether it is a follower index. If an index is a follower index then the shrink action first needs to execute the following steps:

After this it is safe to proceed with any steps that are part of the ILM shrink action.
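The follower check could look roughly like the sketch below. It assumes index metadata is available as a plain dictionary with a `custom` section, and that CCR records its follower bookkeeping under a `"ccr"` key; both the layout and the key name are assumptions for illustration, as is the function name.

```python
from typing import Any, Mapping

# Assumption: CCR stores follower bookkeeping under this custom-metadata key.
CCR_METADATA_KEY = "ccr"


def is_follower_index(index_metadata: Mapping[str, Any]) -> bool:
    """True if the index carries non-empty CCR custom metadata."""
    custom = index_metadata.get("custom", {})
    return bool(custom.get(CCR_METADATA_KEY))


# Hypothetical metadata for a follower and for a regular index.
follower_metadata = {"custom": {"ccr": {"leader_index_name": "test-000001"}}}
regular_metadata = {"custom": {}}

follower = is_follower_index(follower_metadata)
regular = is_follower_index(regular_metadata)
```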

Tasks

Original problem description

Currently if a user wishes to use CCR and ILM together on the same index they can run into problems. To help describe these problems imagine we have two clusters (for this discussion I'm going to call them leader and follower) and we are using CCR's auto-follow on the follower cluster to follow any indices on the leader cluster matching test-*.

Now, because our scenario is a time series use case, it would also be good to have ILM manage the indices, so on the leader cluster we set up a policy which uses rollover, warm allocation, forcemerge, and shrink. Then we add the policy name to the index template for test-*, bootstrap ILM by creating the first index, and now we have ILM working on our leader cluster and managing the test-* indices.
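For concreteness, the leader-side setup described above might translate into request bodies like the following. This is a sketch only: the policy name, alias name, template name, thresholds, and allocation attribute are all hypothetical values chosen for illustration, and the helper just lists the requests rather than sending them.

```python
import json
from typing import Any, Dict, List, Tuple

POLICY_NAME = "test-policy"       # hypothetical policy name
WRITE_ALIAS = "test_write_alias"  # hypothetical rollover alias

# Policy with rollover in hot, then allocation, forcemerge and shrink in warm.
leader_policy: Dict[str, Any] = {
    "policy": {
        "phases": {
            "hot": {"actions": {"rollover": {"max_size": "50gb", "max_age": "7d"}}},
            "warm": {
                "actions": {
                    "allocate": {"require": {"box_type": "warm"}},
                    "forcemerge": {"max_num_segments": 1},
                    "shrink": {"number_of_shards": 1},
                }
            },
        }
    }
}

# Template that attaches the policy to every new test-* index.
index_template: Dict[str, Any] = {
    "index_patterns": ["test-*"],
    "settings": {
        "index.lifecycle.name": POLICY_NAME,
        "index.lifecycle.rollover_alias": WRITE_ALIAS,
    },
}

# First index, created by hand so that rollover has a write alias to move.
bootstrap_index: Dict[str, Any] = {"aliases": {WRITE_ALIAS: {"is_write_index": True}}}


def setup_requests() -> List[Tuple[str, str, str]]:
    """List the (method, path, JSON body) requests for the leader cluster."""
    return [
        ("PUT", f"/_ilm/policy/{POLICY_NAME}", json.dumps(leader_policy)),
        ("PUT", "/_template/test-template", json.dumps(index_template)),
        ("PUT", "/test-000001", json.dumps(bootstrap_index)),
    ]
```

The write alias in the bootstrap request is the piece that a follower index lacks, which is what Problem 1 below turns on.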

Problem 1 - Setting up a policy for the following indices

Having the test-* indices managed by ILM on the leader cluster is great, but equally we would like ILM to manage the following indices on the follower cluster too. However, we can't use the exact same policy on the follower cluster because the following index will not have the write alias, and even if it did we don't want the following index to roll over on its own criteria; we want it to mirror the leader index. This means the following index needs an indication that the leader has rolled over and moved to the warm phase, so the following index also knows it can move to the warm phase.

Problem 2 - The leader index and the shrink action

In ILM the shrink action allocates one copy of each shard to a single node, then performs the shrink operation and then deletes the original (un-shrunk) index and sets an alias on the new (shrunken) index with the same name as the original index. This allows the naive user to search the index as if it was still the same index but under the covers the index is a different index.

The problem when combining this with CCR is that the following index may not be completely up to date with the leader index at the point the shrink action is performed. It may suddenly discover that the leader index no longer exists and be unable to progress, since there is no way for it to know that the index is equivalent to the shrunken index on the leader. The follower and leader clusters are then indefinitely out of sync.

One solution to this would be for the un-shrunken following index to delete itself and for there to be a separate auto-follow rule to sync the shrunken indices from the leader. The problem with this is that it requires the follower's shrunken index to be synced from scratch, copying all the same data it already had in the un-shrunken index. That is a waste of resources but, more importantly, it means there is a period where the follower cluster will actually be getting further out of sync with the leader, since it has thrown away the un-shrunken index and is waiting to fully sync the shrunken index from the leader.

elasticmachine commented 5 years ago

Pinging @elastic/es-core-infra

elasticmachine commented 5 years ago

Pinging @elastic/es-distributed

martijnvg commented 5 years ago

@gwbrown and I started to think about a high level plan for how to make ILM work with CCR. It is quite abstract since it is unknown yet how certain primitives will be implemented in CCR.

The high level plan is built on the assumption that ILM is currently targeted at time-series use cases, in which there is a period where documents are indexed (the write period), followed by a period where no documents are indexed or updated ever again (the read-only period). The other assumption is that each cluster has its own ILM policies. A leader cluster shouldn't propagate actions / steps to a follower cluster and vice versa.

CCR will add a mechanism to indicate in the leader cluster that a leader index is being followed by one or more indices from another cluster. A follower cluster should indicate this prior to following a leader index in the leader cluster.

CCR will also add a mechanism for ILM to specify that a leader index is “done indexing” and will be read-only thereafter - that is, it has left the write period and entered the read-only period. The fact that a leader index is read-only is also replicated to the follower index. Once a leader index has been marked as “done indexing”, follower clusters will continue following until they have replicated all updates, then automatically unfollow (1). CCR in the following cluster will then indicate to the leader cluster that this follower index no longer follows the leader index.

On the leader cluster:

On the follower cluster:

1: This would be an operation that: pauses index following, closes, unfollows and then opens the index.
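The four-step operation in the footnote can be sketched as an ordered list of requests. The endpoint paths below follow the CCR and index APIs as they appear in released Elasticsearch versions, which is an assumption relative to this comment, since the CCR primitives were still undecided when it was written. The function name is hypothetical and nothing is actually sent.

```python
from typing import List, Tuple


def unfollow_requests(follower_index: str) -> List[Tuple[str, str]]:
    """Return the (method, path) calls that turn a follower into a regular index.

    Order matters: following must be paused and the index closed
    before it can be unfollowed, and it is reopened afterwards.
    """
    return [
        ("POST", f"/{follower_index}/_ccr/pause_follow"),
        ("POST", f"/{follower_index}/_close"),
        ("POST", f"/{follower_index}/_ccr/unfollow"),
        ("POST", f"/{follower_index}/_open"),
    ]


# Hypothetical follower index name.
steps = unfollow_requests("test-000001")
```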

martijnvg commented 5 years ago

We discussed yesterday in more detail what the read-only attribute is, in order to let ILM safely perform destructive operations on the follower side.

ILM can use the readonly action to set the index.blocks.write index setting. If that index is being followed, then CCR will automatically replicate that to the corresponding follower index in a follower cluster. ILM in a following cluster can then automatically unfollow a follower index if the index.blocks.write index setting has been set and the global checkpoints of the shards in the follower index are equal to the global checkpoints in the corresponding leader index. When the follower index has been unfollowed, ILM is allowed to execute destructive operations on this index (shrink or delete action), since it has become a regular index. ILM should not perform destructive operations on follower indices (this can be checked by checking custom metadata in IndexMetaData).
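The unfollow condition described here can be sketched as a pure function over already-collected values. The function name and the input shapes (a flat settings mapping, and per-shard global checkpoints keyed by shard id) are hypothetical; how ILM would actually obtain the leader's checkpoints is not specified in this comment.

```python
from typing import Mapping


def ready_to_unfollow(
    follower_settings: Mapping[str, str],
    leader_checkpoints: Mapping[int, int],
    follower_checkpoints: Mapping[int, int],
) -> bool:
    """True once the write block is replicated and every shard has caught up."""
    # The replicated write block signals that the leader is done indexing.
    write_block = str(follower_settings.get("index.blocks.write", "false")).lower()
    if write_block != "true":
        return False
    # Both sides must report the same set of shards.
    if set(leader_checkpoints) != set(follower_checkpoints):
        return False
    # Every follower shard must have reached the leader's global checkpoint.
    return all(
        follower_checkpoints[shard] == leader_checkpoints[shard]
        for shard in leader_checkpoints
    )


# Hypothetical two-shard index where shard 1 is still behind.
settings = {"index.blocks.write": "true"}
behind = ready_to_unfollow(settings, {0: 100, 1: 250}, {0: 100, 1: 240})
caught_up = ready_to_unfollow(settings, {0: 100, 1: 250}, {0: 100, 1: 250})
```

Checking both conditions together is what prevents a follower from detaching while it is still missing operations, which is the failure mode behind Problem 2.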

This part of the CCR ILM integration can already be implemented, so we should start soon.

colings86 commented 5 years ago

We discussed using the index.blocks.write index setting before and said that it's not a good idea to use this setting and that we should have a different setting for this. This is because users may set the write block themselves for other reasons and then unset it later intending to keep indexing into that index. Therefore we cannot be sure this setting means the indexing is complete for that index. I think we should do this on a separate setting. I opened https://github.com/elastic/elasticsearch/issues/35944 this morning, which is basically the same mechanism but using a different setting. It would solve this part of the CCR ILM issue and also be good for other use cases too. We can also make sure that once this new setting is set it cannot be unset, which I think is important for these cases.

colings86 commented 5 years ago

ILM should not perform destructive operations on follower indices (this can be checked by checking custom metadata in IndexMetaData).

Is this saying that follower indexes should never run shrink or delete? If so I don't think this restriction is necessary since the index will no longer be following by the time it gets to the warm phase

EDIT: actually I see what this means now. We should not perform destructive operations if the follower index is still following. In this case we should go to the ERROR step. In the case that things have worked as expected the index would not be a follower index anymore but a regular index so would be fine to run destructive operations

martijnvg commented 5 years ago

This is because users may set the write block themselves for other reasons and then unset it later intending to keep indexing into that index.

This was discussed yesterday and the conclusion was that this is also a problem in other scenarios. Something like a readonly API should be built that turns an index into a read-only index forever. But that should be tackled outside CCR / ILM.

Is this saying that follower indexes should never run shrink or delete?

Well not when it is actively following a leader index. The index would first need to be unfollowed before that could be done.

martijnvg commented 5 years ago

actually I see what this means now. We should not perform destructive operations if the follower index is still following. In this case we should go to the ERROR step. In the case that things have worked as expected the index would not be a follower index anymore but a regular index so would be fine to run destructive operations

:+1:

colings86 commented 5 years ago

This was discussed yesterday and the conclusion was that this is also a problem in other scenarios. Something like a readonly API should be built that turns an index into a read-only index forever. But that should be tackled outside CCR / ILM.

I agree that we should build a readonly API that makes the index readonly forever, but I don't think using the write block is a good idea in the interim. Users may already be used to using the write block as part of maintenance workflows. If the index reacts to that setting at a time that the user does not expect, then the results are pretty bad: the index will un-follow the leader when it's up to date, meaning it is no longer going to be a copy of the leader. I think we should use something other than the write block.

martijnvg commented 5 years ago

I see, that makes sense. So instead of relying on index.blocks.write, ILM on a follower cluster would rely on something like index.lifecycle.index_complete. Should this new index setting also control whether a write block is set? Because otherwise documents can still be indexed while this setting is set.

colings86 commented 5 years ago

Should this new index setting also control whether a write block is set? Because otherwise documents can still be indexed while this setting is set.

This is definitely up for discussion, but I think we need something that has a clear intention and meaning that the index is not intended to be written to again.

gwbrown commented 5 years ago

In the ILM sync, we decided that we're going to move ahead with a solution as proposed in #35944 (index_complete setting or similar) as it not only addresses the issue with following indices, but also other issues we're facing around Beats integration with ILM. More detail will be added to that issue shortly.

bleskes commented 5 years ago

actually I see what this means now. We should not perform destructive operations if the follower index is still following. In this case we should go to the ERROR step. In the case that things have worked as expected the index would not be a follower index anymore but a regular index so would be fine to run destructive operations

Small comment here - I was thinking that we add a post rollover/pre-shrink wait condition that waits for the index to have no followers. It may take a bit for all the followers to catch up with the leader index and unsubscribe.

colings86 commented 5 years ago

++ I was thinking the same thanks for commenting on it explicitly

gwbrown commented 5 years ago

@martijnvg and I discussed this briefly on Slack, and while the high-level concepts for handling ILM and CCR are sound, there are a few options for how we can handle the details of doing so, particularly in the interface that will be used for writing ILM policies for indices which interact with CCR.

Option 1: Add an "unfollow_when_ready" action

This is the simplest option in that it keeps the logic for unfollowing an index contained to one new explicit action. This would likely take the form of a new Hot phase action which could be used in place of Rollover, which would move to the Error step if applied to an index that is not a follower. This would require checks in certain actions (Shrink, possibly Readonly and ForceMerge) to verify that the index is not currently following as we would not be guaranteed to be able to safely perform those operations otherwise, and may involve the Rollover action verifying that it is not being used on a following index

Pros:

Cons:

Option 2: Unfollow automatically in the Hot phase or as part of Rollover

This option would either add steps to the Rollover action or inject an action into the Hot phase to automatically unfollow an index once the leader has signaled that indexing is complete. This would allow policies to be more easily reused between leader and follower clusters, and may be more intuitive for the user: once indexing is complete, the follower would automatically decouple itself from the leader and each would proceed with its policy completely independently.

Pros:

Cons:

Option 3: Unfollow automatically before dangerous actions

This option is similar to Option 2, but would perform unfollowing if and only if an action which is unsafe to perform on a following index is specified in the policy, immediately before performing the operation. This would be implemented by adding several steps to each action which cannot be safely performed on a following index (Shrink, possibly Readonly and ForceMerge) to automatically unfollow the index before performing the action.

Pros:

Cons:


Currently, my personal preference is for Option 2, though I think I could easily be swayed to Option 1. I think Option 3 is "too magical" - it's difficult to explain, less predictable, and adds a lot of complexity to the code.

colings86 commented 5 years ago

My preferences largely follow @gwbrown's.

Option 3 is quite hard to explain to a user and would add a lot of complexity to the code. Also, I think for most use cases the user will not want to move to the warm phase until replication has finished, meaning that every action would need to first unfollow, and that's essentially option 2 anyway.

I think Option 1 will lead to frustration for users. I'm not sure users will see "Requires different policies to be used on the leader and on the follower" as a pro and will find it hard to understand why the policies need to be different between the leader and the follower.

My preference is therefore for option 2. Additionally I think it might be better to have an implicit action rather than having the logic built into the rollover action. My reasoning here is:

talevy commented 5 years ago

I agree, option 2 seems to be the clearest. We have a history of injecting implicit actions, and that caused some confusion. If we go with injecting an unfollow action after the rollover action, then it would be nice if we showed it in the stored policy, or at least in the Explain API's phase info. I am concerned this will potentially cause other confusion if we disallow explicitly defining this action when PUTing a policy, but I think it should show itself in the Explain API.

gwbrown commented 5 years ago

After talking with @talevy and @martijnvg a bit, I'd like to have a real-time discussion about which option we should go with when people are available to do so.

Additionally, @martijnvg uncovered another thing we'll have to make a decision about: CCR does not respect index templates when creating follower indices, so that approach for setting policies on new indices won't work for follower indices. There are a couple ways we could handle this, which aren't necessarily mutually exclusive:

  1. Simply keep index.lifecycle.name as a setting that is managed by CCR and copied from the leader to the follower. This requires the policies on the leader and the follower to have the same name, although the policies themselves could be different.
  2. Add a parameter to the CCR APIs/auto-follow patterns that allows specifying a policy name for the follower index.

This is also something I think we should discuss as a team when everyone is available again. I don't think this blocks any work, as we can test for the moment assuming Option 1 very easily, but we do need to make a final decision before shipping.

gwbrown commented 5 years ago

Following a Zoom discussion with @jakelandis, @martijnvg, @dakrone, and @talevy, we have come to a decision on the above questions.

Regarding how to specify the Unfollow action: In order to give the user maximum flexibility while also maintaining ease of use, the Unfollow action will be available as an explicit action in the Hot, Warm, and Cold phases, and will automatically run before the Shrink action (and in the future, any other actions which require it) [edit: and the Rollover action, see below]. If the index is not a follower index, Unfollow is a no-op, so we do not have concerns about this impacting non-follower indices or problems with policies that specify the Unfollow action multiple times.
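A rough sketch of how that decision behaves, treating a phase's actions as a simple ordered list purely for illustration (ILM's real action ordering and step machinery are not modelled here). All names are hypothetical. Unfollow is injected before any action that requires it unless it already directly precedes that action, and running it on a non-follower does nothing.

```python
from typing import Callable, List

# Actions that must not run while the index is still following a leader.
REQUIRES_UNFOLLOW = {"shrink", "rollover"}


def expand_actions(actions: List[str]) -> List[str]:
    """Insert an implicit 'unfollow' before each action that requires it."""
    expanded: List[str] = []
    for action in actions:
        already_there = bool(expanded) and expanded[-1] == "unfollow"
        if action in REQUIRES_UNFOLLOW and not already_there:
            expanded.append("unfollow")
        expanded.append(action)
    return expanded


def run_unfollow(is_follower: bool, unfollow: Callable[[], None]) -> bool:
    """Run the unfollow callback only for follower indices; otherwise no-op."""
    if not is_follower:
        return False
    unfollow()
    return True


# An explicit unfollow in the policy is kept and not duplicated.
explicit = expand_actions(["unfollow", "shrink"])
# Without one, it is injected right before shrink.
implicit = expand_actions(["allocate", "forcemerge", "shrink"])
```

Because the action is a no-op on regular indices, the same expanded policy can be applied on both the leader and the follower cluster, which is the reuse argument made for running it before Rollover as well.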

Regarding policy names on follower indices: We are going to require follower indices to have the same policy name as their leader index for now, while keeping in mind the possibility to add the ability to specify a different policy name to the CCR APIs if and when we determine that this is a needed feature. This would be a non-breaking change and could be easily added at that point.

jakelandis commented 5 years ago

and will automatically run before the Shrink action (and in the future, any other actions which require it)

IIRC we also discussed having it automatically run before the Rollover action too. If this is not the case we should revisit this part of the discussion. (also +1 on explicit + implicit unfollow)

gwbrown commented 5 years ago

Ah, yes, you are correct, I just forgot to write it. Doing that on rollover as well allows for reuse of policies between leader and follower, but not doing it automatically in the Hot phase gives the flexibility to control when in the lifecycle the follower is decoupled.

gwbrown commented 5 years ago

All of the tasks listed above are complete & backported to 6.x, which means the outstanding concerns we have around ILM and CCR operating on the same indices have been addressed.

gwbrown commented 5 years ago

Given a recent discovery, I'm reopening this until work on https://github.com/elastic/elasticsearch/issues/37165 has progressed far enough that this item is complete: integrate shard history retention leases with cross-cluster replication

Until then, the work already done on ILM to utilize shard history retention leases is effectively a no-op.