elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
1.12k stars 24.83k forks source link

CCR automatically replicates all x-pack index settings #51972

Open jasontedor opened 4 years ago

jasontedor commented 4 years ago

CCR is deliberate about which index settings it copies:

https://github.com/elastic/elasticsearch/blob/da20957e815d9370142738be9e402727f36b9857/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/TransportResumeFollowAction.java#L455-L460

In fact, there is a test to ensure that if any index setting is added, the setting is either deliberately marked as replicated, or is added to the list of non-replicated settings:

https://github.com/elastic/elasticsearch/blob/698b1e97e78cdec26f1b2f971dd0d5e79e8ac47e/x-pack/plugin/ccr/src/test/java/org/elasticsearch/xpack/ccr/action/TransportResumeFollowActionTests.java#L198-L222

Alas, there is some oversight here which relates to index settings registered in x-pack (e.g., ILM index-level settings such as the index policy). This means that today, all such settings are replicated from the leader to the follower. This can lead to issues on the follower (e.g., explicit issues such as if the ILM policy does not exist in the local cluster, or implicit issues such as if the ILM policy does exist in the local cluster, but it's not the one that user wanted to have used on the follower index).

elasticmachine commented 4 years ago

Pinging @elastic/es-distributed (:Distributed/CCR)

wangkhc commented 4 years ago

@jasontedor So one solution here is to be more deliberately on x-pack related settings? Which means to classify x-pack settings into replicated-settings and non-replicated-settings?

jasontedor commented 4 years ago

@wangkhc Indeed, it means classifying the existing index settings, and writing a test that ensures that every future setting is classified one way or the other.

zacharymorn commented 4 years ago

Is this being worked upon by anyone? I’m pretty new to CCR but I can do some readings and pick this up if it’s still available.

jasontedor commented 4 years ago

@zacharymorn I am not aware of this being on worked on by anyone.

zacharymorn commented 4 years ago

Cool I’ll give it a try.

zacharymorn commented 4 years ago

I read the documentation on CCR and believe I get the gist of it. For the scope of this fix, I'm guessing is to include x-pack setting registry in the check like such for future proof:

for (Setting<?> setting : IndexScopedSettings.BUILT_IN_INDEX_SETTINGS + X_PACK_INDEX_SETTINGS) { 
         if (setting.isDynamic()) { 
             boolean notReplicated = TransportResumeFollowAction.NON_REPLICATED_SETTINGS.contains(setting); 
             boolean replicated = replicatedSettings.contains(setting); 
             assertThat("setting [" + setting.getKey() + "] is not classified as replicated or not replicated", 
                 notReplicated ^ replicated, is(true)); 
         } 
     } 

and register current x-pack settings in X_PACK_INDEX_SETTINGS, as well as the non-replicated ones into TransportResumeFollowAction.NON_REPLICATED_SETTINGS?

If this looks good to you, I can get started coding. I may need some specification for which x-pack settings should be added to the registry and the non-replicated list if such specification is not available yet though.

jasontedor commented 4 years ago

@zacharymorn Yes, you have the right idea!

There is one complication, which is that index settings from x-pack can live in different sub-projects of x-pack, so then there is an interesting question of how to get those settings registered over to CCR....

By the way, my general approach on guiding someone through some work it to be hands off, letting them have the joy of solving problems and figuring things out on their own. If at any point you want more guidance, please do let me know, I'm happy to do so!

I may need some specification for which x-pack settings should be added to the registry and the non-replicated list if such specification is not available yet though.

Sure, I am happy to help with this once we have a list in our hands that we can use.

zacharymorn commented 4 years ago

@jasontedor Cool sounds good! I will take a crack at it then and open a PR when there's enough code for further discussion / clarification. Thanks for the confirmation!

zacharymorn commented 4 years ago

@jasontedor I've opened a PR that implemented the idea above. Could you please take a look and let me know if it looks good to you? If so, may I get the specification of which x-pack index settings should be excluded from CCR?

zacharymorn commented 4 years ago

@jasontedor Is this issue still relevant? I had a PR opened in March earlier and closed in Aug due to merge conflict & I happened to be away. If this still needs to be worked upon, I can pick up the changes.

tlrx commented 2 years ago

I think this issue is still relevant: we don't have a way today to know if a settings must be replicated or ignored when CCR bootstrap or update a follower index. Note that Auto-Follow patterns also allow to update index settings that are not filtered out today (see #75428)

I think we could have a property on settings similar to the NotCopyableOnResize that already exists could help avoiding most of situations where a follower index fails to be created/updated because a setting should not be replicated.

kingherc commented 2 years ago

@tlrx 's suggestion of putting a property on settings is nice! And if it is done like that, it should make the list at https://github.com/elastic/elasticsearch/blob/v8.3.3/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/TransportResumeFollowAction.java#L433 obsolete. The list of properties that are not replicated with CCR should be properly documented so that users are not surprised.