elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

CCR auto-follow can leave unfollowed indices during master failovers #86654

Open fcofdez opened 2 years ago

fcofdez commented 2 years ago

Today, when the cluster is unstable and master failovers occur while new leader indices match an auto-follow pattern, the follower index can end up in a state where it does not pull changes from the leader index, or where it is considered to be already following when it is not.

One scenario where this is possible is after the following index is recovered from the leader index in: https://github.com/elastic/elasticsearch/blob/c7dc89f3cd86dbc9ad11c2f831c63651053e6e4a/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/TransportPutFollowAction.java#L269-L270

Even though that ends up calling an AcknowledgedTransportMasterNodeAction, it uses the default timeout (30s), meaning that if a failure lasts longer than 30s the listener just logs the failure instead of retrying or reporting back to the auto-follow coordinator, see:

https://github.com/elastic/elasticsearch/blob/c7dc89f3cd86dbc9ad11c2f831c63651053e6e4a/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/TransportPutFollowAction.java#L244-L255
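To make the alternative to "just log the failure" concrete, here is a minimal sketch in plain Java of retrying the follow-start step and reporting back when retries are exhausted. It does not use the Elasticsearch API: `FollowStartRetry`, `startWithRetry`, `Outcome` and the `onGiveUp` callback are hypothetical names for illustration, and a real fix would need to be asynchronous and react to master changes rather than loop a fixed number of times.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not part of Elasticsearch: retries the "start following"
// step and reports the final failure instead of only logging it.
final class FollowStartRetry {

    // Hypothetical result handed back to whoever coordinates auto-following.
    enum Outcome { STARTED, GAVE_UP }

    static Outcome startWithRetry(Supplier<Boolean> resumeFollow,
                                  int maxAttempts,
                                  Consumer<Exception> onGiveUp) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (resumeFollow.get()) {
                    return Outcome.STARTED; // acknowledged, follower is running
                }
                last = new IllegalStateException("follow request was not acknowledged");
            } catch (RuntimeException e) {
                last = e; // e.g. a timeout while no master is elected
            }
        }
        // Retries exhausted: tell the caller so the index is not treated as followed.
        onGiveUp.accept(last);
        return Outcome.GAVE_UP;
    }
}
```

The point of the sketch is only the control flow: a failed start is either retried or surfaced to the coordinator, so the follower index is never silently left in a half-configured state.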

elasticmachine commented 2 years ago

Pinging @elastic/es-distributed (Team:Distributed)

henningandersen commented 2 years ago

I wonder if we should consider this a problem with auto-following or a problem with "put-follow" instead? Ideally, if the put-follow creates the index, we should also eventually start following.

fcofdez commented 2 years ago

I agree, but I suspect there are cases where we might end up trying to "auto-follow" an already followed index multiple times during master failovers. This needs some tests to prove it.

DaveCTurner commented 2 years ago

By my reading if we do a resume-follow action on a shard that's already following then it will fail with a ResourceAlreadyExistsException rather than creating a duplicate follower task. It is tricky tho, I don't know that there's a way to distinguish a shard which failed to create the initial follower task from one that was set up successfully and subsequently paused. We might need to add a flag to its index metadata and then do a single cluster state update which creates the follower task and flips the flag.
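The flag idea can be sketched as a small immutable state model in plain Java. This is an illustration under assumptions, not the actual index metadata or persistent task API: `FollowerIndexState`, `followInitialized` and `taskPresent` are hypothetical names, and `IllegalStateException` stands in for the ResourceAlreadyExistsException mentioned above. The one property it tries to show is that the flag and the follower task change together in a single transition, which is what makes "never started" distinguishable from "paused".

```java
// Hypothetical model of one follower index; not Elasticsearch code.
final class FollowerIndexState {

    enum Status { NEVER_STARTED, FOLLOWING, PAUSED }

    final boolean followInitialized; // hypothetical flag stored in index metadata
    final boolean taskPresent;       // whether a follower task currently exists

    FollowerIndexState(boolean followInitialized, boolean taskPresent) {
        this.followInitialized = followInitialized;
        this.taskPresent = taskPresent;
    }

    // State right after the index was restored from the leader: no flag, no task.
    static FollowerIndexState restoredFromLeader() {
        return new FollowerIndexState(false, false);
    }

    // Models the single cluster state update: create the task and flip the flag together.
    FollowerIndexState startFollowing() {
        if (taskPresent) {
            throw new IllegalStateException("follower task already exists");
        }
        return new FollowerIndexState(true, true);
    }

    // Pausing removes the task but keeps the flag.
    FollowerIndexState pause() {
        return new FollowerIndexState(followInitialized, false);
    }

    Status status() {
        if (taskPresent) {
            return Status.FOLLOWING;
        }
        return followInitialized ? Status.PAUSED : Status.NEVER_STARTED;
    }
}
```

With this shape, a retry after a master failover would only need to call `startFollowing()` for indices whose status is `NEVER_STARTED`, and would leave deliberately paused followers alone.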