opensearch-project / cross-cluster-replication

Synchronize your data across multiple clusters for lower latencies and higher availability
https://opensearch.org/docs/latest/replication-plugin/index/
Apache License 2.0

[BUG] Cross Site Replication Fails to Replicate Restored Data on Production #1336

Closed zalseryani closed 6 months ago

zalseryani commented 6 months ago

Describing The Case.

Is there a solution for this? I noticed that Elasticsearch has a _ccr API which allows you to update the follower index checkpoint. Or do I need to stop replication between prod and DR, delete the follower index, and re-trigger index replication between the leader index (in the prod cluster) and the follower index (in the DR cluster)? What would the impact be if the prod/leader index holds a large amount of data?

If there is another best-practice way to fix such a case, which might occur in production, kindly advise. Much respect.

Related component

Indexing:Replication

To Reproduce

Expected behavior

Additional Details

OpenSearch version: 2.11.1. The OpenSearch prod and DR sites are deployed on Kubernetes.

mgodwan commented 6 months ago

@opensearch-project/engineering-effectiveness Can you help move this issue to github.com/opensearch-project/cross-cluster-replication?

ankitkala commented 6 months ago

Hi, restoring a snapshot breaks data consistency between the leader and follower domains. The only way to mitigate this is to stop replication, delete the follower index, and start replication again.
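
The stop, delete, and restart sequence described above can be sketched as three REST calls sent to the follower (DR) cluster. This is a minimal sketch, assuming the replication plugin's `_stop` and `_start` endpoints under `/_plugins/_replication/`. The index names, connection alias, and base URL are hypothetical placeholders, and authentication and TLS handling are left out.

```python
import json
import urllib.request


def build_resync_plan(follower_index, leader_index, leader_alias, use_roles=None):
    """Return the ordered (method, path, body) calls that re-seed a follower index.

    All names passed in are placeholders chosen by the caller; nothing here
    is taken from the cluster described in this issue.
    """
    start_body = {"leader_alias": leader_alias, "leader_index": leader_index}
    if use_roles is not None:
        # Only relevant when the security plugin is enabled on the clusters.
        start_body["use_roles"] = use_roles
    return [
        # 1. Stop replication; the follower becomes a standard index.
        ("POST", f"/_plugins/_replication/{follower_index}/_stop", {}),
        # 2. Delete the now-inconsistent follower index.
        ("DELETE", f"/{follower_index}", None),
        # 3. Start replication again so the follower is rebuilt from the leader.
        ("PUT", f"/_plugins/_replication/{follower_index}/_start", start_body),
    ]


def run_plan(base_url, plan):
    """Send each step to the follower cluster (no auth or TLS setup: assumption)."""
    for method, path, body in plan:
        data = None if body is None else json.dumps(body).encode("utf-8")
        request = urllib.request.Request(
            base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            print(method, path, response.status)


if __name__ == "__main__":
    # Hypothetical names; print the plan instead of calling a live cluster.
    for step in build_resync_plan("follower-01", "leader-01", "prod-connection"):
        print(step)
```

Because the follower index is deleted and rebuilt, the time and network cost of the restart likely scale with the size of the leader index, so it may be worth scheduling this for a quiet period.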