opensearch-project / cross-cluster-replication

Synchronize your data across multiple clusters for lower latencies and higher availability
https://opensearch.org/docs/latest/replication-plugin/index/
Apache License 2.0
48 stars 58 forks source link

[Discuss] Long term plan for cross-cluster-replication #558

Closed ankitkala closed 1 year ago

ankitkala commented 2 years ago

There have been multiple discussions around cross-cluster-replication which we'd like to reconcile and present a long term plan for CCR plugin.


Full cluster replication with failover: Currently cross-cluster-replication supports only index level replication. We eventually want to build the support for full cluster replication with failover capabilities. This'd allow users to have a backup cluster ready to serve reads incase the primary cluster goes down. We don't have a clear timelines on when we'll start on this. But the intention here is to share the final desired state which will a major consideration during the design discussions.


CCR move to the core: https://github.com/opensearch-project/OpenSearch/issues/2872 There have been discussions aroung moving the CCR to the OS core (issue). Major concern with CCR as plugin was due to dependency of CCR on internal component methods(engine and translog) which are not guaranteed to the external plugin and can easily break if there are any major changes done on the core(example).


Remove CCR dependency on security: CCR in the current state can't be moved completely to the core. Primary reason is that CCR has a strong dependency on the security plugin. CCR requires multiple persistent tasks on the follower cluster which either monitor the ongoing replications(cluster/index/shard level) or fetch the changes from leader. For these tasks we rely on the FGAC role for authorization. These roles are passed by the user in API when starting the replication and we persist this info as part of Replcation metadata in system index. Here is the public documentation for the CCR security For CCR to completely move to OS core, we'll have to rethink our security model for CCR. It'd also be a breaking change for the existing customers. This'll require a dedicated effort and won't be combined with any other projects mentioned below.


Using Segment Replication for CCR: We're working on exploring Segment Replication(https://github.com/opensearch-project/OpenSearch/issues/3020) for CCR where we'd now sync segments from each replicating shard from leader cluster to follower cluster. To achieve this, we'd be extending/reusing the modules for local Segment Replication. Will add more details when we put up the proposal. We can't implement this completely in core as we will still be relying on existing APIs and persistent tasks in CCR plugin for managing the replication. However, any implementation that requires knowledge of core OS components would reside in OS core.


Utilizing Remote store & segment replication: With integration of segment replication with remote store, replicas now would be able to sync segments directly from the remote store. CCR also plans to leverage from similar setup. We'll be ensuring that the design also extends to the cross cluster usecases as well.


Logical replication and fetch from translogs: After move to segment based replication, we'll still keep supporting logical replication till the existing customers migrate to Segment Replication. Even after the migration, we might want to keep the support for logical intact while the Segment Replication becomes the default choice. Logical replication can be helpful incase we want to explore active-active replication in future. Similarly, doc level filtering in replication can be supported via the logical replication, if required.

For logical replication, the ops from leader shard can be fetched from either lucene or translogs. CCR relies on translogs as we've observed 8-10% CPU impact on leader while fetching the changes from lucene. This is because operations from lucene requires decompression(refer here and here). We'll be relying on Pluggable Translogs for fetching the operations(https://github.com/opensearch-project/cross-cluster-replication/issues/375). It should continue to work as expected with support for translogs in remote store as well. One deviation here is that currently we fetch the operations for primary as well as replicas on the leader shard. This is done for loadbalancing the requests to leader cluster which might not work with remote store where only primary shard would be able to fetch the translogs. We need to either build support for fetching the translogs from both primary & replicas. If we can't have this support incase of remote translogs, we'll need to add additional handling to always rely on primary shard in such cases.

Incase, user has opted for no durability, we can either fallback to lucene or decide to not even support logical replication is such case.

ankitkala commented 2 years ago

@Bukhtawar @nknize @mch2 @sachinpkale

Bukhtawar commented 2 years ago

Thanks @ankitkala for the doc Could you please re-structure the issue a bit to give some context to the problem, call out approaches, short term vs long term and caveats with them as applicable. Feel free to break it into multiple issues as discussions on each sub-topic might become overwhelming and the focus lost

ankitkala commented 2 years ago

Thanks @ankitkala for the doc Could you please re-structure the issue a bit to give some context to the problem, call out approaches, short term vs long term and caveats with them as applicable. Feel free to break it into multiple issues as discussions on each sub-topic might become overwhelming and the focus lost

Sure, Let me do that. I agree it might be overwhelming to discuss all the topics on this single issue. This serves mostly as a reconciliation of all the issues to provide the big picture. There already exists separate issues for most sub-topics. For areas where we haven't done any scoping(like full cluster replication, decoupling from security), we can start the issues as soon as we have some clarity around those.

Here are the ones already created for reference: CCR move to the core: https://github.com/opensearch-project/OpenSearch/issues/2872 Translog extension: https://github.com/opensearch-project/cross-cluster-replication/issues/375 Segment Replication: https://github.com/opensearch-project/OpenSearch/issues/3020 (will add more details after we finalize the proposal)

elfisher commented 2 years ago

@andrross could you help transfer this to the cross cluster replication repo? It looks like that would be a good place to track this.

andrross commented 2 years ago

@elfisher done