opensearch-project / cross-cluster-replication

Synchronize your data across multiple clusters for lower latencies and higher availability
https://opensearch.org/docs/latest/replication-plugin/index/
Apache License 2.0

[BUG] Cross cluster replication fails to allocate shard on follower cluster #1465

Open borutlukic opened 1 week ago

borutlukic commented 1 week ago

What is the bug? Replication does not start. Shard fails to allocate.

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Start replication on the follower cluster:

     PUT _plugins/_replication/proxy-2024.11/_start
     {
       "leader_alias": "main-cluster",
       "leader_index": "proxy-2024.11",
       "use_roles": {
         "leader_cluster_role": "all_access",
         "follower_cluster_role": "all_access"
       }
     }

  2. Check the replication status and see the error:

     GET _plugins/_replication/proxy-2024.11/_status

     {
       "status": "FAILED",
       "reason": "",
       "leader_alias": "prod-mon-elk-muc",
       "leader_index": "proxy-2024.11",
       "follower_index": "proxy-2024.11"
     }
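To see why the shard stays unassigned on the follower, the standard allocation explain API can also be queried on the follower cluster (shard 0 / primary shown here only as an example; adjust to the shard that fails):

     GET _cluster/allocation/explain
     {
       "index": "proxy-2024.11",
       "shard": 0,
       "primary": true
     }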

What is the expected behavior? Replication should start

What is your host/environment?

Do you have any screenshots? N/A

Do you have any additional context? The OpenSearch logs contain lots of Java stack traces, but they all end with java.lang.IllegalStateException: confined:

Example stack log:

[2024-11-25T16:51:05,128][ERROR][o.o.r.r.RemoteClusterRepository] [opensearch-node-114] Restore of [proxy-2024.11][0] failed due to java.lang.IllegalStateException: confined
    at org.apache.lucene.store.MemorySegmentIndexInput.ensureAccessible(MemorySegmentIndexInput.java:103)
    at org.apache.lucene.store.MemorySegmentIndexInput.buildSlice(MemorySegmentIndexInput.java:461)
    at org.apache.lucene.store.MemorySegmentIndexInput.clone(MemorySegmentIndexInput.java:425)
    at org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl.clone(MemorySegmentIndexInput.java:530)
    at org.opensearch.replication.repository.RestoreContext.openInput(RestoreContext.kt:39)
    at org.opensearch.replication.repository.RemoteClusterRestoreLeaderService.openInputStream(RemoteClusterRestoreLeaderService.kt:76)
    at org.opensearch.replication.action.repository.TransportGetFileChunkAction$shardOperation$1.invoke(TransportGetFileChunkAction.kt:59)
    at org.opensearch.replication.action.repository.TransportGetFileChunkAction$shardOperation$1.invoke(TransportGetFileChunkAction.kt:57)
    at org.opensearch.replication.util.ExtensionsKt.performOp(Extensions.kt:55)
    at org.opensearch.replication.util.ExtensionsKt.performOp$default(Extensions.kt:52)
    at org.opensearch.replication.action.repository.TransportGetFileChunkAction.shardOperation(TransportGetFileChunkAction.kt:57)
    at org.opensearch.replication.action.repository.TransportGetFileChunkAction.shardOperation(TransportGetFileChunkAction.kt:33)
    at org.opensearch.action.support.single.shard.TransportSingleShardAction.lambda$asyncShardOperation$0(TransportSingleShardAction.java:131)
    at org.opensearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:74)
    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:89)
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1005)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.lang.Thread.run(Thread.java:1583)

Followed by:

[2024-11-25T16:51:05,135][ERROR][o.o.r.r.RemoteClusterRepository] [opensearch-node-114] Releasing leader resource failed due to NotSerializableExceptionWrapper[wrong_thread_exception: Attempted access outside owning thread]
    at jdk.internal.foreign.MemorySessionImpl.wrongThread(MemorySessionImpl.java:315)
    at jdk.internal.misc.ScopedMemoryAccess$ScopedAccessError.newRuntimeException(ScopedMemoryAccess.java:113)
    at jdk.internal.foreign.MemorySessionImpl.checkValidState(MemorySessionImpl.java:219)
    at jdk.internal.foreign.ConfinedSession.justClose(ConfinedSession.java:83)
    at jdk.internal.foreign.MemorySessionImpl.close(MemorySessionImpl.java:242)
    at jdk.internal.foreign.MemorySessionImpl$1.close(MemorySessionImpl.java:88)
    at org.apache.lucene.store.MemorySegmentIndexInput.close(MemorySegmentIndexInput.java:514)
    at org.opensearch.replication.repository.RestoreContext.close(RestoreContext.kt:52)
    at org.opensearch.replication.repository.RemoteClusterRestoreLeaderService.removeLeaderClusterRestore(RemoteClusterRestoreLeaderService.kt:142)
    at org.opensearch.replication.action.repository.TransportReleaseLeaderResourcesAction.shardOperation(TransportReleaseLeaderResourcesAction.kt:48)
    at org.opensearch.replication.action.repository.TransportReleaseLeaderResourcesAction.shardOperation(TransportReleaseLeaderResourcesAction.kt:31)
    at org.opensearch.action.support.single.shard.TransportSingleShardAction.lambda$asyncShardOperation$0(TransportSingleShardAction.java:131)
    at org.opensearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:74)
    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:89)
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1005)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.lang.Thread.run(Thread.java:1583)
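If I read these traces correctly, the "confined" message comes from the JDK's thread-confined memory sessions that back Lucene's MemorySegmentIndexInput: a segment opened in a confined session may only be accessed or closed by the thread that created it. A minimal JDK-only sketch (not plugin code; assumes JDK 22+, where java.lang.foreign is a final API, earlier JDKs need --enable-preview) reproduces the same "Attempted access outside owning thread" failure:

import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class ConfinedAccessDemo {
    public static void main(String[] args) throws InterruptedException {
        // A confined arena ties its memory segments to the thread that created it.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(ValueLayout.JAVA_LONG);
            segment.set(ValueLayout.JAVA_LONG, 0, 42L); // fine: owning thread

            Thread other = new Thread(() -> {
                // Throws java.lang.WrongThreadException:
                // "Attempted access outside owning thread"
                segment.get(ValueLayout.JAVA_LONG, 0);
            });
            other.start();
            other.join();
        }
    }
}

That would be consistent with the errors above, where the restore opens the file on one thread but the chunk requests and the final release are handled on other threadpool threads.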

borutlukic commented 1 week ago

Setting:

"plugins.replication.follower.index.recovery.chunk_size": "1gb",
"plugins.replication.follower.index.recovery.max_concurrent_file_chunks": "1"

This seems to fix the issue. It appears that if the files on the primary (leader) cluster are too large, replication fails to start unless recovery.chunk_size is big enough to transfer each file in one go.
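For reference, the two settings above can be applied on the follower cluster through the cluster settings API (assuming a persistent update is appropriate for your setup):

PUT _cluster/settings
{
  "persistent": {
    "plugins.replication.follower.index.recovery.chunk_size": "1gb",
    "plugins.replication.follower.index.recovery.max_concurrent_file_chunks": "1"
  }
}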

borutlukic commented 5 days ago

It appears that by setting 'plugins.replication.follower.index.recovery.chunk_size' to the maximum (which is 1gb), I can get all but one index to replicate. Would it be possible to raise the limit above 1gb? There seems to be something strange happening when a file has to be transferred in multiple chunks: any chunk with offset > 0 fails to transfer with 'java.lang.IllegalStateException: confined'.