elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Retention lease sync actions should not trip circuit breakers #105926

Closed DaveCTurner closed 2 weeks ago

DaveCTurner commented 6 months ago

Today RetentionLeaseSyncAction and RetentionLeaseBackgroundSyncAction are both subject to circuit-breaker and indexing-pressure checks, and in particular can fail the ~primary~ replica if such a request were to fail. However, these actions don't really need any resources, so it seems better to me not to fail the ~primary~ replica in this situation.

~Specifically, I think we should set forceExecutionOnPrimary when calling the superclass constructor (which overrides the indexing pressure checks too, see org.elasticsearch.action.support.replication.TransportWriteAction#force).~

(edit: s/primary/replica)

elasticsearchmachine commented 6 months ago

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner commented 1 month ago

Correcting my previous correction: although the circuit breaker may fail a replica, it looks like the indexing pressure check may fail a primary too.

org.elasticsearch.transport.RemoteTransportException: [REDACTED][indices:admin/seq_no/retention_lease_sync[p]]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of primary operation [coordinating_and_primary_bytes=2530540854, replica_bytes=2362878422, all_bytes=4893419276, primary_operation_bytes=0, max_coordinating_and_primary_bytes=3189348761]
    at org.elasticsearch.index.IndexingPressure.markPrimaryOperationStarted(IndexingPressure.java:136) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.replication.TransportWriteAction.checkPrimaryLimits(TransportWriteAction.java:141) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.replication.TransportWriteAction.checkPrimaryLimits(TransportWriteAction.java:54) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.replication.TransportReplicationAction.handlePrimaryRequest(TransportReplicationAction.java:349) ~[elasticsearch-8.11.1.jar:?]
    ...
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:842) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:906) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:883) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.seqno.RetentionLeaseSyncAction.sync(RetentionLeaseSyncAction.java:117) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.seqno.RetentionLeaseSyncer.sync(RetentionLeaseSyncer.java:44) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.shard.IndexShard.lambda$new$0(IndexShard.java:370) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.seqno.ReplicationTracker.cloneRetentionLease(ReplicationTracker.java:344) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.seqno.ReplicationTracker.cloneLocalPeerRecoveryRetentionLease(ReplicationTracker.java:528) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.shard.IndexShard.cloneLocalPeerRecoveryRetentionLease(IndexShard.java:3227) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.lambda$createRetentionLease$31(RecoverySourceHandler.java:972) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.lambda$updateRetentionLease$33(RecoverySourceHandler.java:1020) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:299) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.lambda$updateRetentionLease$34(RecoverySourceHandler.java:1017) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.lambda$runUnderPrimaryPermit$19(RecoverySourceHandler.java:423) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListener.run(ActionListener.java:368) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.lambda$runUnderPrimaryPermit$20(RecoverySourceHandler.java:401) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:212) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.shard.IndexShard.lambda$wrapPrimaryOperationPermitListener$27(IndexShard.java:3405) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:212) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.shard.IndexShardOperationPermits.innerAcquire(IndexShardOperationPermits.java:254) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:202) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:3376) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:3366) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.runUnderPrimaryPermit(RecoverySourceHandler.java:401) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.updateRetentionLease(RecoverySourceHandler.java:1016) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.createRetentionLease(RecoverySourceHandler.java:960) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler$FileBasedRecoveryContext.lambda$run$4(RecoverySourceHandler.java:736) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:236) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:310) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:230) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:259) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:173) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.onCompleted(MultiChunkTransfer.java:149) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.handleItems(MultiChunkTransfer.java:113) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.MultiChunkTransfer$1.write(MultiChunkTransfer.java:72) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:97) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcessAndRelease(AsyncIOProcessor.java:85) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:73) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.addItem(MultiChunkTransfer.java:83) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.start(MultiChunkTransfer.java:79) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler.sendFiles(RecoverySourceHandler.java:1436) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler$FileBasedRecoveryContext.lambda$run$3(RecoverySourceHandler.java:727) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:236) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:310) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:230) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:259) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:173) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$DelegatingResponseActionListener.onResponse(ActionListenerImplementations.java:182) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler$SnapshotRecoverFileRequestsSender.onRequestCompletion(RecoverySourceHandler.java:890) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler$SnapshotRecoverFileRequestsSender$1.onResponse(RecoverySourceHandler.java:828) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.indices.recovery.RecoverySourceHandler$SnapshotRecoverFileRequestsSender$1.onResponse(RecoverySourceHandler.java:825) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:310) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:230) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:259) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:173) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onResponse(ActionListenerImplementations.java:298) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:95) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onResponse(ActionListenerImplementations.java:298) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.support.RetryableAction$RetryingListener.onResponse(RetryableAction.java:149) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onResponse(ActionListenerImplementations.java:298) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:49) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1411) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:433) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.transport.InboundHandler$2.doRun(InboundHandler.java:390) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.11.1.jar:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.11.1.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.lang.Thread.run(Thread.java:1583) ~[?:?]
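
To make the numbers in the rejection message easier to read, here is a minimal sketch of an admission check of this kind. It is a simplified model, not the Elasticsearch implementation: the class `IndexingPressureSketch`, its fields, and the `force` parameter are hypothetical names chosen for illustration, and the comparison of `all_bytes` against the limit is an assumption inferred from the message itself. It shows how an operation with `primary_operation_bytes=0` can still be rejected when the bytes already in flight exceed the limit, and how a force flag would let such an operation through.

```java
// Hypothetical, simplified model of an indexing-pressure admission check.
// Not the Elasticsearch code; names and structure are assumptions.
final class IndexingPressureSketch {

    private final long maxCoordinatingAndPrimaryBytes;
    private long coordinatingAndPrimaryBytes;
    private final long replicaBytes;

    IndexingPressureSketch(long maxCoordinatingAndPrimaryBytes,
                           long coordinatingAndPrimaryBytes,
                           long replicaBytes) {
        this.maxCoordinatingAndPrimaryBytes = maxCoordinatingAndPrimaryBytes;
        this.coordinatingAndPrimaryBytes = coordinatingAndPrimaryBytes;
        this.replicaBytes = replicaBytes;
    }

    /** Total bytes currently accounted for across all stages. */
    long allBytes() {
        return coordinatingAndPrimaryBytes + replicaBytes;
    }

    /**
     * Returns true if the operation is admitted. A forced operation skips
     * the limit check; an unforced one is rejected when the total would
     * exceed the limit, even if the operation itself adds zero bytes.
     */
    boolean tryStartPrimaryOperation(long operationBytes, boolean force) {
        long all = allBytes() + operationBytes;
        if (!force && all > maxCoordinatingAndPrimaryBytes) {
            return false; // rejected; nothing is added to the accounting
        }
        coordinatingAndPrimaryBytes += operationBytes;
        return true;
    }

    public static void main(String[] args) {
        // Values copied from the rejection message above.
        IndexingPressureSketch pressure =
            new IndexingPressureSketch(3189348761L, 2530540854L, 2362878422L);

        System.out.println("all_bytes = " + pressure.allBytes());
        System.out.println("unforced admitted: "
            + pressure.tryStartPrimaryOperation(0L, false));
        System.out.println("forced admitted: "
            + pressure.tryStartPrimaryOperation(0L, true));
    }
}
```

With the values from the log, `all_bytes` is 2530540854 + 2362878422 = 4893419276, which is above the limit of 3189348761, so the unforced zero-byte operation is rejected while the forced one is admitted.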