Open kingherc opened 2 hours ago
Pinging @elastic/es-distributed (Team:Distributed)
I'm unsure whether this means onFailure might be called twice, and whether that would have any meaningful negative repercussions, so I'm assigning low risk for now, but I will try to handle it now.
Found that exceptions thrown from inside ReplicationOperation.execute() can escape:
1> org.elasticsearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] operation only allowed when not closed
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:2411) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:2405) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.getReplicationGroup(IndexShard.java:3004) ~[main/:?]
1> at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.getReplicationGroup(TransportReplicationAction.java:1204) ~[main/:?]
1> at org.elasticsearch.action.support.replication.ReplicationOperation.checkActiveShardCount(ReplicationOperation.java:482) ~[main/:?]
1> at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:118) ~[main/:?]
1> at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.runWithPrimaryShardReference(TransportReplicationAction.java:538) ~[main/:?]
1> at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.lambda$doRun$0(TransportReplicationAction.java:443) ~[main/:?]
1> at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.lambda$wrapPrimaryOperationPermitListener$38(IndexShard.java:3585) ~[main/:?]
1> at org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:219) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.lambda$asyncBlockOperations$39(IndexShard.java:3597) ~[main/:?]
1> at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257) ~[main/:?]
1> at org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:400) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShardOperationPermits$1.doRun(IndexShardOperationPermits.java:119) ~[main/:?]
1> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023) ~[main/:?]
1> at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[main/:?]
1> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
1> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
1> at java.lang.Thread.run(Thread.java:1575) ~[?:?]
Will open a fix.
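For illustration only, here is a minimal sketch of one way to keep an exception from escaping a call like execute() while making sure the listener is completed at most once. It is not the actual fix and does not use Elasticsearch classes: Listener, notifyOnce, and runGuarded are hypothetical stand-ins defined in the snippet itself.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Sketch only: every name here is hypothetical and not an Elasticsearch API.
class NotifyOnceSketch {

    /** Minimal stand-in for a response/failure listener. */
    interface Listener<T> {
        void onResponse(T response);

        void onFailure(Exception e);
    }

    /** Wraps a listener so that only the first completion (response or failure) is delivered. */
    static <T> Listener<T> notifyOnce(Listener<T> delegate) {
        AtomicBoolean done = new AtomicBoolean();
        return new Listener<T>() {
            @Override
            public void onResponse(T response) {
                if (done.compareAndSet(false, true)) {
                    delegate.onResponse(response);
                }
            }

            @Override
            public void onFailure(Exception e) {
                if (done.compareAndSet(false, true)) {
                    delegate.onFailure(e);
                }
            }
        };
    }

    /**
     * Runs the operation and routes any exception it throws synchronously to the
     * listener instead of letting it escape to the caller. Because the listener is
     * wrapped with notifyOnce, an operation that already completed the listener
     * before throwing does not cause a second notification.
     */
    static <T> void runGuarded(Consumer<Listener<T>> operation, Listener<T> listener) {
        Listener<T> once = notifyOnce(listener);
        try {
            operation.accept(once);
        } catch (Exception e) {
            once.onFailure(e);
        }
    }
}
```

With this shape, an operation that calls onFailure and then throws (the double-notification case I was worried about) still results in a single onFailure on the original listener. Whether swallowing the second failure is acceptable, or whether it should at least be logged, is a separate decision.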
CI Link
https://gradle-enterprise.elastic.co/s/uz5p3xwiwfsqq
Repro line
./gradlew ":server:internalClusterTest" --tests "org.elasticsearch.blocks.SimpleBlocksIT.testAddBlockWhileDeletingIndices" -Dtests.seed=18DD19966E2CF499 -Dtests.locale=dyo-SN -Dtests.timezone=Asia/Ust-Nera -Druntime.java=23
Does it reproduce?
Didn't try
Applicable branches
main
Failure history
No response
Failure excerpt
Likely introduced by PR https://github.com/elastic/elasticsearch/pull/115341