elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

SimpleBlocksIT.testAddBlockWhileDeletingIndices failing #116071

Open kingherc opened 2 hours ago

kingherc commented 2 hours ago

CI Link

https://gradle-enterprise.elastic.co/s/uz5p3xwiwfsqq

Repro line

./gradlew ":server:internalClusterTest" --tests "org.elasticsearch.blocks.SimpleBlocksIT.testAddBlockWhileDeletingIndices" -Dtests.seed=18DD19966E2CF499 -Dtests.locale=dyo-SN -Dtests.timezone=Asia/Ust-Nera -Druntime.java=23

Does it reproduce?

Didn't try

Applicable branches

main

Failure history

No response

Failure excerpt

Likely introduced by PR https://github.com/elastic/elasticsearch/pull/115341

REPRODUCE WITH: ./gradlew ":server:internalClusterTest" --tests "org.elasticsearch.blocks.SimpleBlocksIT.testAddBlockWhileDeletingIndices" -Dtests.seed=18DD19966E2CF499 -Dtests.locale=dyo-SN -Dtests.timezone=Asia/Ust-Nera -Druntime.java=23

SimpleBlocksIT > testAddBlockWhileDeletingIndices FAILED
    java.lang.AssertionError: [org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener/ChannelActionListener{TaskTransportChannel{task=160}{DirectResponseChannel{req=91}{indices:admin/block/add[s][p]}}}/org.elasticsearch.action.support.replication.TransportReplicationAction$$Lambda/0x00007f63e3a6ebc8@72916952] org.elasticsearch.ElasticsearchException: executed already
        at __randomizedtesting.SeedInfo.seed([18DD19966E2CF499]:0)
        at org.elasticsearch.action.ActionListener$3.assertFirstRun(ActionListener.java:393)
        at org.elasticsearch.action.ActionListener$3.onFailure(ActionListener.java:409)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onFailure(TransportReplicationAction.java:553)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.handleException(TransportReplicationAction.java:547)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.runWithPrimaryShardReference(TransportReplicationAction.java:541)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.lambda$doRun$0(TransportReplicationAction.java:443)
        at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257)
        at org.elasticsearch.index.shard.IndexShard.lambda$wrapPrimaryOperationPermitListener$38(IndexShard.java:3585)
        at org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:219)
        at org.elasticsearch.index.shard.IndexShard.lambda$asyncBlockOperations$39(IndexShard.java:3597)
        at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257)
        at org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:400)
        at org.elasticsearch.index.shard.IndexShardOperationPermits$1.doRun(IndexShardOperationPermits.java:119)
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1575)
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.gradle.api.internal.tasks.testing.worker.TestWorker (file:/opt/buildkite-agent/.gradle/wrapper/dists/gradle-8.10.2-all/7iv73wktx1xtkvlq19urqw1wm/gradle-8.10.2/lib/plugins/gradle-testing-base-infrastructure-8.10.2.jar)
WARNING: Please consider reporting this to the maintainers of org.gradle.api.internal.tasks.testing.worker.TestWorker
WARNING: System::setSecurityManager will be removed in a future release

elasticsearchmachine commented 2 hours ago

Pinging @elastic/es-distributed (Team:Distributed)

kingherc commented 1 hour ago

I'm unsure whether this means onFailure can be called twice, and whether that would have any meaningful negative repercussions, so I'm assigning low risk for now, but I will try to handle it now.

kingherc commented 1 hour ago

Found that exceptions thrown from inside execute() can escape:

  1> org.elasticsearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] operation only allowed when not closed
  1>    at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:2411) ~[main/:?]
  1>    at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:2405) ~[main/:?]
  1>    at org.elasticsearch.index.shard.IndexShard.getReplicationGroup(IndexShard.java:3004) ~[main/:?]
  1>    at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.getReplicationGroup(TransportReplicationAction.java:1204) ~[main/:?]
  1>    at org.elasticsearch.action.support.replication.ReplicationOperation.checkActiveShardCount(ReplicationOperation.java:482) ~[main/:?]
  1>    at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:118) ~[main/:?]
  1>    at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.runWithPrimaryShardReference(TransportReplicationAction.java:538) ~[main/:?]
  1>    at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.lambda$doRun$0(TransportReplicationAction.java:443) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257) ~[main/:?]
  1>    at org.elasticsearch.index.shard.IndexShard.lambda$wrapPrimaryOperationPermitListener$38(IndexShard.java:3585) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:219) ~[main/:?]
  1>    at org.elasticsearch.index.shard.IndexShard.lambda$asyncBlockOperations$39(IndexShard.java:3597) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:400) ~[main/:?]
  1>    at org.elasticsearch.index.shard.IndexShardOperationPermits$1.doRun(IndexShardOperationPermits.java:119) ~[main/:?]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023) ~[main/:?]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
  1>    at java.lang.Thread.run(Thread.java:1575) ~[?:?]

Will open a fix.
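
For readers following along, here is a minimal, self-contained sketch of the general failure mode discussed above: a listener is completed once, an exception then escapes synchronously, and the caller's catch block completes the same listener a second time. Everything in it (`NotifyOnceSketch`, `Listener`, `NotifyOnce`, `execute`, `runWithCatch`) is a hypothetical stand-in and not Elasticsearch code. In particular, the assumption that the listener is completed before the exception escapes is made only for illustration; this thread does not establish where the first completion comes from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class NotifyOnceSketch {

    /** Hypothetical minimal listener interface; not the Elasticsearch ActionListener. */
    interface Listener {
        void onResponse(String response);
        void onFailure(Exception e);
    }

    /** Delivers only the first completion to the delegate and records any later failure. */
    static final class NotifyOnce implements Listener {
        private final Listener delegate;
        private final AtomicBoolean done = new AtomicBoolean();
        final List<Exception> dropped = new ArrayList<>();

        NotifyOnce(Listener delegate) {
            this.delegate = delegate;
        }

        @Override
        public void onResponse(String response) {
            if (done.compareAndSet(false, true)) {
                delegate.onResponse(response);
            }
        }

        @Override
        public void onFailure(Exception e) {
            if (done.compareAndSet(false, true)) {
                delegate.onFailure(e);
            } else {
                dropped.add(e); // second completion: recorded, not delivered
            }
        }
    }

    /** Assumed stand-in for an operation that completes the listener and then still throws. */
    static void execute(Listener listener) {
        listener.onFailure(new IllegalStateException("shard closed"));
        throw new IllegalStateException("escaped from execute()");
    }

    /** Caller that routes any escaped exception to the same listener. */
    static void runWithCatch(Listener listener) {
        try {
            execute(listener);
        } catch (Exception e) {
            listener.onFailure(e);
        }
    }

    /** Runs the scenario and returns what the delegate observed plus the dropped count. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        Listener recording = new Listener() {
            @Override
            public void onResponse(String response) {
                events.add("response: " + response);
            }

            @Override
            public void onFailure(Exception e) {
                events.add("failure: " + e.getMessage());
            }
        };
        NotifyOnce once = new NotifyOnce(recording);
        runWithCatch(once);
        events.add("dropped: " + once.dropped.size());
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [failure: shard closed, dropped: 1]
    }
}
```

The wrapper here silently drops the second completion, which is the opposite trade-off from an assertion such as the `assertFirstRun` frame in the trace above: an assertion makes the double completion visible in tests, whereas a guard like this one hides it. Which approach the actual fix takes is not stated in this thread.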