elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
1.22k stars 24.85k forks source link

[CI] KibanaThreadPoolIT testUserThreadPoolsAreBlocked failing with "suite timeout was reached" #107625

Closed albertzaharovits closed 3 weeks ago

albertzaharovits commented 7 months ago

A few failures since yesterday https://es-delivery-stats.elastic.dev/app/r/s/gJfOd , all with:

org.elasticsearch.kibana.KibanaThreadPoolIT > testUserThreadPoolsAreBlocked FAILED
    java.lang.Exception: Test abandoned because suite timeout was reached.

Build scan: https://gradle-enterprise.elastic.co/s/gjpjhnts54xoa/tests/:modules:kibana:internalClusterTest/org.elasticsearch.kibana.KibanaThreadPoolIT/testUserThreadPoolsAreBlocked

Reproduction line:

./gradlew ':modules:kibana:internalClusterTest' --tests "org.elasticsearch.kibana.KibanaThreadPoolIT.testUserThreadPoolsAreBlocked" -Dtests.seed=AB39E5A32B6FC95C -Dtests.locale=no -Dtests.timezone=Europe/Isle_of_Man -Druntime.java=17 -Dtests.fips.enabled=true

Applicable branches: main

Reproduces locally?: Didn't try

Failure history: Failure dashboard for org.elasticsearch.kibana.KibanaThreadPoolIT#testUserThreadPoolsAreBlocked&_a=(controlGroupInput:(chainingSystem:HIERARCHICAL,controlStyle:twoLine,ignoreParentSettings:(ignoreFilters:!f,ignoreQuery:!f,ignoreTimerange:!f,ignoreValidations:!t),panels:('0c0c9cb8-ccd2-45c6-9b13-96bac4abc542':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:task.keyword,grow:!t,id:'0c0c9cb8-ccd2-45c6-9b13-96bac4abc542',searchTechnique:wildcard,selectedOptions:!(),singleSelect:!t,title:'Gradle%20Task',width:medium),grow:!t,order:0,type:optionsListControl,width:small),'144933da-5c1b-4257-a969-7f43455a7901':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:name.keyword,grow:!t,id:'144933da-5c1b-4257-a969-7f43455a7901',searchTechnique:wildcard,selectedOptions:!('testUserThreadPoolsAreBlocked'),title:Test,width:medium),grow:!t,order:2,type:optionsListControl,width:medium),'4e6ad9d6-6fdc-4fcc-bf1a-aa6ca79e0850':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:className.keyword,grow:!t,id:'4e6ad9d6-6fdc-4fcc-bf1a-aa6ca79e0850',searchTechnique:wildcard,selectedOptions:!('org.elasticsearch.kibana.KibanaThreadPoolIT'),title:Suite,width:medium),grow:!t,order:1,type:optionsListControl,width:medium)))))

Failure excerpt:

java.lang.Exception: Test abandoned because suite timeout was reached.

  at __randomizedtesting.SeedInfo.seed([AB39E5A32B6FC95C]:0)
elasticsearchmachine commented 7 months ago

Pinging @elastic/es-core-infra (Team:Core/Infra)

albertzaharovits commented 7 months ago

ping @williamrandolph since you edited the test yesterday.

williamrandolph commented 7 months ago

I don't know that I'll have time to get to this today, but I'll get a look as soon as I can. In the meantime I'm going to go ahead and mute. https://github.com/elastic/elasticsearch/pull/107630

pgomulka commented 6 months ago

judging by the thread dump it is blocked for 19minutes on:

client().prepareSearch(USER_INDEX)
                .setQuery(QueryBuilders.matchAllQuery())
                // Request times out if max concurrent shard requests is set to 1
                .setMaxConcurrentShardRequests(usually() ? SearchRequest.DEFAULT_MAX_CONCURRENT_SHARD_REQUESTS : randomIntBetween(2, 10))
                .get()

"TEST-KibanaThreadPoolIT.testUserThreadPoolsAreBlocked-seed#[AB39E5A32B6FC95C]" ID=34 WAITING on org.elasticsearch.action.support.PlainActionFuture$Sync@7c36647d
    at java.base@17.0.2/jdk.internal.misc.Unsafe.park(Native Method)
    - waiting on org.elasticsearch.action.support.PlainActionFuture$Sync@7c36647d
    at java.base@17.0.2/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
    at java.base@17.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:715)
    at java.base@17.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1047)
    at app//org.elasticsearch.action.support.PlainActionFuture$Sync.get(PlainActionFuture.java:273)
    at app//org.elasticsearch.action.support.PlainActionFuture.get(PlainActionFuture.java:93)
    at app//org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1535)
    at app//org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1515)
    at app//org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:45)
    at app//org.elasticsearch.action.support.PlainActionFuture.actionGet(PlainActionFuture.java:152)
    at app//org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:43)
    at app//org.elasticsearch.indices.SystemIndexThreadPoolTestCase.lambda$assertThreadPoolsBlocked$3(SystemIndexThreadPoolTestCase.java:96)
    at app//org.elasticsearch.indices.SystemIndexThreadPoolTestCase$$Lambda$3815/0x000000080161f060.run(Unknown Source)
    at app//org.apache.lucene.tests.util.LuceneTestCase._expectThrows(LuceneTestCase.java:3040)
    at app//org.apache.lucene.tests.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2877)
    at app//org.apache.lucene.tests.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2868)
    at app//org.elasticsearch.indices.SystemIndexThreadPoolTestCase.assertThreadPoolsBlocked(SystemIndexThreadPoolTestCase.java:90)
    at app//org.elasticsearch.indices.SystemIndexThreadPoolTestCase$$Lambda$3783/0x000000080160ef40.run(Unknown Source)
    at app//org.elasticsearch.indices.SystemIndexThreadPoolTestCase.runWithBlockedThreadPools(SystemIndexThreadPoolTestCase.java:66)
    at app//org.elasticsearch.indices.SystemIndexThreadPoolTestCase.testUserThreadPoolsAreBlocked(SystemIndexThreadPoolTestCase.java:75)
    at java.base@17.0.2/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base@17.0.2/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base@17.0.2/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

there is also this in logs:

  1> [2024-04-24T15:39:25,570][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][3] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,570][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][0] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,570][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][2] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,570][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][1] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,671][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][4] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,673][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][5] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,675][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][6] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,677][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][7] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,726][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][8] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible
  1> [2024-04-24T15:39:25,727][WARN ][o.e.i.s.IndexShard       ] [node_s0] [user_index][9] performing expensive diagnostic checks during shard startup [index.shard.check_on_startup=true]; these checks should only be enabled temporarily, you must remove this index setting as soon as possible

I wonder if it is relevant

pgomulka commented 6 months ago

I don't think there were changes around this functionality recently. I would lower the status of this medium-risk

pgomulka commented 6 months ago

I am not sure if I got into the wrong direction here. but one way to reproduce this is to allow for the task to end up on a queue for search threadpool. so we would have to modify the fillThreadPoolQueues and run with smaller timeout ./gradlew ':modules:kibana:internalClusterTest' --tests "org.elasticsearch.kibana.KibanaThreadPoolIT" -Dtests.timeoutSuite=30000!

                long queueSize = info.getQueueSize().singles();
                queueSize = threadPoolName.equals("search") ?queueSize - 1 : queueSize;
                for (int i = 0; i < queueSize; i++) {

however I cannot confirm how this would be even possible without this change. The search on user's index should fail - be rejected by the executor, but for some reason it got stuck.

Perhaps we could add some trace logging around submitting a task to check the queue size ? SearchService#runAsync

pgomulka commented 6 months ago

there was still a failure after the refactor. https://gradle-enterprise.elastic.co/s/y7yaqmdj7d22q/tests/task/:modules:kibana:internalClusterTest/details/org.elasticsearch.kibana.KibanaThreadPoolIT/testBlockedThreadPoolsRejectUserRequests?top-execution=1 I added more debugging

pgomulka commented 6 months ago

another failure https://gradle-enterprise.elastic.co/s/tvd7ekijd24gu/tests/task/:modules:kibana:internalClusterTest/details/org.elasticsearch.kibana.KibanaThreadPoolIT/testBlockedThreadPoolsRejectUserRequests/1/output

with the additional logging I cannot see anytask to be removed. the queues were also correctly filled with tasks from the test


Thread pool details get EsThreadPoolExecutor[name = node_s0/get, queue capacity = 1, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@50982f90[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details write TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s0/write, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@30e1ebe9[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details search TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s0/search, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@1a44ef92[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details get EsThreadPoolExecutor[name = node_s1/get, queue capacity = 1, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@3088726f[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details write TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s1/write, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@6b99aea9[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details search TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s1/search, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@35444ad[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details get EsThreadPoolExecutor[name = node_s2/get, queue capacity = 1, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@35b9007e[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details write TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s2/write, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@cabe7d3[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details search TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s2/search, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@7a02f741[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details get EsThreadPoolExecutor[name = node_s3/get, queue capacity = 1, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@372b9f1[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details write TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s3/write, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@38a2baf2[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
Thread pool details search TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s3/search, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@4ac2fd93[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
[org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007faf3fa5a760@4cd609d4]
mosche commented 6 months ago

Another one

elasticsearch / periodic / main / openjdk22 / checkPart1 / java-matrix Test failures: KibanaThreadPoolIT » testBlockedThreadPoolsRejectUserRequests KibanaThreadPoolIT » classMethod

mosche commented 6 months ago

And another one

elasticsearch / periodic / main / graalvm-ce17 / checkPart1 / java-matrix Test failures: KibanaThreadPoolIT » testBlockedThreadPoolsRejectUserRequests KibanaThreadPoolIT » classMethod

pgomulka commented 5 months ago

I have added additional debuging and confirmed that search tasks will 'force put' themselves on the threadpool queue https://gradle-enterprise.elastic.co/s/iscjlj2u2clcs

 1> [2024-05-24T02:05:25,046][INFO ][o.e.a.s.TransportSearchAction] [node_s0] FORK FORK
  1> [2024-05-24T02:05:25,047][INFO ][o.e.c.u.c.EsAbortPolicy  ] [node_s0] FORCE PUT TimedRunnable{original=org.elasticsearch.action.search.AbstractSearchAsyncAction$2@6c666e45, creationTimeNanos=71614759915791, startTimeNanos=0, finishTimeNanos=-1, failedOrRejected=false} TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s0/search, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@df9c20a[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
  1> [2024-05-24T02:05:45,015][DEBUG][o.e.k.KibanaThreadPoolIT ] [[Thread-5]] Thread pool details get EsThreadPoolExecutor[name = node_s0/get, queue capacity = 1, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@77778b3c[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
  1> [2024-05-24T02:05:45,016][DEBUG][o.e.k.KibanaThreadPoolIT ] [[Thread-5]] [org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007e0001b06780@a3d81fc]
  1> [2024-05-24T02:05:45,016][DEBUG][o.e.k.KibanaThreadPoolIT ] [[Thread-5]] Thread pool details write TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s0/write, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@36701621[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 0]]
  1> [2024-05-24T02:05:45,016][DEBUG][o.e.k.KibanaThreadPoolIT ] [[Thread-5]] [org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007e0001b06780@a3d81fc]
  1> [2024-05-24T02:05:45,016][DEBUG][o.e.k.KibanaThreadPoolIT ] [[Thread-5]] Thread pool details search TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s0/search, queue capacity = 1, task execution EWMA = 0s, total task execution time = 0s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@df9c20a[Running, pool size = 1, active threads = 1, queued tasks = 2, completed tasks = 0]]
  1> [2024-05-24T02:05:45,016][DEBUG][o.e.k.KibanaThreadPoolIT ] [[Thread-5]] [org.elasticsearch.kibana.KibanaThreadPoolIT$$Lambda/0x00007e0001b06780@a3d81fc, org.elasticsearch.action.search.AbstractSearchAsyncAction$2@6c666e45]

In a log snipet above you can see 2 tasks enqueued on a search threadpool when the queue was configured to only have size 1. https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/action/search/AbstractSearchAsyncAction.java#L398

pgomulka commented 5 months ago

a little more debugging and it turns out that when the test blocks it has to fall into the fork branch https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/action/search/AbstractSearchAsyncAction.java#L831 this only happens when the originalThread = current thread = elasticsearch[node_s0][transport_worker][T#1]

when the handling of that failure is on a test thread (why woudl that be?? something with the test setup?) it would pass originalThread = elasticsearch[node_s0][transport_worker][T#1] currentThread = TEST-KibanaThreadPoolIT.testBlockedThreadPoolsRejectUserRequests-seed#[3C76559446F8C5C]

stacktrace that is triggering that force execution

 at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNext(AbstractSearchAsyncAction.java:845) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNext(AbstractSearchAsyncAction.java:839) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:341) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:62) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:73) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.action.DelegatingActionListener.onFailure(DelegatingActionListener.java:31) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:53) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:636) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$UnregisterChildTransportResponseHandler.handleException(TransportService.java:1752) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1476) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]
pgomulka commented 5 months ago

I wonder if we are not observing a side effect of https://github.com/elastic/elasticsearch/issues/97916 here sometimes an execution is forked but sometimes it is executed in Test's thread https://github.com/elastic/elasticsearch/issues/97916 https://github.com/elastic/elasticsearch/pull/106915#discussion_r1544626612