Closed spinscale closed 6 years ago
I briefly looked at the NoSuchFileException; I suspect it is a concurrency issue between reading and writing state files. This was previously raised by @ywelsch but we have yet to solve it.
This test failed again today on master, but with a different issue related to failing to delete an index file:
ERROR 33.5s J0 | RareClusterStateIT.testUnassignedShardAndEmptyNodesInRoutingTable <<< FAILURES!
> Throwable #1: java.lang.AssertionError: Delete Index failed - not acked
> Expected: <true>
> but: was <false>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:131)
> at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:127)
> at org.elasticsearch.test.TestCluster.wipeIndices(TestCluster.java:140)
> at org.elasticsearch.test.TestCluster.wipe(TestCluster.java:77)
> at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:575)
> at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2036)
> at jdk.internal.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:547)
> at java.base/java.lang.Thread.run(Thread.java:844)
> Throwable #2: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=8467, name=elasticsearch[node_t0][clusterService#updateTask][T#1], state=RUNNABLE, group=TGRP-RareClusterStateIT]
> Caused by: java.lang.AssertionError
> at __randomizedtesting.SeedInfo.seed([3A6C7188C00DBD]:0)
> at org.elasticsearch.env.NodeEnvironment.deleteShardDirectoryUnderLock(NodeEnvironment.java:477)
> at org.elasticsearch.indices.IndicesService.deleteShardStore(IndicesService.java:683)
> at org.elasticsearch.index.IndexService.onShardClose(IndexService.java:448)
> at org.elasticsearch.index.IndexService.access$100(IndexService.java:93)
> at org.elasticsearch.index.IndexService$StoreCloseListener.handle(IndexService.java:530)
> at org.elasticsearch.index.IndexService$StoreCloseListener.handle(IndexService.java:515)
> at org.elasticsearch.index.store.Store.closeInternal(Store.java:382)
> at org.elasticsearch.index.store.Store.access$000(Store.java:129)
> at org.elasticsearch.index.store.Store$1.closeInternal(Store.java:150)
> at org.elasticsearch.common.util.concurrent.AbstractRefCounted.decRef(AbstractRefCounted.java:65)
> at org.elasticsearch.index.store.Store.decRef(Store.java:364)
> at org.elasticsearch.index.store.Store.close(Store.java:372)
> at org.elasticsearch.index.IndexService.closeShard(IndexService.java:428)
> at org.elasticsearch.index.IndexService.removeShard(IndexService.java:399)
> at org.elasticsearch.index.IndexService.close(IndexService.java:254)
> at org.elasticsearch.indices.IndicesService.removeIndex(IndicesService.java:542)
> at org.elasticsearch.indices.cluster.IndicesClusterStateService.deleteIndices(IndicesClusterStateService.java:263)
> at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:197)
> at org.elasticsearch.cluster.service.ClusterService.callClusterStateAppliers(ClusterService.java:861)
> at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:815)
> at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:633)
> at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:1117)
> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569)
> at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238)
> at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1161)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> at java.base/java.lang.Thread.run(Thread.java:844)
This did not reproduce for me, but the reproduction line is:
gradle :core:integTest -Dtests.seed=3A6C7188C00DBD -Dtests.class=org.elasticsearch.indices.state.RareClusterStateIT -Dtests.method="testUnassignedShardAndEmptyNodesInRoutingTable" -Dtests.security.manager=true -Dtests.jvm.argline="--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.nio.file=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED --add-opens=java.base/java.util.regex=ALL-UNNAMED" -Dtests.locale=pt-GW -Dtests.timezone=Asia/Ulan_Bator
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+java9-periodic/2048/consoleText
@ywelsch can you take a look please?
@jaymode It's the same issue (exposed in the stack trace):
Caused by: java.lang.AssertionError
at __randomizedtesting.SeedInfo.seed([3A6C7188C00DBD]:0)
at org.elasticsearch.env.NodeEnvironment.deleteShardDirectoryUnderLock(NodeEnvironment.java:477)
I believe UpdateNumberOfReplicasIT.testAutoExpandNumberReplicas1ToData failed here for the same reason: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=debian/668 .
another one here: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/2629 .
another one on 5.6 with java 9
gradle :core:integTest -Dtests.seed=5CA06AE61C782F90 -Dtests.class=org.elasticsearch.indices.state.RareClusterStateIT -Dtests.method="testUnassignedShardAndEmptyNodesInRoutingTable" -Dtests.security.manager=true -Dtests.jvm.argline="--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.nio.file=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED --add-opens=java.base/java.util.regex=ALL-UNNAMED" -Dtests.locale=sbp-TZ -Dtests.timezone=SystemV/EST5
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.6+java9-periodic/99/consoleFull
Looks like we are accumulating these errors:
java.lang.AssertionError
at __randomizedtesting.SeedInfo.seed([74C174345C2EC14F]:0)
at org.elasticsearch.env.NodeEnvironment.deleteShardDirectoryUnderLock(NodeEnvironment.java:460)
at org.elasticsearch.indices.IndicesService.deleteShardStore(IndicesService.java:695)
See consoleText.txt
Another instance of this:
ERROR 43.5s J2 | RareClusterStateIT.testUnassignedShardAndEmptyNodesInRoutingTable <<< FAILURES!
> Throwable #1: java.lang.AssertionError: Delete Index failed - not acked
> Expected: <true>
> but: was <false>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:134)
> at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:130)
> at org.elasticsearch.test.TestCluster.wipeIndices(TestCluster.java:142)
> at org.elasticsearch.test.TestCluster.wipe(TestCluster.java:79)
> at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:578)
> at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2075)
> at java.lang.Thread.run(Thread.java:745)
> Throwable #2: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1538, name=elasticsearch[node_t0][clusterApplierService#updateTask][T#1], state=RUNNABLE, group=TGRP-RareClusterStateIT]
> Caused by: java.lang.AssertionError
> at __randomizedtesting.SeedInfo.seed([429E55B62DC1E952]:0)
> at org.elasticsearch.env.NodeEnvironment.deleteShardDirectoryUnderLock(NodeEnvironment.java:453)
> at org.elasticsearch.indices.IndicesService.deleteShardStore(IndicesService.java:695)
> at org.elasticsearch.index.IndexService.onShardClose(IndexService.java:464)
> at org.elasticsearch.index.IndexService.access$100(IndexService.java:98)
> at org.elasticsearch.index.IndexService$StoreCloseListener.accept(IndexService.java:543)
> at org.elasticsearch.index.IndexService$StoreCloseListener.accept(IndexService.java:530)
> at org.elasticsearch.index.store.Store.closeInternal(Store.java:448)
> at org.elasticsearch.index.store.Store.access$000(Store.java:130)
> at org.elasticsearch.index.store.Store$1.closeInternal(Store.java:151)
> at org.elasticsearch.common.util.concurrent.AbstractRefCounted.decRef(AbstractRefCounted.java:65)
> at org.elasticsearch.index.store.Store.decRef(Store.java:430)
> at org.elasticsearch.index.store.Store.close(Store.java:438)
> at org.elasticsearch.index.IndexService.closeShard(IndexService.java:445)
> at org.elasticsearch.index.IndexService.removeShard(IndexService.java:415)
> at org.elasticsearch.index.IndexService.close(IndexService.java:275)
> at org.elasticsearch.indices.IndicesService.removeIndex(IndicesService.java:554)
> at org.elasticsearch.indices.cluster.IndicesClusterStateService.deleteIndices(IndicesClusterStateService.java:285)
> at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:219)
> at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498)
> at java.lang.Iterable.forEach(Iterable.java:75)
> at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495)
> at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482)
> at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
> at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161)
> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566)
> at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244)
> at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
See consoleText3.txt
@bleskes This build failure has been open for over a year. Can you please see that it is addressed?
Another instance of this failure:
ERROR 43.4s J0 | RareClusterStateIT.testUnassignedShardAndEmptyNodesInRoutingTable <<< FAILURES!
> Throwable #1: java.lang.AssertionError: Delete Index failed - not acked
> Expected: <true>
> but: was <false>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:134)
> at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:130)
> at org.elasticsearch.test.TestCluster.wipeIndices(TestCluster.java:141)
> at org.elasticsearch.test.TestCluster.wipe(TestCluster.java:78)
> at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:579)
> at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2086)
> at java.lang.Thread.run(Thread.java:748)
> Throwable #2: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1595, name=elasticsearch[node_t0][clusterApplierService#updateTask][T#1], state=RUNNABLE, group=TGRP-RareClusterStateIT]
> Caused by: java.lang.AssertionError: Paths exist that should have been deleted: [/private/var/lib/jenkins/workspace/elastic+elasticsearch+6.x+multijob-darwin-compatibility/server/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_5AC245D3140EDCC0-001/tempDir-004/data/nodes/0/indices/dCLhOOkKS9qeFqcGEXM_-w/0]
> at __randomizedtesting.SeedInfo.seed([5AC245D3140EDCC0]:0)
> at org.elasticsearch.env.NodeEnvironment.assertPathsDoNotExist(NodeEnvironment.java:470)
> at org.elasticsearch.env.NodeEnvironment.deleteShardDirectoryUnderLock(NodeEnvironment.java:460)
> at org.elasticsearch.indices.IndicesService.deleteShardStore(IndicesService.java:696)
> at org.elasticsearch.index.IndexService.onShardClose(IndexService.java:463)
> at org.elasticsearch.index.IndexService.access$100(IndexService.java:97)
> at org.elasticsearch.index.IndexService$StoreCloseListener.accept(IndexService.java:542)
> at org.elasticsearch.index.IndexService$StoreCloseListener.accept(IndexService.java:529)
> at org.elasticsearch.index.store.Store.closeInternal(Store.java:440)
> at org.elasticsearch.index.store.Store.access$000(Store.java:130)
> at org.elasticsearch.index.store.Store$1.closeInternal(Store.java:151)
> at org.elasticsearch.common.util.concurrent.AbstractRefCounted.decRef(AbstractRefCounted.java:65)
> at org.elasticsearch.index.store.Store.decRef(Store.java:422)
> at org.elasticsearch.index.store.Store.close(Store.java:430)
> at org.elasticsearch.index.IndexService.closeShard(IndexService.java:444)
> at org.elasticsearch.index.IndexService.removeShard(IndexService.java:414)
> at org.elasticsearch.index.IndexService.close(IndexService.java:274)
> at org.elasticsearch.indices.IndicesService.removeIndex(IndicesService.java:555)
> at org.elasticsearch.indices.cluster.IndicesClusterStateService.deleteIndices(IndicesClusterStateService.java:285)
> at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:219)
> at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498)
> at java.lang.Iterable.forEach(Iterable.java:75)
> at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495)
> at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482)
> at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
> at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161)
> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566)
> at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244)
> at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
another one at https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.6+multijob-unix-compatibility/os=fedora/810/consoleText
not reproducible locally
@bleskes Can you help find a path forward on getting this build failure resolved? I know that #19338 was proposed previously; can we take another look?
Another one here:
> Caused by: java.lang.AssertionError: Paths exist that should have been deleted: [/var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-java-periodic/ESJAVA/java9/ESRUNTIME/java10/nodes/linux/server/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_C9901BC8F496F938-001/tempDir-004/d1/nodes/0/indices/rFA6bNDHQT-cB8eWUA1MPw/0, /var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-java-periodic/ESJAVA/java9/ESRUNTIME/java10/nodes/linux/server/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_C9901BC8F496F938-001/tempDir-004/d3/nodes/0/indices/rFA6bNDHQT-cB8eWUA1MPw/0, /var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-java-periodic/ESJAVA/java9/ESRUNTIME/java10/nodes/linux/server/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_C9901BC8F496F938-001/tempDir-004/d0/nodes/0/indices/rFA6bNDHQT-cB8eWUA1MPw/0, /var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-java-periodic/ESJAVA/java9/ESRUNTIME/java10/nodes/linux/server/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_C9901BC8F496F938-001/tempDir-004/d2/nodes/0/indices/rFA6bNDHQT-cB8eWUA1MPw/0]
> at __randomizedtesting.SeedInfo.seed([C9901BC8F496F938]:0)
> at org.elasticsearch.env.NodeEnvironment.assertPathsDoNotExist(NodeEnvironment.java:462)
> at org.elasticsearch.env.NodeEnvironment.deleteShardDirectoryUnderLock(NodeEnvironment.java:452)
> at org.elasticsearch.indices.IndicesService.deleteShardStore(IndicesService.java:696)
> at org.elasticsearch.index.IndexService.onShardClose(IndexService.java:463)
...
1> java.nio.file.NoSuchFileException: /var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-java-periodic/ESJAVA/java9/ESRUNTIME/java10/nodes/linux/server/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_C9901BC8F496F938-001/tempDir-004/d0/nodes/0/indices/rFA6bNDHQT-cB8eWUA1MPw/0/_state/state-0.st
1> at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
1> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
1> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
1> at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:215) ~[?:?]
1> at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212) ~[lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
1> at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212) ~[lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
1> at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212) ~[lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
1> at org.apache.lucene.mockfile.HandleTrackingFS.newByteChannel(HandleTrackingFS.java:240) ~[lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
1> at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212) ~[lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
1> at org.apache.lucene.mockfile.HandleTrackingFS.newByteChannel(HandleTrackingFS.java:240) ~[lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
1> at java.nio.file.Files.newByteChannel(Files.java:369) ~[?:?]
1> at java.nio.file.Files.newByteChannel(Files.java:415) ~[?:?]
1> at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:77) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
1> at org.elasticsearch.gateway.MetaDataStateFormat.read(MetaDataStateFormat.java:179) ~[main/:?]
1> at org.elasticsearch.gateway.MetaDataStateFormat.loadLatestState(MetaDataStateFormat.java:319) [main/:?]
1> at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:119) [main/:?]
1> at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:61) [main/:?]
1> at org.elasticsearch.action.support.nodes.TransportNodesAction.nodeOperation(TransportNodesAction.java:140) [main/:?]
1> at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:260) [main/:?]
1> at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:256) [main/:?]
1> at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) [main/:?]
1> at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:656) [main/:?]
1> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [main/:?]
1> at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [main/:?]
1> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
1> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
1> at java.lang.Thread.run(Thread.java:844) [?:?]
...
Note to future commenters: please seek out a stack trace for the code that resurrected the directory, rather than just the assertion failure. One may expect it to involve a call to MetaDataStateFormat.loadLatestState, which should help with the search. The few examples that I can find only implicate TransportNodesListGatewayStartedShards, but we want to know if there are any other ways to get into this state.
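For anyone trying to picture how a read could "resurrect" a deleted shard directory, here is a minimal sketch using only java.nio. It assumes (this is not confirmed by the traces in this thread) that the reader ensures the _state directory exists before opening a state file in it. The class and method names are hypothetical and are not Elasticsearch or Lucene APIs. The delete and the late read are run one after the other to keep the interleaving deterministic; in the failing builds they would be on different threads.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ResurrectionSketch {

    /** Hypothetical reader: makes sure the state directory exists, then opens a state file in it. */
    static byte[] readState(Path shardDir) throws IOException {
        Path stateDir = shardDir.resolve("_state");
        // Assumption: opening the directory creates it (and its parents) if missing.
        Files.createDirectories(stateDir);
        return Files.readAllBytes(stateDir.resolve("state-0.st"));
    }

    /** Hypothetical deleter: removes the shard directory tree, children before parents. */
    static void deleteShardDirectory(Path shardDir) throws IOException {
        try (Stream<Path> walk = Files.walk(shardDir)) {
            for (Path p : (Iterable<Path>) walk.sorted(Comparator.reverseOrder())::iterator) {
                Files.delete(p);
            }
        }
    }

    /** Returns true if the late read failed AND left the shard directory behind again. */
    static boolean demonstrate() throws IOException {
        Path root = Files.createTempDirectory("shard");
        Path shardDir = root.resolve("0");
        Files.createDirectories(shardDir.resolve("_state"));
        Files.write(shardDir.resolve("_state").resolve("state-0.st"), new byte[] {1});

        // Step 1: the index deletion removes the shard directory.
        deleteShardDirectory(shardDir);

        // Step 2: a late reader arrives; the file is gone, but the directory comes back.
        boolean readFailed = false;
        try {
            readState(shardDir);
        } catch (NoSuchFileException e) {
            readFailed = true;
        }
        boolean resurrected = Files.exists(shardDir);

        deleteShardDirectory(root); // clean up the temp tree
        return readFailed && resurrected;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("directory resurrected: " + demonstrate());
    }
}
```

Under that assumption, a single late read would produce both symptoms reported here: a NoSuchFileException for state-0.st, and a path that exists when the "should have been deleted" assertion runs.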
I took a look at testUnassignedShardAndEmptyNodesInRoutingTable, and that test is as old as time and does a very bogus thing - it is an IT test which extracts the GatewayAllocator from the node and tells it to allocate unassigned shards, while giving it a conjured cluster state with no nodes in it (it uses DiscoveryNodes.EMPTY_NODES). This is never a cluster state we want to reroute on (we always have at least the master node in it). I'm going to just delete the test as I don't think it adds much value.
Obviously there is a problem here but I feel this is better tracked by #29140 where we'll add a targeted test.
Elasticsearch version: 5.x branch
Details at https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=fedora/214/consoleFull
Two exceptions are popping up here (compared to my local run, where I don't get any exceptions in the logs): one is a ClassCastException for trying to cast LocalTransportAddress to InetSocketTransportAddress (as the test does not mock zenpings), and the other is a NoSuchFileException for an index, when cleaning up.
@jpountz or @bleskes can you take a look maybe?