Closed colings86 closed 6 years ago
another one (from 5.x): https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-intake/1249 . I saw this fail on master too.
Failed on master with leaked threads: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+oracle-java10-periodic/98/consoleText
Failure:
1> [2018-02-17T21:58:39,882][INFO ][o.e.n.Node ] [node_s0] closed
2> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl$$Lambda$1438/416814803.apply(Unknown Source)
2> at java.base@10-ea/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
2> at java.base@10-ea/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1492)
2> at java.base@10-ea/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
2> at java.base@10-ea/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
2> at java.base@10-ea/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
2> at java.base@10-ea/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2> at java.base@10-ea/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:553)
2> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.instances(GceInstancesServiceImpl.java:82)
2> at app//org.elasticsearch.discovery.gce.GceUnicastHostsProvider.buildDynamicNodes(GceUnicastHostsProvider.java:132)
2> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:309)
2> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:286)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.pingAndWait(ZenDiscovery.java:1044)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:894)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:448)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:90)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1253)
2> at app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)
2> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
2> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2> at java.base@10-ea/java.lang.Thread.run(Thread.java:844)
2> Feb 18, 2018 2:58:44 AM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
2> INFO: Starting to interrupt leaked threads:
2> 1) Thread[id=35, name=elasticsearch[node_s0][generic][T#1], state=RUNNABLE, group=TGRP-GceDiscoverTests]
2> Feb 18, 2018 2:58:47 AM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
2> SEVERE: There are still zombie threads that couldn't be terminated:
2> 1) Thread[id=35, name=elasticsearch[node_s0][generic][T#1], state=RUNNABLE, group=TGRP-GceDiscoverTests]
2> at java.base@10-ea/java.net.SocketInputStream.socketRead0(Native Method)
2> at java.base@10-ea/java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
2> at java.base@10-ea/java.net.SocketInputStream.read(SocketInputStream.java:171)
2> at java.base@10-ea/java.net.SocketInputStream.read(SocketInputStream.java:141)
2> at java.base@10-ea/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:425)
2> at java.base@10-ea/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:154)
2> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1031)
2> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
2> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1402)
2> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1429)
2> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
2> at java.base@10-ea/sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:567)
2> at java.base@10-ea/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
2> at java.base@10-ea/sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:163)
2> at app//com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:104)
2> at app//com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
2> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
2> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
2> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
2> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.lambda$instances$0(GceInstancesServiceImpl.java:71)
2> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl$$Lambda$1441/2115050294.run(Unknown Source)
2> at java.base@10-ea/java.security.AccessController.doPrivileged(Native Method)
ERROR 0.00s J0 | GceDiscoverTests (suite) <<< FAILURES!
2> at app//org.elasticsearch.cloud.gce.util.Access.doPrivilegedIOException(Access.java:59)
2> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.lambda$instances$2(GceInstancesServiceImpl.java:69)
2> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl$$Lambda$1438/416814803.apply(Unknown Source)
2> at java.base@10-ea/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
2> at java.base@10-ea/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1492)
2> at java.base@10-ea/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
2> at java.base@10-ea/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
2> at java.base@10-ea/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
2> at java.base@10-ea/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2> at java.base@10-ea/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:553)
2> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.instances(GceInstancesServiceImpl.java:82)
2> at app//org.elasticsearch.discovery.gce.GceUnicastHostsProvider.buildDynamicNodes(GceUnicastHostsProvider.java:132)
2> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:309)
2> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:286)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.pingAndWait(ZenDiscovery.java:1044)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:894)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:448)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:90)
2> at app//org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1253)
2> at app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)
2> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
2> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2> at java.base@10-ea/java.lang.Thread.run(Thread.java:844)
2> REPRODUCE WITH: ./gradlew :plugins:discovery-gce:test -Dtests.seed=BF02897A506A46A6 -Dtests.class=org.elasticsearch.discovery.gce.GceDiscoverTests -Dtests.security.manager=true -Dtests.locale=en-US -Dtests.timezone=Etc/UTC
2> NOTE: test params are: codec=Asserting(Lucene70): {}, docValues:{}, maxPointsInLeafNode=1774, maxMBSortInHeap=7.55923951920908, sim=RandomSimilarity(queryNorm=true): {}, locale=yo-NG, timezone=America/Bogota
2> NOTE: Linux 4.4.92-6.18-default amd64/Oracle Corporation 10-ea (64-bit)/cpus=4,threads=2,free=456735288,total=536870912
2> NOTE: All tests run in this JVM: [RetryHttpInitializerWrapperTests, GceDiscoverTests]
> Throwable #1: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.elasticsearch.discovery.gce.GceDiscoverTests:
> 1) Thread[id=35, name=elasticsearch[node_s0][generic][T#1], state=RUNNABLE, group=TGRP-GceDiscoverTests]
> at java.base@10-ea/java.net.SocketInputStream.socketRead0(Native Method)
> at java.base@10-ea/java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.base@10-ea/java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.base@10-ea/java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.base@10-ea/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:425)
> at java.base@10-ea/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:154)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1031)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1402)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1429)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
> at java.base@10-ea/sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:567)
> at java.base@10-ea/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
> at java.base@10-ea/sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:163)
> at app//com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:104)
> at app//com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.lambda$instances$0(GceInstancesServiceImpl.java:71)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl$$Lambda$1441/2115050294.run(Unknown Source)
> at java.base@10-ea/java.security.AccessController.doPrivileged(Native Method)
> at app//org.elasticsearch.cloud.gce.util.Access.doPrivilegedIOException(Access.java:59)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.lambda$instances$2(GceInstancesServiceImpl.java:69)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl$$Lambda$1438/416814803.apply(Unknown Source)
> at java.base@10-ea/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at java.base@10-ea/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1492)
> at java.base@10-ea/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at java.base@10-ea/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at java.base@10-ea/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
> at java.base@10-ea/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.base@10-ea/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:553)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.instances(GceInstancesServiceImpl.java:82)
> at app//org.elasticsearch.discovery.gce.GceUnicastHostsProvider.buildDynamicNodes(GceUnicastHostsProvider.java:132)
> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:309)
> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:286)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.pingAndWait(ZenDiscovery.java:1044)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:894)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:448)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:90)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1253)
> at app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)
> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> at java.base@10-ea/java.lang.Thread.run(Thread.java:844)
> at __randomizedtesting.SeedInfo.seed([BF02897A506A46A6]:0)
> Throwable #2: com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie threads that couldn't be terminated:
> 1) Thread[id=35, name=elasticsearch[node_s0][generic][T#1], state=RUNNABLE, group=TGRP-GceDiscoverTests]
> at java.base@10-ea/java.net.SocketInputStream.socketRead0(Native Method)
> at java.base@10-ea/java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.base@10-ea/java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.base@10-ea/java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.base@10-ea/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:425)
> at java.base@10-ea/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:154)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1031)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1402)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1429)
> at java.base@10-ea/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
> at java.base@10-ea/sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:567)
> at java.base@10-ea/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
> at java.base@10-ea/sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:163)
> at app//com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:104)
> at app//com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
> at app//com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.lambda$instances$0(GceInstancesServiceImpl.java:71)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl$$Lambda$1441/2115050294.run(Unknown Source)
> at java.base@10-ea/java.security.AccessController.doPrivileged(Native Method)
> at app//org.elasticsearch.cloud.gce.util.Access.doPrivilegedIOException(Access.java:59)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.lambda$instances$2(GceInstancesServiceImpl.java:69)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl$$Lambda$1438/416814803.apply(Unknown Source)
> at java.base@10-ea/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at java.base@10-ea/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1492)
> at java.base@10-ea/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at java.base@10-ea/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at java.base@10-ea/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
> at java.base@10-ea/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.base@10-ea/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:553)
> at app//org.elasticsearch.cloud.gce.GceInstancesServiceImpl.instances(GceInstancesServiceImpl.java:82)
> at app//org.elasticsearch.discovery.gce.GceUnicastHostsProvider.buildDynamicNodes(GceUnicastHostsProvider.java:132)
> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:309)
> at app//org.elasticsearch.discovery.zen.UnicastZenPing.ping(UnicastZenPing.java:286)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.pingAndWait(ZenDiscovery.java:1044)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:894)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:448)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery.access$2500(ZenDiscovery.java:90)
> at app//org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1253)
> at app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)
> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
> at java.base@10-ea/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> at java.base@10-ea/java.lang.Thread.run(Thread.java:844)
> at __randomizedtesting.SeedInfo.seed([BF02897A506A46A6]:0)
Completed [4/4] on J0 in 60.17s, 1 test, 2 errors <<< FAILURES!
This test failed 4 times in the last 60 days with a leaked threads error.
This test starts an HTTP server, and I think that the HTTP server gets stopped while the nodes are still being stopped, which sometimes causes the discovery logic to get stuck reading from a socket.
I think this test can be made simpler. No need for @BeforeClass and @AfterClass, as there is just a single test. This will perhaps make debugging this test failure easier.
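To illustrate the simplification suggested above, here is a minimal sketch that keeps the whole server lifecycle inside one method instead of class-level setup and teardown hooks. It is not the actual GceDiscoverTests code: it uses the JDK's built-in `com.sun.net.httpserver.HttpServer` over plain HTTP, and the class name, the `/instances` path and the response body are all hypothetical. The point is only the ordering: the client finishes its request before the server is stopped in the `finally` block.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SingleTestServerLifecycle {

    // Hypothetical stand-in for the single test: the server is created,
    // used, and stopped inside one method, so no class-level hooks are needed.
    static String runSingleTest() {
        try {
            // Port 0 lets the OS pick a free port on the loopback interface.
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            // "/instances" and the JSON body are placeholders, not the real GCE API.
            server.createContext("/instances", exchange -> {
                byte[] body = "{\"items\":[]}".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/instances");
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                conn.setConnectTimeout(2000); // bound the connect phase
                conn.setReadTimeout(2000);    // bound each blocking read
                try (InputStream in = conn.getInputStream()) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            } finally {
                // The server is stopped only after the client has finished.
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runSingleTest());
    }
}
```

In the real test the equivalent ordering would presumably be to close the nodes first and stop the HTTP server last, so that no discovery thread is still talking to a server that has gone away.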
@martijnvg I started to simplify this test in #28193. If that's ok I'll create a pull request just to simplify the test without the read/connect timeouts that #28193 also adds.
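The stack trace shows the leaked thread blocked in `SocketInputStream.socketRead0`, which is what a read without a timeout looks like when the peer never answers. As a rough illustration of why a read timeout bounds that wait, the sketch below connects to a local socket that accepts the connection at the OS level but never sends a byte. It uses only `java.net` and is not the plugin's code; the class and method names are hypothetical, and what #28193 actually changes is not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {

    // Connects to a peer that never sends data and reports whether the
    // blocking read gave up after readTimeoutMillis.
    static boolean readTimesOut(int connectTimeoutMillis, int readTimeoutMillis) {
        // The listening socket's backlog completes the TCP handshake even though
        // accept() is never called, so the client connects but receives nothing.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {
            client.connect(
                new InetSocketAddress(silentServer.getInetAddress(), silentServer.getLocalPort()),
                connectTimeoutMillis);
            // With a timeout of 0 (the default) the read below would block indefinitely.
            client.setSoTimeout(readTimeoutMillis);
            try {
                client.getInputStream().read();
                return false; // data or end-of-stream arrived, no timeout
            } catch (SocketTimeoutException e) {
                return true;  // the read was abandoned after the timeout
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(1000, 200));
    }
}
```

A timeout like this does not explain why the server stops responding, but it would let a stuck discovery thread return instead of outliving the test suite.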
build URL: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.3+multijob-unix-compatibility/os=opensuse/144/consoleFull
Reproduce command:
stack trace: