apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Flaky integration test for input sources #11744

Open jihoonson opened 2 years ago

jihoonson commented 2 years ago

Three Travis jobs run the integration tests for input sources, and they have been failing almost every time recently. The stack trace below, copied from https://app.travis-ci.com/github/apache/druid/builds/238435945, is common to those failed jobs.

2021-09-27T18:06:17,485 ERROR [main] org.apache.druid.testing.utils.LoggerListener - Failed [org.apache.druid.tests.indexer.ITHttpInputSourceTest.doTest]
org.apache.druid.java.util.common.ISE: Max number of retries[240] exceeded for Task[index_parallel_wikipedia_http_inputsource_test_85c6a767-8acb-4219-9747-189741b98916 %Россия 한국 中国!?_ddhmammk_2021-09-27T17:46:02.898Z]. Failing.
    at org.apache.druid.testing.utils.ITRetryUtil.retryUntil(ITRetryUtil.java:94) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
    at org.apache.druid.testing.clients.OverlordResourceTestClient.waitUntilTaskCompletes(OverlordResourceTestClient.java:282) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
    at org.apache.druid.testing.clients.OverlordResourceTestClient.waitUntilTaskCompletes(OverlordResourceTestClient.java:277) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
    at org.apache.druid.tests.indexer.AbstractITBatchIndexTest.submitTaskAndWait(AbstractITBatchIndexTest.java:323) ~[test-classes/:?]
    at org.apache.druid.tests.indexer.AbstractITBatchIndexTest.doIndexTest(AbstractITBatchIndexTest.java:146) ~[test-classes/:?]
    at org.apache.druid.tests.indexer.AbstractITBatchIndexTest.doIndexTest(AbstractITBatchIndexTest.java:114) ~[test-classes/:?]
    at org.apache.druid.tests.indexer.ITHttpInputSourceTest.doTest(ITHttpInputSourceTest.java:44) ~[test-classes/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_252]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_252]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_252]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_252]
    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132) ~[testng-7.3.0.jar:?]
    at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:599) ~[testng-7.3.0.jar:?]
    at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174) ~[testng-7.3.0.jar:?]
    at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46) ~[testng-7.3.0.jar:?]
    at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822) ~[testng-7.3.0.jar:?]
    at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147) ~[testng-7.3.0.jar:?]
    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146) ~[testng-7.3.0.jar:?]
    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128) ~[testng-7.3.0.jar:?]
    at java.util.ArrayList.forEach(ArrayList.java:1257) [?:1.8.0_252]
    at org.testng.TestRunner.privateRun(TestRunner.java:764) [testng-7.3.0.jar:?]
    at org.testng.TestRunner.run(TestRunner.java:585) [testng-7.3.0.jar:?]
    at org.testng.DruidTestRunnerFactory$DruidTestRunner.runTests(DruidTestRunnerFactory.java:101) [druid-integration-tests-0.23.0-SNAPSHOT.jar:?]
    at org.testng.DruidTestRunnerFactory$DruidTestRunner.run(DruidTestRunnerFactory.java:88) [druid-integration-tests-0.23.0-SNAPSHOT.jar:?]
    at org.testng.SuiteRunner.runTest(SuiteRunner.java:384) [testng-7.3.0.jar:?]
    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378) [testng-7.3.0.jar:?]
    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337) [testng-7.3.0.jar:?]
    at org.testng.SuiteRunner.run(SuiteRunner.java:286) [testng-7.3.0.jar:?]
    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53) [testng-7.3.0.jar:?]
    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96) [testng-7.3.0.jar:?]
    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218) [testng-7.3.0.jar:?]
    at org.testng.TestNG.runSuitesLocally(TestNG.java:1140) [testng-7.3.0.jar:?]
    at org.testng.TestNG.runSuites(TestNG.java:1069) [testng-7.3.0.jar:?]
    at org.testng.TestNG.run(TestNG.java:1037) [testng-7.3.0.jar:?]
    at org.apache.maven.surefire.testng.TestNGExecutor.run(TestNGExecutor.java:283) [surefire-testng-2.22.0.jar:2.22.0]
    at org.apache.maven.surefire.testng.TestNGXmlTestSuite.execute(TestNGXmlTestSuite.java:75) [surefire-testng-2.22.0.jar:2.22.0]
    at org.apache.maven.surefire.testng.TestNGProvider.invoke(TestNGProvider.java:120) [surefire-testng-2.22.0.jar:2.22.0]
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:383) [surefire-booter-2.22.0.jar:2.22.0]
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:344) [surefire-booter-2.22.0.jar:2.22.0]
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125) [surefire-booter-2.22.0.jar:2.22.0]
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:417) [surefire-booter-2.22.0.jar:2.22.0]
    Suppressed: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.druid.java.util.common.ISE: Error while making request to url[http://127.0.0.1:8081/druid/coordinator/v1/datasources/wikipedia_http_inputsource_test_85c6a767-8acb-4219-9747-189741b98916%20%25%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D1%8F%20%ED%95%9C%EA%B5%AD%20%E4%B8%AD%E5%9B%BD%21%3F/intervals] status[204 No Content] content[]
        at org.apache.druid.testing.clients.CoordinatorResourceTestClient.getSegmentIntervals(CoordinatorResourceTestClient.java:154) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
        at org.apache.druid.tests.indexer.AbstractIndexerTest.unloadAndKillData(AbstractIndexerTest.java:91) ~[test-classes/:?]
        at org.apache.druid.tests.indexer.AbstractIndexerTest.lambda$unloader$0(AbstractIndexerTest.java:71) ~[test-classes/:?]
        at org.apache.druid.tests.indexer.ITHttpInputSourceTest.doTest(ITHttpInputSourceTest.java:53) ~[test-classes/:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_252]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_252]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_252]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_252]
        at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132) ~[testng-7.3.0.jar:?]
        at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:599) ~[testng-7.3.0.jar:?]
        at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174) ~[testng-7.3.0.jar:?]
        at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46) ~[testng-7.3.0.jar:?]
        at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822) ~[testng-7.3.0.jar:?]
        at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147) ~[testng-7.3.0.jar:?]
        at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146) ~[testng-7.3.0.jar:?]
        at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128) ~[testng-7.3.0.jar:?]
        at java.util.ArrayList.forEach(ArrayList.java:1257) [?:1.8.0_252]
        at org.testng.TestRunner.privateRun(TestRunner.java:764) [testng-7.3.0.jar:?]
        at org.testng.TestRunner.run(TestRunner.java:585) [testng-7.3.0.jar:?]
        at org.testng.DruidTestRunnerFactory$DruidTestRunner.runTests(DruidTestRunnerFactory.java:101) [druid-integration-tests-0.23.0-SNAPSHOT.jar:?]
        at org.testng.DruidTestRunnerFactory$DruidTestRunner.run(DruidTestRunnerFactory.java:88) [druid-integration-tests-0.23.0-SNAPSHOT.jar:?]
        at org.testng.SuiteRunner.runTest(SuiteRunner.java:384) [testng-7.3.0.jar:?]
        at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378) [testng-7.3.0.jar:?]
        at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337) [testng-7.3.0.jar:?]
        at org.testng.SuiteRunner.run(SuiteRunner.java:286) [testng-7.3.0.jar:?]
        at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53) [testng-7.3.0.jar:?]
        at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96) [testng-7.3.0.jar:?]
        at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218) [testng-7.3.0.jar:?]
        at org.testng.TestNG.runSuitesLocally(TestNG.java:1140) [testng-7.3.0.jar:?]
        at org.testng.TestNG.runSuites(TestNG.java:1069) [testng-7.3.0.jar:?]
        at org.testng.TestNG.run(TestNG.java:1037) [testng-7.3.0.jar:?]
        at org.apache.maven.surefire.testng.TestNGExecutor.run(TestNGExecutor.java:283) [surefire-testng-2.22.0.jar:2.22.0]
        at org.apache.maven.surefire.testng.TestNGXmlTestSuite.execute(TestNGXmlTestSuite.java:75) [surefire-testng-2.22.0.jar:2.22.0]
        at org.apache.maven.surefire.testng.TestNGProvider.invoke(TestNGProvider.java:120) [surefire-testng-2.22.0.jar:2.22.0]
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:383) [surefire-booter-2.22.0.jar:2.22.0]
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:344) [surefire-booter-2.22.0.jar:2.22.0]
        at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125) [surefire-booter-2.22.0.jar:2.22.0]
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:417) [surefire-booter-2.22.0.jar:2.22.0]
    Caused by: java.lang.RuntimeException: org.apache.druid.java.util.common.ISE: Error while making request to url[http://127.0.0.1:8081/druid/coordinator/v1/datasources/wikipedia_http_inputsource_test_85c6a767-8acb-4219-9747-189741b98916%20%25%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D1%8F%20%ED%95%9C%EA%B5%AD%20%E4%B8%AD%E5%9B%BD%21%3F/intervals] status[204 No Content] content[]
        at org.apache.druid.testing.clients.CoordinatorResourceTestClient.makeRequest(CoordinatorResourceTestClient.java:433) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
        at org.apache.druid.testing.clients.CoordinatorResourceTestClient.getSegmentIntervals(CoordinatorResourceTestClient.java:145) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
        ... 37 more
    Caused by: org.apache.druid.java.util.common.ISE: Error while making request to url[http://127.0.0.1:8081/druid/coordinator/v1/datasources/wikipedia_http_inputsource_test_85c6a767-8acb-4219-9747-189741b98916%20%25%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D1%8F%20%ED%95%9C%EA%B5%AD%20%E4%B8%AD%E5%9B%BD%21%3F/intervals] status[204 No Content] content[]
        at org.apache.druid.testing.clients.CoordinatorResourceTestClient.makeRequest(CoordinatorResourceTestClient.java:427) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
        at org.apache.druid.testing.clients.CoordinatorResourceTestClient.getSegmentIntervals(CoordinatorResourceTestClient.java:145) ~[druid-integration-tests-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
        ... 37 more
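
For readers unfamiliar with this failure mode: the top-level `ISE` is thrown by a polling helper once it has used up its retry budget (240 attempts here) without the task reaching a completed state. The sketch below shows that general pattern only. It is not Druid's actual `ITRetryUtil` code; the class name `RetrySketch`, the parameter names, and the delays are assumptions made for illustration.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of a "poll until done or give up" helper.
// This is NOT the real ITRetryUtil; names and behavior are assumptions.
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Calls {@code condition} until it returns {@code expected}, sleeping
     * {@code delayMillis} between attempts. Returns the number of attempts
     * used, or throws once {@code maxRetries} attempts have all failed.
     */
    static int retryUntil(Callable<Boolean> condition, boolean expected,
                          long delayMillis, int maxRetries, String what) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            boolean matched;
            try {
                matched = Boolean.valueOf(expected).equals(condition.call());
            } catch (Exception e) {
                // Treat a failed poll like "not ready yet" and keep retrying.
                matched = false;
            }
            if (matched) {
                return attempt;
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop polling.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for " + what, e);
                }
            }
        }
        throw new IllegalStateException(
            String.format("Max number of retries[%d] exceeded for %s. Failing.", maxRetries, what));
    }

    public static void main(String[] args) {
        // A condition that becomes true on the third poll.
        int[] polls = {0};
        int used = retryUntil(() -> ++polls[0] >= 3, true, 10L, 5, "Task[demo]");
        System.out.println("succeeded after " + used + " attempts");

        // A condition that never becomes true exhausts the retry budget.
        try {
            retryUntil(() -> false, true, 10L, 3, "Task[stuck]");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a helper of this shape, a task that never finishes fails only after the whole budget has elapsed. The roughly 20 minutes between the timestamp embedded in the task ID (17:46:02) and the error log line (18:06:17) would be consistent with 240 attempts spaced about 5 seconds apart, although the actual delay is not shown in the log.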

FrankChen021 commented 2 years ago

All my PRs are hitting this failure 😭 Was this problem introduced by a particular PR?

jihoonson commented 2 years ago

Yeah, every PR is blocked because of this 😢. I don't know whether a particular change introduced this issue. I suspect it is related to a lack of memory, though. I think @loquisgon is working on this.