apache / helix

Mirror of Apache Helix
Apache License 2.0

fix TestAutoRebalance.testAutoRebalance #1374

Closed kaisun2000 closed 3 years ago

kaisun2000 commented 3 years ago

LOG 976 touch 7

TestAutoRebalance.testAutoRebalance:177 expected:<true> but was:<false>

    2020-09-17T11:27:23.8495920Z [ERROR] testAutoRebalance(org.apache.helix.integration.rebalancer.TestAutoRebalance) Time elapsed: 10.815 s <<< FAILURE!
    2020-09-17T11:27:23.8497305Z java.lang.AssertionError: expected:<true> but was:<false>
    2020-09-17T11:27:23.8499039Z at org.apache.helix.integration.rebalancer.TestAutoRebalance.testAutoRebalance(TestAutoRebalance.java:177)
    2020-09-17T11:27:23.8500501Z

kaisun2000 commented 3 years ago

    Thread.sleep(100);
    result = ClusterStateVerifier.verifyByPolling(
        new ExternalViewBalancedVerifier(_gZkClient, CLUSTER_NAME, TEST_DB), 10000, 100);
    Assert.assertTrue(result);

We need to get rid of these legacy verifiers.
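For readers unfamiliar with the pattern, here is a minimal, Helix-free sketch of what a `verifyByPolling(verifier, timeout, period)` style call does conceptually: re-check a condition at a fixed period until it holds or the timeout expires. `PollingSketch` and `pollUntil` are hypothetical names invented for this illustration; they are not Helix classes and do not reproduce the actual `ClusterStateVerifier` implementation.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Helix.
final class PollingSketch {

    // Re-evaluates `condition` every `periodMs` until it returns true
    // or `timeoutMs` has elapsed. Returns the final outcome.
    static boolean pollUntil(BooleanSupplier condition, long timeoutMs, long periodMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without ever seeing the condition hold
            }
            try {
                Thread.sleep(periodMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Condition becomes true after roughly 300 ms, well within the timeout.
        boolean converged = pollUntil(() -> System.currentTimeMillis() - start >= 300, 10_000, 100);
        // Condition never becomes true, so the poll gives up after about 500 ms.
        boolean neverConverges = pollUntil(() -> false, 500, 100);
        System.out.println("converged=" + converged + ", neverConverges=" + neverConverges);
    }
}
```

With a poll like this, the fixed `Thread.sleep(100)` before the check only delays the first evaluation; the timeout and period passed to the poll are what bound the wait.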

kaisun2000 commented 3 years ago

LOG 1244 after default to delayedAutoRebalancer

    2020-09-23T03:25:49.9699881Z [ERROR] testAutoRebalance(org.apache.helix.integration.rebalancer.TestAutoRebalance) Time elapsed: 31.639 s <<< FAILURE!
    2020-09-23T03:25:49.9701635Z java.lang.AssertionError: expected:<true> but was:<false>
    2020-09-23T03:25:49.9703882Z at org.apache.helix.integration.rebalancer.TestAutoRebalance.testAutoRebalance(TestAutoRebalance.java:179)
    2020-09-23T03:25:49.9706164Z

kaisun2000 commented 3 years ago

    Thread.sleep(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
    result = ClusterStateVerifier.verifyByPolling(
        new ExternalViewBalancedVerifier(_gZkClient, CLUSTER_NAME, TEST_DB), 30000, 100);
    Assert.assertTrue(result);

kaisun2000 commented 3 years ago

    // setup storage cluster
    _gSetupTool.addCluster(CLUSTER_NAME, true);
    _gSetupTool.addResourceToCluster(CLUSTER_NAME, TEST_DB, _PARTITIONS, STATE_MODEL,
        RebalanceMode.FULL_AUTO + "");

This previously used AutoRebalancer with AutoRebalanceStrategy; it now uses DelayedAutoRebalancer with AutoRebalanceStrategy. Should we switch back to AutoRebalancer with AutoRebalanceStrategy, since this test is meant to exercise AutoRebalance? @jiajunwang

kaisun2000 commented 3 years ago

LOG 1482

    <<< FAILURE! - in TestSuite
    2020-09-25T05:36:20.3582205Z [ERROR] testAutoRebalance(org.apache.helix.integration.rebalancer.TestAutoRebalance) Time elapsed: 31.81 s <<< FAILURE!
    2020-09-25T05:36:20.3584157Z java.lang.AssertionError: expected:<true> but was:<false>
    2020-09-25T05:36:20.3586595Z at org.apache.helix.integration.rebalancer.TestAutoRebalance.testAutoRebalance(TestAutoRebalance.java:186)
    2020-09-25T05:36:20.3588251Z
    2020-09-25T05:36:20.7485311Z [ERROR] Failures:
    2020-09-25T05:36:20.7490402Z [ERROR] TestAutoRebalance.testAutoRebalance:186 expected:<true> but was:<false>

    2020-09-25T04:48:04.6388323Z START TestAutoRebalance at Fri Sep 25 04:48:04 UTC 2020
    2020-09-25T04:48:06.1099349Z true: wait 9ms, org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@5d4e13e1
    2020-09-25T04:48:06.1101169Z START TestAutoRebalance testAutoRebalance at Fri Sep 25 04:48:06 UTC 2020
    2020-09-25T04:48:06.3820957Z true: wait 267ms, org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@76f35feb
    2020-09-25T04:48:37.9076332Z false: org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@3f0ae3a8: wait 30040ms to verify
    2020-09-25T04:48:37.9210525Z END TestAutoRebalance testAutoRebalance at Fri Sep 25 04:48:37 UTC 2020, took: 31810ms.
    2020-09-25T04:48:

kaisun2000 commented 3 years ago

    Thread.sleep(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
    result = ClusterStateVerifier.verifyByPolling(
        new ExternalViewBalancedVerifier(_gZkClient, CLUSTER_NAME, TEST_DB), 30000, 100);
    Assert.assertTrue(result);  // ---> failed

kaisun2000 commented 3 years ago

Enlarging the waiting time does not help: with a 300-second timeout the verifier still returns false.

LOG 1501

    2020-09-25T08:38:01.7996107Z [ERROR] Tests run: 1205, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 4,733.556 s <<< FAILURE! - in TestSuite
    2020-09-25T08:38:01.7998096Z [ERROR] testAutoRebalance(org.apache.helix.integration.rebalancer.TestAutoRebalance) Time elapsed: 301.695 s <<< FAILURE!
    2020-09-25T08:38:01.7999504Z java.lang.AssertionError: expected:<true> but was:<false>
    2020-09-25T08:38:01.8001197Z at org.apache.helix.integration.rebalancer.TestAutoRebalance.testAutoRebalance(TestAutoRebalance.java:186)
    2020-09-25T08:38:01.8002651Z
    2020-09-25T08:38:02.1984857Z [ERROR] Failures:
    2020-09-25T08:38:02.1988718Z [ERROR] TestAutoRebalance.testAutoRebalance:186 expected:<true> but was:<false>
    2020-09-25T08:38:02.1993000Z [ERROR] Tests run: 1205, Failures: 1, Errors: 0, Skipped: 1

    2020-09-25T07:47:25.4292432Z START TestAutoRebalance at Fri Sep 25 07:47:25 UTC 2020
    2020-09-25T07:47:25.4830146Z 1688836 [ZkClient-EventThread-38417-localhost:2183] ERROR org.apache.helix.zookeeper.zkclient.callback.ZkAsyncCallbacks - Interrupted waiting for success
    2020-09-25T07:47:25.4832019Z java.lang.InterruptedException
    2020-09-25T07:47:25.4833088Z at java.lang.Object.wait(Native Method)
    2020-09-25T07:47:25.4833976Z at java.lang.Object.wait(Object.java:502)
    2020-09-25T07:47:25.4836459Z at org.apache.helix.zookeeper.zkclient.callback.ZkAsyncCallbacks$DefaultCallback.waitForSuccess(ZkAsyncCallbacks.java:220)
    2020-09-25T07:47:25.4839410Z at org.apache.helix.zookeeper.zkclient.ZkClient.issueSync(ZkClient.java:1308)
    2020-09-25T07:47:25.4841174Z at org.apache.helix.zookeeper.zkclient.ZkClient.access$300(ZkClient.java:83)
    2020-09-25T07:47:25.4842916Z at org.apache.helix.zookeeper.zkclient.ZkClient$4.run(ZkClient.java:1334)
    2020-09-25T07:47:25.4844292Z at org.apache.helix.zookeeper.zkclient.ZkEventThread.run(ZkEventThread.java:99)
    2020-09-25T07:47:28.7393028Z true: wait 7ms, org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@53a84ff4
    2020-09-25T07:47:28.7394873Z START TestAutoRebalance testAutoRebalance at Fri Sep 25 07:47:28 UTC 2020
    2020-09-25T07:47:28.8840849Z true: wait 140ms, org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@6127b13c
    2020-09-25T07:52:30.4192196Z false: org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@f2c267a: wait 300057ms to verify
    2020-09-25T07:52:30.4348593Z END TestAutoRebalance testAutoRebalance at Fri Sep 25 07:52:30 UTC 2020, took: 301692ms.
    2020-09-25T07:52:30.4349741Z START TestAutoRebalance testDropResourceAutoRebalance at Fri Sep 25 07:52:30 UTC 2020
    2020-09-25T07:52:31.4914022Z true: wait 28ms, org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@213c36b4
    2020-09-25T07:52:32.5078882Z verifyEmptyCurStateAndExtView: wait 1000ms to verify (true)
    2020-09-25T07:52:32.7238571Z true: wait 203ms, org.apache.helix.integration.rebalancer.TestAutoRebalance$ExternalViewBalancedVerifier@6ec6d6d8
    2020-09-25T07:52:33.7370656Z verifyEmptyCurStateAndExtView: wait 1000ms to verify (true)
    2020-09-25T07:52:33.7396615Z END TestAutoRebalance testDropResourceAutoRebalance at Fri Sep 25 07:52:33 UTC 2020, took: 3306ms.
    2020-09-25T07:52:33.7515529Z AfterClass: TestAutoRebalance of ZkStandAloneCMTestBase called
    2020-09-25T07:52:33.8497807Z END TestAutoRebalance at Fri Sep 25 07:52:33 UTC 2020
    2020-09-25T07:52:33.8501550Z AfterClass:TestAutoRebalance afterclass of ZkTestBase called!
    2020-09-25T07:52:33.8502195Z **** SYSTEM Physical Memory:7292203008
    2020-09-25T07:52:33.8502654Z **** total memory:4090 free memory:2615
    2020-09-25T07:52:33.8503143Z TestAutoRebalance has active threads cnt:868
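
A verifier that stays false for 300 seconds is probably not waiting on a slow rebalance; more likely the cluster has settled in a state the check does not accept. As a rough illustration of the kind of condition a "balanced" check might evaluate, here is a hypothetical sketch. The source of `ExternalViewBalancedVerifier` is not shown in this thread, so `BalanceSketch`, `isBalanced`, and the rule "replica counts per live instance differ by at most one" are all assumptions made for this example, not the test's actual logic.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical approximation of a balance check; not the Helix test's real verifier.
final class BalanceSketch {

    // `assignment` maps each partition name to the instances hosting one of its replicas.
    // Returns true when every replica sits on a live instance and the per-instance
    // replica counts differ by at most one (assumed definition of "balanced").
    static boolean isBalanced(Map<String, List<String>> assignment, List<String> liveInstances) {
        Map<String, Integer> counts = new HashMap<>();
        for (String instance : liveInstances) {
            counts.put(instance, 0);
        }
        for (List<String> replicas : assignment.values()) {
            for (String instance : replicas) {
                if (!counts.containsKey(instance)) {
                    return false; // replica placed on an instance that is not live
                }
                counts.merge(instance, 1, Integer::sum);
            }
        }
        if (counts.isEmpty()) {
            return assignment.isEmpty();
        }
        int min = Collections.min(counts.values());
        int max = Collections.max(counts.values());
        return max - min <= 1;
    }

    public static void main(String[] args) {
        Map<String, List<String>> even = new HashMap<>();
        even.put("TestDB_0", Arrays.asList("node1", "node2"));
        even.put("TestDB_1", Arrays.asList("node2", "node1"));

        Map<String, List<String>> skewed = new HashMap<>();
        skewed.put("TestDB_0", Arrays.asList("node1"));
        skewed.put("TestDB_1", Arrays.asList("node1"));
        skewed.put("TestDB_2", Arrays.asList("node1"));

        List<String> live = Arrays.asList("node1", "node2");
        System.out.println("even=" + isBalanced(even, live) + ", skewed=" + isBalanced(skewed, live));
    }
}
```

If the rebalancer in use converges to a placement like the skewed one above, no timeout will make such a check pass, which would be consistent with the 30-second and 300-second runs failing the same way.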

jiajunwang commented 3 years ago

Closing the unstable-test tickets, since we now have an automatic mechanism (https://github.com/apache/helix/pull/1757) for tracking the most recent test issues.