elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

[CI] FullClusterRestartIT testWatcherWithApiKey failing #84700

Open valeriy42 opened 2 years ago

valeriy42 commented 2 years ago

Seems to be failing regularly with the same error recently.

Build scan: https://gradle-enterprise.elastic.co/s/b3aibsuqwx4ww/tests/:x-pack:qa:full-cluster-restart:v7.2.0%23oldClusterTest/org.elasticsearch.xpack.restart.FullClusterRestartIT/testWatcherWithApiKey

Reproduction line: ./gradlew ':x-pack:qa:full-cluster-restart:v7.2.0#oldClusterTest' -Dtests.class="org.elasticsearch.xpack.restart.FullClusterRestartIT" -Dtests.method="testWatcherWithApiKey" -Dtests.seed=89F33062A7870C11 -Dtests.bwc=true -Dtests.locale=ar-KW -Dtests.timezone=Europe/Brussels -Druntime.java=17

Applicable branches: 8.0

Reproduces locally?: Didn't try

Failure history: https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.xpack.restart.FullClusterRestartIT&tests.test=testWatcherWithApiKey

Failure excerpt:

java.lang.AssertionError: expected:<executed> but was:<null>

  at __randomizedtesting.SeedInfo.seed([89F33062A7870C11:60A8FA1458B41D38]:0)
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:834)
  at org.junit.Assert.assertEquals(Assert.java:118)
  at org.junit.Assert.assertEquals(Assert.java:144)
  at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testWatcherWithApiKey$0(FullClusterRestartIT.java:248)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1069)
  at org.elasticsearch.xpack.restart.FullClusterRestartIT.testWatcherWithApiKey(FullClusterRestartIT.java:245)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:568)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)
  at java.lang.Thread.run(Thread.java:833)

elasticmachine commented 2 years ago

Pinging @elastic/es-data-management (Team:Data Management)

masseyke commented 2 years ago

Judging by the stack trace above (the assertion fails inside assertBusy) and by these log lines, this looks like a timeout:

  1> [2022-03-07T10:22:07,024][INFO ][o.e.x.r.FullClusterRestartIT] [testWatcherWithApiKey] before test
  1> [2022-03-07T10:22:07,027][INFO ][o.e.x.r.FullClusterRestartIT] [testWatcherWithApiKey] initializing REST clients against [http://[::1]:33333, http://127.0.0.1:45153, http://[::1]:45317, http://127.0.0.1:34561]
  1> [2022-03-07T10:22:41,230][INFO ][o.e.x.r.FullClusterRestartIT] [testWatcherWithApiKey] after test

Locally the test takes only 4-5 seconds, whereas the log above shows about 34 seconds between "before test" and "after test". I'm not sure why it is sometimes so much slower on the CI server.
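
For context, here is a minimal sketch of how an assertBusy-style helper turns a condition that never becomes true into the AssertionError seen in the stack trace. It is not the ESTestCase implementation: the class name BusyAssert, the Runnable-based signature, and the backoff values are assumptions chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative polling helper; NOT the real ESTestCase.assertBusy. */
final class BusyAssert {

    /**
     * Re-runs {@code check} until it stops throwing AssertionError or the
     * time budget is spent. On timeout, the AssertionError from the final
     * attempt is rethrown, which is why a timeout surfaces as an ordinary
     * assertion failure such as "expected:<executed> but was:<null>".
     */
    static void assertBusy(Runnable check, long maxWait, TimeUnit unit) {
        long deadlineNanos = System.nanoTime() + unit.toNanos(maxWait);
        long sleepMillis = 1; // assumed starting backoff
        while (true) {
            try {
                check.run();
                return; // the condition finally holds
            } catch (AssertionError e) {
                long remainingNanos = deadlineNanos - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw e; // budget spent: surface the last failure
                }
                long remainingMillis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
                try {
                    Thread.sleep(Math.min(sleepMillis, remainingMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new AssertionError("interrupted while waiting", ie);
                }
                sleepMillis = Math.min(sleepMillis * 2, 1000); // exponential backoff, capped at 1s (assumed)
            }
        }
    }
}
```

With a helper of this shape, a watch that never reaches the expected state keeps failing the inner assertion until the budget runs out, so the reported error is the last inner failure rather than an explicit timeout message.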

pugnascotia commented 2 years ago

Another failure: https://gradle-enterprise.elastic.co/s/fa5zfc5on4v5w

n1v0lg commented 2 years ago

Another one https://gradle-enterprise.elastic.co/s/u7flo66by74b2

luigidellaquila commented 2 years ago

Still failing: https://gradle-enterprise.elastic.co/s/mfofju4qvgkqg

idegtiarenko commented 2 years ago

One more failure: https://gradle-enterprise.elastic.co/s/zj3onfpgzyjyi

idegtiarenko commented 2 years ago

And one more: https://gradle-enterprise.elastic.co/s/t4f5rxsye47qy

pxsalehi commented 1 year ago

A similar failure, but against v7.3.2:

./gradlew ':x-pack:qa:full-cluster-restart:v7.3.2#oldClusterTest' -Dtests.class="org.elasticsearch.xpack.restart.FullClusterRestartIT" -Dtests.method="testWatcherWithApiKey" -Dtests.seed=D916BB57EB898DDF -Dtests.bwc=true -Dtests.locale=es-PE -Dtests.timezone=Asia/Vientiane -Druntime.java=17

https://gradle-enterprise.elastic.co/s/prx5cfgu3pv2e/tests/:x-pack:qa:full-cluster-restart:v7.3.2%23oldClusterTest/org.elasticsearch.xpack.restart.FullClusterRestartIT/testWatcherWithApiKey

valeriy42 commented 1 year ago

The test is still failing:

kingherc commented 1 year ago

Still failing. https://gradle-enterprise.elastic.co/s/lfiq7c4dxtzkw/tests/:x-pack:qa:full-cluster-restart:v6.8.9%23bwcTest/org.elasticsearch.xpack.restart.FullClusterRestartIT/testWatcherWithApiKey%20%7Bcluster=OLD%7D?top-execution=1

Will mute.

kingherc commented 1 year ago

Another one at https://gradle-enterprise.elastic.co/s/xqrve2db4nxiq/console-log?task=:x-pack:qa:full-cluster-restart:v7.17.10%23bwcTest :

:x-pack:qa:full-cluster-restart:v7.17.10#bwcTest FAILED 
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8 -Dsun.jnu.encoding=UTF8   
org.elasticsearch.xpack.restart.FullClusterRestartIT > testWatcherWithApiKey {cluster=UPGRADED} FAILED  
    java.lang.AssertionError: version increased: [true], executed: [false]  
    Expected: is <true> 
         but: was <false>   
        at __randomizedtesting.SeedInfo.seed([D08E51FCB5D10E4C:39D59B8A4AE21F65]:0) 
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) 
        at org.junit.Assert.assertThat(Assert.java:956) 
        at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testWatcherWithApiKey$1(FullClusterRestartIT.java:279)   
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1112)   
        at org.elasticsearch.xpack.restart.FullClusterRestartIT.testWatcherWithApiKey(FullClusterRestartIT.java:270)    
Tests with failures:    
 - org.elasticsearch.xpack.restart.FullClusterRestartIT.testWatcherWithApiKey {cluster=UPGRADED}    
78 tests completed, 1 failed, 14 skipped

davidkyle commented 1 year ago

Slightly different error from the one in the issue description:

java.lang.AssertionError: version increased: [true], executed: [false] 
Expected: is <true> 
but: was <false>

at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testWatcherWithApiKey$1(FullClusterRestartIT.java:279)
at org.elasticsearch.xpack.restart.FullClusterRestartIT.testWatcherWithApiKey(FullClusterRestartIT.java:270)

https://gradle-enterprise.elastic.co/s/efiml4ickhj3c/tests/:x-pack:qa:full-cluster-restart:v7.17.10%23bwcTest/org.elasticsearch.xpack.restart.FullClusterRestartIT/testWatcherWithApiKey%20%7Bcluster=UPGRADED%7D?top-execution=1

masseyke commented 6 months ago

Maybe we're running into https://github.com/elastic/elasticsearch/issues/69842? I can't find any evidence of it, but it does seem that watcher is just not running for some reason.
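
One way to check the "watcher is just not running" theory would be to look at the per-node watcher_state values returned by GET _watcher/stats while the test waits. The sketch below only covers the parsing side and makes several assumptions: the class and method names are hypothetical, the regex stands in for a real JSON parser, and fetching the response body is left to whatever REST client the test already uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for inspecting a _watcher/stats response body. */
final class WatcherStateCheck {

    // Assumes each node entry carries a "watcher_state" string field.
    private static final Pattern STATE =
        Pattern.compile("\"watcher_state\"\\s*:\\s*\"([^\"]+)\"");

    /** Returns every watcher_state value found in the response, in order. */
    static List<String> watcherStates(String statsJson) {
        List<String> states = new ArrayList<>();
        Matcher m = STATE.matcher(statsJson);
        while (m.find()) {
            states.add(m.group(1));
        }
        return states;
    }

    /** True only if at least one node reported a state and all of them are "started". */
    static boolean allStarted(String statsJson) {
        List<String> states = watcherStates(statsJson);
        return !states.isEmpty() && states.stream().allMatch("started"::equals);
    }
}
```

Logging the result of watcherStates(...) from inside the busy loop would show whether any node reports something other than "started" at the moment the assertion fails.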