Closed talevy closed 4 years ago
Pinging @elastic/es-core-infra
Occurred again https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=centos/58/console
REPRODUCE WITH: ./gradlew :x-pack:qa:smoke-test-watcher-with-security:integTestRunner \
-Dtests.seed=484163AFA937AD5E \
-Dtests.class=org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT \
-Dtests.method="testSearchTransformInsufficientPermissions" \
-Dtests.security.manager=true \
-Dtests.locale=it \
-Dtests.timezone=Antarctica/Mawson \
-Dcompiler.java=11 \
-Druntime.java=8
I haven't reproduced this exact issue... but stepping through the code, there appears to be a bug in the start/stop logic in SmokeTestWatcherWithSecurityIT
such that it does not busy wait as (I believe) is the intent of the code.
Starting Watcher here is wrapped in an assertBusy()
which appears to attempt to wait for 10 seconds max until the Watcher is indeed started. However, the break
here allows not only to break the switch, but also effectively stops the busy wait logic since it exits the lamda normally. assertBusy only busy waits while an assertionError
is getting thrown in the (lamda) code block.
Stop Watcher has the same bug here (note the assertion in both is asserting the success/failure of the http call, not the state of Watcher)... and it looks like SmokeTestWatcherWithSecurityClientYamlTestSuiteIT
may have the same issue (but haven't stepped trough that yet).
This means that Watcher is not guaranteed to be started or stopped between tests since it is not properly blocking till started or stopped, and is likely the root cause of many of these Watcher failures.
I believe the fix is to simply change the break
to a AssertionError
for the state change switches.
Also, this specific test is peculiar since it can pass without Watcher ever started.
@jakelandis I got something that looks quite similar to this today on 6.5, not sure if this should have been fixed you your PR
./gradlew :x-pack:qa:smoke-test-watcher-with-security:integTestRunner \
-Dtests.seed=2EA37470CE126E3E \
-Dtests.class=org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT \
-Dtests.method="testSearchTransformInsufficientPermissions" \
-Dtests.security.manager=true \
-Dtests.locale=es-UY \
-Dtests.timezone=ART \
-Dcompiler.java=11 \
-Druntime.java=8
java.lang.AssertionError:
Expected: is a value equal to or greater than <1>
but: <0> was less than <1>
at __randomizedtesting.SeedInfo.seed([2EA37470CE126E3E:B3D6708EEF1969D3]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.lambda$getWatchHistoryEntry$3(SmokeTestWatcherWithSecurityIT.java:328)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:848)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:822)
at org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.getWatchHistoryEntry(SmokeTestWatcherWithSecurityIT.java:306)
at org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.getWatchHistoryEntry(SmokeTestWatcherWithSecurityIT.java:301)
at org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.testSearchTransformInsufficientPermissions(SmokeTestWatcherWithSecurityIT.java:237)
Un-muted this test on PR #42409 to obtain additional logs.
If (when?) this test fails again please obtain the following information before muting the test:
Closing in favour of #30777 as this test fails in a similar way&_a=(columns:!(_source),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:e58bf320-7efd-11e8-bf69-63c8ef516157,key:test,negate:!f,params:(query:testSearchTransformInsufficientPermissions,type:phrase),type:phrase,value:testSearchTransformInsufficientPermissions),query:(match:(test:(query:testSearchTransformInsufficientPermissions,type:phrase))))),index:e58bf320-7efd-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:'class:*SmokeTestWatcherWithSecurityIT'),sort:!(time,desc))) as the mentioned issue.
PR #50931 will add more logging in case this fails again.
stacktrace