Closed DaveCTurner closed 5 years ago
Pinging @elastic/es-core-infra
This happened in a different test today: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/6494/consoleFull
14:45:41 2> REPRODUCE WITH: ./gradlew :x-pack:plugin:rollup:internalClusterTest -Dtests.seed=9C130591FD311456 -Dtests.class=org.elasticsearch.xpack.rollup.RollupIT -Dtests.method="testTwoJobsStartStopDeleteOne" -Dtests.security.manager=true -Dtests.locale=el-CY -Dtests.timezone=Brazil/Acre
14:45:41 FAILURE 1.76s | RollupIT.testTwoJobsStartStopDeleteOne <<< FAILURES!
14:45:41 > Throwable #1: java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_sd3
14:45:41 > Expected: <0L>
14:45:41 > but: was <4775L>
14:45:41 > at __randomizedtesting.SeedInfo.seed([9C130591FD311456:BA93A46A17310921]:0)
14:45:41 > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
14:45:41 > at org.elasticsearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2016)
14:45:41 > at org.elasticsearch.test.TestCluster.assertAfterTest(TestCluster.java:93)
14:45:41 > at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2047)
14:45:41 > at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:590)
14:45:41 > at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2163)
Another one on that rollup test today:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/6596/console
./gradlew :x-pack:plugin:rollup:internalClusterTest \
-Dtests.seed=2CD38EA34BC1EAAC \
-Dtests.class=org.elasticsearch.xpack.rollup.RollupIT \
-Dtests.method="testTwoJobsStartStopDeleteOne" \
-Dtests.security.manager=true \
-Dtests.locale=sl-SI \
-Dtests.timezone=America/Miquelon
When I get a chance I'll see if there's anything in common between the watcher test and this rollup one.
And again on the rollup test:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+intake/2014/console
./gradlew :x-pack:plugin:rollup:internalClusterTest \
-Dtests.seed=1D037D5E37846F9 \
-Dtests.class=org.elasticsearch.xpack.rollup.RollupIT \
-Dtests.method="testTwoJobsStartStopDeleteOne" \
-Dtests.security.manager=true \
-Dtests.locale=es-CL \
-Dtests.timezone=Pacific/Truk
The assertion is this:
16:24:57 FAILURE 7.00s | RollupIT.testTwoJobsStartStopDeleteOne <<< FAILURES!
16:24:57 > Throwable #1: java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_s0
16:24:57 > Expected: <0L>
16:24:57 > but: was <7042L>
16:24:57 > at __randomizedtesting.SeedInfo.seed([1D037D5E37846F9:2750962E09785B8E]:0)
16:24:57 > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
16:24:57 > at org.elasticsearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2023)
16:24:57 > at org.elasticsearch.test.TestCluster.assertAfterTest(TestCluster.java:93)
16:24:57 > at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2054)
16:24:57 > at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:586)
16:24:57 > at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2153)
16:24:57 > at java.lang.Thread.run(Thread.java:748)
I looked into this a bit... no idea what's going on. The Rollup test isn't anything special (edit: and doesn not index any docs, like the earlier auditing failures were doing), and that watcher test is literally just:
public class PreventFailingBuildIT extends ESIntegTestCase {
public void testSoThatTestsDoNotFail() {
// Noop
// This is required because if tests are not enable no
// tests will be run in the entire project and all tests will fail.
}
}
I'm likely going to delete all of RollupIT for unrelated reasons (see https://github.com/elastic/elasticsearch/issues/31258#issuecomment-397440302) but we can leave it enabled for now to debug this problem if desired, since it seems to be failing on that one Rollup test in particular. I can't reproduce locally though.
same again
00:51:11 FAILURE 6.42s | RollupIT.testTwoJobsStartStopDeleteOne <<< FAILURES!
00:51:11 > Throwable #1: java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_s0
00:51:11 > Expected: <0L>
00:51:11 > but: was <11421L>
00:51:11 > at __randomizedtesting.SeedInfo.seed([698DCF0963902073:4F0D6EF289903D04]:0)
00:51:11 > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
00:51:11 > at org.elasticsearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2020)
00:51:11 > at org.elasticsearch.test.TestCluster.assertAfterTest(TestCluster.java:93)
00:51:11 > at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2051)
00:51:11 > at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:586)
00:51:11 > at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2159)
00:51:11 > at java.lang.Thread.run(Thread.java:748)
16:10:08 FAILURE 4.58s J1 | ShrinkIndexIT.testShrinkIndexPrimaryTerm <<< FAILURES!
16:10:08 > Throwable #1: java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_s2
16:10:08 > Expected: <0L>
16:10:08 > but: was <1322L>
16:10:08 > at __randomizedtesting.SeedInfo.seed([2A7F48E21D6F8FBF:F37D7B03379AF2]:0)
16:10:08 > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
16:10:08 > at org.elasticsearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2020)
16:10:08 > at org.elasticsearch.test.TestCluster.assertAfterTest(TestCluster.java:93)
16:10:08 > at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2051)
16:10:08 > at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:592)
16:10:08 > at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2185)
16:10:08 > at jdk.internal.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
16:10:08 > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
16:10:08 > at java.base/java.lang.reflect.Method.invoke(Method.java:564)
16:10:08 > at java.base/java.lang.Thread.run(Thread.java:844)
Failed again a few times over the weekend https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=fedora/1168/console This time in different ShrinkIndexIT tests.
java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_sd2
Expected: <0L>
but: was <1254L>
at __randomizedtesting.SeedInfo.seed([814E471A446BC55F:ABC272835A33D012]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.elasticsearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2013)
at org.elasticsearch.test.TestCluster.assertAfterTest(TestCluster.java:93)
at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2044)
at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:592)
at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2179)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:965)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)
Another ShrinkIndexIT
failure: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+intake/2165 . I haven't muted the tests as this seems like a more general problem that can't be addressed by muting all the possibly affected tests.
This one is also in ShrinkIndexIT
: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-windows-compatibility/1791/console
Piling another one here: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA=java11,ES_RUNTIME_JAVA=java8,nodes=virtual&&linux/223/consoleFull
for the BasicWatcherTests.testConditionSearchWithIndexedTemplate
test.
Relevant log lines:
05:52:24 1> [2018-08-07T13:52:22,004][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] after test
05:52:24 FAILURE 1.10s J1 | BasicWatcherTests.testConditionSearchWithIndexedTemplate <<< FAILURES!
05:52:24 > Throwable #1: java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_s0
05:52:24 > Expected: <0L>
05:52:24 > but: was <740L>
05:52:24 > at __randomizedtesting.SeedInfo.seed([831554F4DAAB5B36:27A252D4EB6139B8]:0)
05:52:24 > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
05:52:24 > at org.elasticsearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2012)
05:52:24 > at org.elasticsearch.test.TestCluster.assertAfterTest(TestCluster.java:93)
05:52:24 > at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2043)
05:52:24 > at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:593)
05:52:24 > at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2186)
05:52:24 > at java.lang.Thread.run(Thread.java:748)
and this happened earlier (might be misleading, though):
05:52:24 1> org.elasticsearch.env.ShardLockObtainFailedException: [.watches][0]: obtaining shard lock timed out after 0ms
05:52:24 1> at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:671) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:590) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.env.NodeEnvironment.lockAllForIndex(NodeEnvironment.java:536) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.env.NodeEnvironment.deleteIndexDirectorySafe(NodeEnvironment.java:487) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.indices.IndicesService.deleteIndexStoreIfDeletionAllowed(IndicesService.java:722) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.indices.IndicesService.deleteIndexStore(IndicesService.java:709) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.indices.IndicesService.removeIndex(IndicesService.java:615) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.indices.cluster.IndicesClusterStateService.deleteIndices(IndicesClusterStateService.java:284) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:218) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:492) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
05:52:24 1> at java.lang.Iterable.forEach(Iterable.java:75) [?:1.8.0_181]
05:52:24 1> at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:489) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
@dakrone Would you look at these build failures?
Yes I'll look into these today
I muted the assertion while I diagnose this, so that it doesn't occur any more for now.
@dakrone did you find a solution for this?
@s1monw sorry I missed your message, I have not found a solution for this yet
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/362/consoleText contains another case of this:
> Throwable #1: java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_s1, are there still Lucene indices around?
> Expected: <0L>
> but: was <2508L>
> at __randomizedtesting.SeedInfo.seed([4AA7930E8DA3CCA6:602BA69793FBD9EB]:0)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.elasticsearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2458)
> at org.elasticsearch.test.TestCluster.assertAfterTest(TestCluster.java:94)
> at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2493)
> at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:593)
> at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2191)
> at java.lang.Thread.run(Thread.java:748)
I've opened a couple of PRs for helping to track this down, #38807 and #38903, I would appreciate anyone taking a look at them.
Is it the case that "Accounting breaker not reset" generally means "An index existed when we expected no indices to exist"?
.ml-annotations
index was only introduced in 6.6 and this issue pre-dates that. But looking at the tests that did cause early failures of this assertion, they're all in areas of the product that might create indices or change the lifetime of indices.If "Accounting breaker not reset" generally means "An index existed when we expected no indices to exist" then I think more people will be able to contribute to fixing this, as it probably just means that some of the test suites for parts of the product that create indices need more advanced post test cleanup code.
Is it the case that "Accounting breaker not reset" generally means "An index existed when we expected no indices to exist"?
Memory is given back to the circuit breaker via a segment core (the immutable part of a segment, ie. everything but deletes and dv updates) closed listener. So the meaning is rather that an index was still open.
Our logic regarding closing indices is a bit subtle due to ref-counting: if there is an ongoing request while a reader is closed, then the close() call only decrements a counter and the reader is effectively closed when the ongoing search request releases the reader. Only then do close listeners run.
I wonder that this may cause race conditions if we check the memory used by the accounting breaker while the reader has been closed but close listeners have not run yet. This would explain why we are seeing this problem more in tests that have long asynchronous tasks that increment the ref count like shrinking, rollups and Watcher.
I have tried to think about how we could somehow wait for all listeners to have run, but this looks tricky. Maybe we should look at whether we still get failures if we put this assertion under an assertBusy?
Maybe we should look at whether we still get failures if we put this assertion under an assertBusy?
I opened a PR to do that in #39211
Build stats&_a=(columns:!(_source),filters:!(),index:e58bf320-7efd-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:'stacktrace:%22Accounting%20breaker%20not%20reset%20to%200%22'),sort:!(time,desc))) tracking
another case, https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.x+multijob-unix-compatibility/os=sles-12/59/console
REPRODUCE WITH: ./gradlew :server:integTest \
-Dtests.seed=382E6963024BB1D4 \
-Dtests.class=org.elasticsearch.action.admin.indices.create.ShrinkIndexIT \
-Dtests.method="testShrinkIndexPrimaryTerm" \
-Dtests.security.manager=true \
-Dtests.locale=be-BY \
-Dtests.timezone=SystemV/AST4ADT \
-Dcompiler.java=11 \
-Druntime.java=8
SplitIndexIT
failed with this error.
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.x+internalClusterTest/2354/console
The assertBusy did not help:
java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_s0, are there still Lucene indices around?
...
Expected: <0L>
but: was <1326L>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.elasticsearch.test.InternalTestCluster.lambda$ensureEstimatedStats$31(InternalTestCluster.java:2419)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:850)
... 40 more
Suppressed: java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_s0, are there still Lucene indices around?
Expected: <0L>
but: was <1326L>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.elasticsearch.test.InternalTestCluster.lambda$ensureEstimatedStats$31(InternalTestCluster.java:2419)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:850)
... 40 more
Here's the reproduce line
./gradlew :server:integTest \
-Dtests.seed=F93D4074E6F9AF6F \
-Dtests.class=org.elasticsearch.action.admin.indices.create.SplitIndexIT \
-Dtests.method="testSplitIndexPrimaryTerm" \
-Dtests.security.manager=true \
-Dtests.locale=es-CR \
-Dtests.timezone=America/Argentina/San_Juan \
-Dcompiler.java=11 \
-Druntime.java=8
Two more SplitIndexIT
failures:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+internalClusterTest/2803/console
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.x+matrix-java-periodic/ES_BUILD_JAVA=openjdk12,ES_RUNTIME_JAVA=java8fips,nodes=immutable&&linux&&docker/82/console
Some more failures in https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+matrix-java-periodic/ES_BUILD_JAVA=openjdk12,ES_RUNTIME_JAVA=zulu11,nodes=immutable&&linux&&docker/127/console, this time in ShrinkIndexIT
:
2> java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_sd4, are there still Lucene indices around?
Expected: <0L>
but: was <1394L>
one more in https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+matrix-java-periodic/ES_BUILD_JAVA=openjdk12,ES_RUNTIME_JAVA=java12,nodes=immutable&&linux&&docker/134/consoleFull for SplitIndexIT
org.elasticsearch.action.admin.indices.create.SplitIndexIT > testSplitIndexPrimaryTerm FAILED
java.lang.AssertionError: Accounting breaker not reset to 0 on node: node_sd3, are there still Lucene indices around?
Expected: <0L>
but: was <1256L>
REPRODUCE WITH: ./gradlew :server:integTest --tests "org.elasticsearch.action.admin.indices.create.SplitIndexIT.testSplitIndexPrimaryTerm" -Dtests.seed=BCF154492FFC0AA9 -Dtests.security.manager=true -Dtests.locale=sq-MK -Dtests.timezone=America/Toronto -Dcompiler.java=12 -Druntime.java=12
I've opened a couple of PRs for helping to track this down, #38807 and #38903, I would appreciate anyone taking a look at them.
@dakrone I can take this one.
Pinging @elastic/es-distributed
Another instance with SplitIndexIT: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.1+matrix-java-periodic/ES_BUILD_JAVA=openjdk12,ES_RUNTIME_JAVA=java8,nodes=immutable&&linux&&docker/32/console
./gradlew :server:integTest --tests "org.elasticsearch.action.admin.indices.create.SplitIndexIT.testSplitIndexPrimaryTerm" \
-Dtests.seed=AED85B1C36454065 \
-Dtests.security.manager=true \
-Dtests.locale=es-US \
-Dtests.timezone=Etc/GMT+12 \
-Dcompiler.java=12 \
-Druntime.java=8
I am on it.
This is a Lucene issue. I'll open the fix soon.
Talked to @DaveCTurner on another channel, we prefer to leave this issue open until we have a new Lucene snapshot.
This issue was fixed on 7.4+ as LUCENE-8809 was integrated on master and 7.x. Unfortunately, we are unlikely to have a new minor Lucene release for 7.2 and 7.3. We can upgrade to Lucene-7.7.2 on the 6.8 branch once the 7.3 branch is no longer used for releases.
I am closing this issue and leaving the accounting circuit breaker disabled on 6.8, 7.2 and 7.3.
CI output: https://internal-ci.elastic.co/job/elastic+x-pack-elasticsearch+6.2+multijob-windows-compatibility/682/console
Stack trace:
I see nothing else suspicious in the test output or in the logs on the worker. Sorry, not a lot to go on here.