Closed jillesvangurp closed 10 years ago
thanks for reporting @jillesvangurp - we'll take a look
@jillesvangurp I think I found a small issue where there is a small window of time that a field is unset, but the warmer that needs is running. In order to confirm this can you share how you start the embedded node and run your test? (for example you wait for green status before running?) Sharing code snippets how the node is brought up before the tests run would even be more helpful.
Sure no problem:
String defaultIndexDirectory = "target/data-"+UUID.randomUUID().toString();
String indexDir = config.getString("estestserver.indexdir",defaultIndexDirectory);
String logDir = config.getString("estestserver.logdir",defaultIndexDirectory + "/logs");
String esPort = config.getString("estestserver.port","9299");
File file = new File(defaultIndexDirectory);
file.mkdirs();
LOG.info("using " + file.getAbsolutePath() + " for es data and logs");
Settings settings = ImmutableSettings.settingsBuilder()
.put("name", "test-node-"+RandomStringUtils.randomAlphabetic(5))
.put("cluster.name", "linko-dev-cluster-"+RandomStringUtils.randomAlphabetic(5))
.put("index.gateway.type", "none")
.put("gateway.type", "none")
.put("discovery.zen.ping.multicast.ping.enabled", "false")
.put("discovery.zen.ping.multicast.enabled", "false")
.put("path.data", indexDir)
.put("path.logs", logDir)
.put("foreground", "true")
.put("http.port", esPort)
.put("http.cors.enabled", "true")
.put("http.cors.allow-origin","/https?:\\/\\/(localhost|kibana.*\\.linko\\.io)(:[0-9]+)?/")
.build();
LOG.info(settings.toDelimitedString(';'));
NodeBuilder nodeBuilder = NodeBuilder.nodeBuilder()
.settings(settings)
.loadConfigSettings(false);
node = nodeBuilder
.build();
// register a shutdown hook
Runtime.getRuntime().addShutdownHook(new Thread() {
@Override
public void run() {
node.close();
}
});
node.start();
// wait until the shards are ready
node.client().admin().cluster().prepareHealth().setWaitForGreenStatus().execute().actionGet();
One additional bit of information that I just realized may be relevant here is that we have a parent child relation between user and usercontact. So, the exception is happening when it is doing something with the child type. For reference, here's a gist with the full mapping for the index: https://gist.github.com/jillesvangurp/d0cd29573b876f9cc4d3
Also, we're using testNg and surefire. The elasticsearch node is started in a @BeforeSuite. We use a very large threadcount and randomized order to surface any issues related to inter test dependencies and stability of our system. This pretty much means all our integration test classes are starting at the same time. We generally use randomized test data and there are a lot of calls to /_refresh to ensure indices are committed in each tests.
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<parallel>classes</parallel>
<threadCount>50</threadCount>
<runOrder>random</runOrder>
<argLine>-Xms1024m -Xmx2048m</argLine>
<properties>
<property>
<name>listener</name>
<value>io.linko.ng.testutil.TestProgressLogger</value>
</property>
</properties>
</configuration>
</plugin>
Thanks for the provided information @jillesvangurp
I opened: #8168 for this issue. Are you able to verify if the non fatal NPE doesn't occur any more with this fix in your test infrastructure?
No problem. Unfortunately, most of our builds don't trigger this exception; so it is a bit hard for me to confirm. I've only spotted it twice out of dozens of test runs over the past week. If you issue another beta, I'll be able to depend on that at least.
I've seen the stacktrace below in our integration tests a couple of time. We're starting elasticsearch as an embedded node. The error appears to be non fatal since our integration tests pass anyway. I've seen this stacktrace twice in the past week but can't reproduce it reliably.
We are running our maven tests concurrently and in randomized order, so there are a lot of integration tests hitting our elasticsearch node all at once right after it starts and reports a green status.
Using elasticsearch 1.4.0 Beta1
17-10-2014T15:56:40+0200 W warmer - [test-node-gstJI] [inbot_users_v27][2] failed to load random access for [_type:usercontact] org.elasticsearch.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException at org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2203) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.common.cache.LocalCache.get(LocalCache.java:3937) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache.getAndLoadIfNotPresent(FixedBitSetFilterCache.java:132) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache.access$100(FixedBitSetFilterCache.java:75) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache$FixedBitSetFilterWarmer$1.run(FixedBitSetFilterCache.java:284) ~[elasticsearch-1.4.0.Beta1.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0] at java.lang.Thread.run(Thread.java:744) [na:1.8.0] Caused by: java.lang.NullPointerException: null at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache$2.call(FixedBitSetFilterCache.java:157) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache$2.call(FixedBitSetFilterCache.java:132) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) ~[elasticsearch-1.4.0.Beta1.jar:na] at org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2197) ~[elasticsearch-1.4.0.Beta1.jar:na] ... 8 common frames omitted