elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.59k stars 24.63k forks source link

AssertionError inside ElasticsearchDirectoryReader.addReaderCloseListener #41795

Closed costin closed 5 years ago

costin commented 5 years ago

Elasticsearch version (bin/elasticsearch --version):

Occurs in master while running an integration test

Plugins installed: []

None

JVM version (java -version):

JVM: 12.0.1 (Oracle Corporation 12.0.1+12)

OS version (uname -a if on a Unix-like system): OS: Windows 10 10.0 amd64

Description of the problem including expected versus actual behavior:

While running integration test for querying frozen / throttled indices, the assertion error occurs causing the JVM to simply bail-out. Interestingly this seems to affect only queries that perform aggregations, normal queries or index discoveries work fine.

The exception thrown is:

[2019-05-03T16:36:21,897][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-0] fatal error in thread [elasticsearch[node-0][search_throttled][T#1]], exiting
java.lang.AssertionError: null
    at org.elasticsearch.common.lucene.index.ElasticsearchDirectoryReader.addReaderCloseListener(ElasticsearchDirectoryReader.java:106) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:134) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1355) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1289) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:347) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:393) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:358) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:354) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.search.SearchService$4.doRun(SearchService.java:1069) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:43) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:757) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:835) [?:?]
=========================================
:x-pack:plugin:sql:qa:single-node:integTestRunner (Thread[Daemon worker,5,main]) completed. Took 23.823 secs.
:x-pack:plugin:sql:qa:single-node:integTestCluster#stop (Thread[Daemon worker,5,main]) started.

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including (e.g.) index creation, mappings, settings, query etc. The easier you make for us to reproduce it, the more likely that somebody will take the time to look at it.

I can reproduce the failure every time when running from the command line, inside Kibana or the IDE (through JDBC) somehow the query works. The queries that fail looks as such:

{
  "size" : 0,
  "timeout" : "90000ms",
  "_source" : false,
  "stored_fields" : "_none_",
  "aggregations" : {
    "groupby" : {
      "composite" : {
        "size" : 83,
        "sources" : [
          {
            "16" : {
              "terms" : {
                "field" : "gender",
                "missing_bucket" : true,
                "order" : "asc"
              }
            }
          }
        ],
        "after" : {
          "16" : "M"
        }
      }
    }
  }
} on [frozen_emp]
  1. Get SQL frozen PR branch from https://github.com/costin/elasticsearch/commits/sql/frozen-indices Current commit is 7bddf5c7d22265cd8c9b1cdd058d688fd0ac3652
  2. Run the frozen test: gradlew --parallel :x-pack:plugin:sql:qa:single-node:integTestRunner --tests "org.elasticsearch.xpack.sql.qa.single_node.JdbcFrozenCsvSpecIT" -Dcompiler.java=12 -Druntime.java=12
  3. the failure should appear shortly:
    
    > Task :x-pack:plugin:sql:qa:single-node:integTestRunner

org.elasticsearch.xpack.sql.qa.single_node.JdbcFrozenCsvSpecIT > test {frozen.testCountFromFrozen} FAILED org.elasticsearch.xpack.sql.client.ClientException: Cannot POST address http://127.0.0.1:52249/_sql?error_trace (Connection refused: connect)

    Caused by:
    java.net.ConnectException: Connection refused: connect

java.net.ConnectException: Timeout connecting to [/127.0.0.1:52249]

    Caused by:
    java.net.ConnectException: Timeout connecting to [/127.0.0.1:52249]

java.net.ConnectException: Connection refused: no further information

    Caused by:
    java.net.ConnectException: Connection refused: no further information

REPRODUCE WITH: gradlew :x-pack:plugin:sql:qa:single-node:integTestRunner --tests "org.elasticsearch.xpack.sql.qa.single_node.JdbcFrozenCsvSpecIT.test {frozen.testCountFromFrozen}" -Dtests.seed=2D9C665AAF8EA8DF -Dtests.security.manager=true -Dtests.locale=nl-BE -Dtests.timezone=Asia/Macao -Dcompiler.java=12 -Druntime.java=12

org.elasticsearch.xpack.sql.qa.single_node.JdbcFrozenCsvSpecIT > classMethod FAILED java.net.ConnectException: Timeout connecting to [/127.0.0.1:52249]


**Provide logs (if relevant)**:

[2019-05-03T17:36:00,052][INFO ][o.e.e.NodeEnvironment ] [node-0] using [1] data paths, mounts [[WORK (E:)]], net usable_space [376.9gb], net total_space [476.9gb], types [NTFS] [2019-05-03T17:36:00,055][INFO ][o.e.e.NodeEnvironment ] [node-0] heap size [494.9mb], compressed ordinary object pointers [true] [2019-05-03T17:36:00,059][INFO ][o.e.n.Node ] [node-0] node name [node-0], node ID [NOLXSfG_R8aIjuFl1cnYzg] [2019-05-03T17:36:00,059][INFO ][o.e.n.Node ] [node-0] version[8.0.0-SNAPSHOT], pid[1376], build[default/zip/41bdc7a/2019-05-03T14:19:45.496945400Z], OS[Windows 10/10.0/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12.0.1/12.0.1+12] [2019-05-03T17:36:00,060][INFO ][o.e.n.Node ] [node-0] JVM home [E:\work\es\elasticsearch\x-pack\plugin\sql\qa\single-node\build\cluster\integTestCluster node0\elasticsearch-8.0.0-SNAPSHOT\jdk] [2019-05-03T17:36:00,060][INFO ][o.e.n.Node ] [node-0] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=C:\Users\costin\AppData\Local\Temp\elasticsearch, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -Dio.netty.allocator.type=unpooled, -Xms512m, -Xmx512m, -ea, -esa, -Delasticsearch, -Des.path.home=E:\work\es\elasticsearch\x-pack\plugin\sql\qa\single-node\build\cluster\integTestCluster node0\elasticsearch-8.0.0-SNAPSHOT, -Des.path.conf=E:\work\es\elasticsearch\x-pack\plugin\sql\qa\single-node\build\cluster\integTestCluster node0\elasticsearch-8.0.0-SNAPSHOT\config, -Des.distribution.flavor=default, -Des.distribution.type=zip, -Des.bundled_jd=true] [2019-05-03T17:36:00,061][WARN ][o.e.n.Node ] [node-0] version [8.0.0-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production [2019-05-03T17:36:01,600][INFO ][o.e.p.PluginsService ] [node-0] loaded module [aggs-matrix-stats] [2019-05-03T17:36:01,600][INFO ][o.e.p.PluginsService ] [node-0] loaded module [analysis-common] [2019-05-03T17:36:01,601][INFO ][o.e.p.PluginsService ] [node-0] loaded module [data-frame] [2019-05-03T17:36:01,601][INFO ][o.e.p.PluginsService ] [node-0] loaded module [ingest-common] [2019-05-03T17:36:01,601][INFO ][o.e.p.PluginsService ] [node-0] loaded module [ingest-geoip] [2019-05-03T17:36:01,612][INFO ][o.e.p.PluginsService ] [node-0] loaded module [ingest-user-agent] [2019-05-03T17:36:01,612][INFO ][o.e.p.PluginsService ] [node-0] loaded module [lang-expression] [2019-05-03T17:36:01,613][INFO ][o.e.p.PluginsService ] [node-0] loaded module [lang-mustache] [2019-05-03T17:36:01,613][INFO ][o.e.p.PluginsService ] [node-0] loaded module [lang-painless] [2019-05-03T17:36:01,613][INFO ][o.e.p.PluginsService ] [node-0] loaded module [mapper-extras] [2019-05-03T17:36:01,613][INFO ][o.e.p.PluginsService ] [node-0] loaded module [parent-join] [2019-05-03T17:36:01,613][INFO ][o.e.p.PluginsService ] [node-0] loaded module [percolator] [2019-05-03T17:36:01,613][INFO ][o.e.p.PluginsService ] [node-0] loaded module [rank-eval] [2019-05-03T17:36:01,613][INFO ][o.e.p.PluginsService ] [node-0] loaded module [reindex] [2019-05-03T17:36:01,614][INFO ][o.e.p.PluginsService ] [node-0] loaded module [repository-url] [2019-05-03T17:36:01,614][INFO ][o.e.p.PluginsService ] [node-0] loaded module [transport-netty4] [2019-05-03T17:36:01,614][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-ccr] [2019-05-03T17:36:01,614][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-core] [2019-05-03T17:36:01,614][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-deprecation] [2019-05-03T17:36:01,614][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-graph] [2019-05-03T17:36:01,615][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-ilm] [2019-05-03T17:36:01,615][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-logstash] [2019-05-03T17:36:01,615][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-ml] [2019-05-03T17:36:01,615][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-monitoring] [2019-05-03T17:36:01,615][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-rollup] [2019-05-03T17:36:01,615][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-security] [2019-05-03T17:36:01,615][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-sql] [2019-05-03T17:36:01,616][INFO ][o.e.p.PluginsService ] [node-0] loaded module [x-pack-watcher] [2019-05-03T17:36:01,616][INFO ][o.e.p.PluginsService ] [node-0] no plugins loaded [2019-05-03T17:36:04,177][INFO ][o.e.d.DiscoveryModule ] [node-0] using discovery type [zen] and seed hosts providers [settings, file] [2019-05-03T17:36:04,578][INFO ][o.e.n.Node ] [node-0] initialized [2019-05-03T17:36:04,578][INFO ][o.e.n.Node ] [node-0] starting ...

[2019-05-03T17:36:04,819][INFO ][o.e.c.c.Coordinator ] [node-0] setting initial configuration to VotingConfiguration{NOLXSfG_R8aIjuFl1cnYzg} [2019-05-03T17:36:04,819][WARN ][o.e.d.FileBasedSeedHostsProvider] [node-0] expected, but did not find, a dynamic hosts list at [E:\work\es\elasticsearch\x-pack\plugin\sql\qa\single-node\build\cluster\integTestCluster node0\elasticsearch-8.0.0-SNAPSHOT\config\unicast_hosts.txt]

[2019-05-03T17:36:04,928][INFO ][o.e.n.Node ] [node-0] started [2019-05-03T17:36:04,954][INFO ][o.e.c.s.MasterService ] [node-0] elected-as-master ([1] nodes joined)[{node-0}{NOLXSfG_R8aIjuFl1cnYzg}{gPWCEd6vSFGDnGoVSc3avA}{127.0.0.1}{127.0.0.1:52182}{xpack.installed=true, testattr=test} elect leader, _BECOME_MASTERTASK, _FINISHELECTION], term: 1, version: 1, reason: master node changed {previous [], current [{node-0}{NOLXSfG_R8aIjuFl1cnYzg}{gPWCEd6vSFGDnGoVSc3avA}{127.0.0.1}{127.0.0.1:52182}{xpack.installed=true, testattr=test}]} [2019-05-03T17:36:04,990][INFO ][o.e.c.s.ClusterApplierService] [node-0] master node changed {previous [], current [{node-0}{NOLXSfG_R8aIjuFl1cnYzg}{gPWCEd6vSFGDnGoVSc3avA}{127.0.0.1}{127.0.0.1:52182}{xpack.installed=true, testattr=test}]}, term: 1, version: 1, reason: Publication{term=1, version=1} [2019-05-03T17:36:05,017][INFO ][o.e.g.GatewayService ] [node-0] recovered [0] indices into cluster_state [2019-05-03T17:36:05,084][INFO ][o.e.l.LicenseService ] [node-0] license [2d11c7a6-e52e-4173-9ff8-056ad5f00bf7] mode [trial] - valid [2019-05-03T17:36:07,440][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-0] [test_emp] creating index, cause [api], templates [], shards [1]/[1], mappings [_doc] [2019-05-03T17:36:07,967][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-0] [test_emp_copy] creating index, cause [api], templates [], shards [1]/[1], mappings [_doc] [2019-05-03T17:36:08,182][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-0] [logs] creating index, cause [api], templates [], shards [1]/[1], mappings [_doc] [2019-05-03T17:36:08,421][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-0] [frozen_emp] creating index, cause [api], templates [], shards [1]/[1], mappings [_doc] [2019-05-03T17:36:08,596][INFO ][o.e.c.m.MetaDataIndexStateService] [node-0] closing indices [frozen_emp/O6c5wuceTau7cmaP41cU7Q] [2019-05-03T17:36:08,623][INFO ][o.e.c.m.MetaDataIndexStateService] [node-0] completed closing of indices [frozen_emp] [2019-05-03T17:36:08,685][INFO ][o.e.c.r.a.AllocationService] [node-0] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[frozen_emp][0]] ...]). [2019-05-03T17:36:08,708][INFO ][o.e.c.s.IndexScopedSettings] [node-0] updating [index.search.throttled] from [false] to [true] [2019-05-03T17:36:08,713][INFO ][o.e.c.m.MetaDataIndexStateService] [node-0] opening indices [[frozen_emp/O6c5wuceTau7cmaP41cU7Q]] [2019-05-03T17:36:08,770][INFO ][o.e.c.r.a.AllocationService] [node-0] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[frozen_emp][0]] ...]). [2019-05-03T17:36:09,114][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-0] fatal error in thread [elasticsearch[node-0][search_throttled][T#1]], exiting java.lang.AssertionError: null at org.elasticsearch.common.lucene.index.ElasticsearchDirectoryReader.addReaderCloseListener(ElasticsearchDirectoryReader.java:106) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:134) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1355) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1289) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:347) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:393) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:358) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:354) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.search.SearchService$4.doRun(SearchService.java:1069) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:43) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:757) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:835) [?:?]



I'm currently on holidays so my availability is spotty but I'll be back next week.
elasticmachine commented 5 years ago

Pinging @elastic/es-distributed

costin commented 5 years ago

The issue is reproduced on the CI as well : https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request-2/13036/console As the frozen index agg query is ran, the assertion error gets thrown, the JVM stops and the rest of the tests fail with connection timeout.