jprante / elasticsearch-jdbc

JDBC importer for Elasticsearch
Apache License 2.0

NoNodeAvailableException with Elasticsearch 1.7.2 and jdbc importer 1.7.2.1 #669

Open arunkoshi opened 8 years ago

arunkoshi commented 8 years ago

Hi,

I am upgrading our current setup, which uses jdbc river with Elasticsearch 0.90.9, to the most recent versions of both products (Elasticsearch 1.7.2 and jdbc importer 1.7.2.1). I am running this on:

OS : Windows 2012 R2
RAM : 8GB 
Java : Oracle JRE 1.7 update 80
ES_HEAP_SIZE : 3g
ES Cluster : Single node, importer runs on the same box. 
index.refresh_interval: 60s

The box doesn't run any other service which could hog resources.

I tried the importer with the following settings (only the relevant ones are listed here):

"index": "myindex",
"elasticsearch" : { "cluster" : "MY_ES_CLUSTER", "host" : "localhost", "port" : 9300},
"query_timeout" : 1800,
"max_bulk_actions" : 1000,
"max_concurrent_bulk_requests" : 0,
"flush_interval" : "5s",
"max_bulk_volume" : "10m",
"max_request_wait" : "120s",
"max_retries" : 1,
"max_retries_wait" : "5s",
"sql" : [....]

I also tried max_bulk_actions of 2000 and 5000, and a flush_interval of 10s.

This works fine for a very small dataset (I have about 10 queries in the sql section, each loading data into a different type in the index). However, when I run it against a dataset of roughly 2 GB, it runs for about 20 minutes, loads almost 90% of the data, and then fails with the exception shown in the log below.

Note that I can load the exact same dataset without any issues using my previous ES 0.90.9 + jdbc river setup on the same box.

So I am wondering whether there is another setting I should be tweaking. I could raise elasticsearch.timeout (i.e. client.transport.ping_timeout) to something like 30s, but I don't know if I am missing something else here.
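
For illustration, the timeout change mentioned above might look like the fragment below. This is only a sketch. It assumes the dotted name elasticsearch.timeout corresponds to a "timeout" key nested inside the "elasticsearch" block, next to "cluster", "host" and "port". The 30s value is the guess from the paragraph above, not a tested recommendation.

```json
{
  "elasticsearch": {
    "cluster": "MY_ES_CLUSTER",
    "host": "localhost",
    "port": 9300,
    "timeout": "30s"
  }
}
```

If that assumption holds, the block would replace the existing "elasticsearch" entry in the settings listed earlier. A longer timeout only gives a busy node more time to answer the node info request; it does not reduce the load that makes the node slow to respond.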

[19:06:30,042][DEBUG][BulkTransportClient      ][pool-2-thread-1] after bulk [4373] [succeeded=4349341] [failed=0] [62ms] [concurrent requests=1]
[19:06:30,120][DEBUG][BulkTransportClient      ][pool-2-thread-1] before bulk [4374] [actions=1000] [bytes=117096] [concurrent requests=2]
[19:06:30,183][DEBUG][BulkTransportClient      ][pool-2-thread-1] after bulk [4374] [succeeded=4350341] [failed=0] [63ms] [concurrent requests=1]
[19:06:30,246][DEBUG][BulkTransportClient      ][pool-2-thread-1] before bulk [4375] [actions=1000] [bytes=117556] [concurrent requests=2]
[19:06:30,261][INFO ][org.elasticsearch.client.transport][elasticsearch[importer][generic][T#2]] [importer] failed to get node info for [#transport#-1][DTBNG-ElasticSearch][inet[localhost/127.0.0.1:10300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:10300]][cluster:monitor/nodes/info] request_id [4597] timed out after [5016ms]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.7.0_80]
    at java.lang.Thread.run(Unknown Source) [?:1.7.0_80]
[19:06:30,261][DEBUG][org.elasticsearch.transport.netty][elasticsearch[importer][generic][T#2]] [importer] disconnecting from [[#transport#-1][DTBNG-ElasticSearch][inet[localhost/127.0.0.1:10300]]] due to explicit disconnect call
[19:06:30,323][DEBUG][BulkTransportClient      ][pool-2-thread-1] after bulk [4375] [succeeded=4351341] [failed=0] [77ms] [concurrent requests=1]
[19:06:30,402][DEBUG][BulkTransportClient      ][pool-2-thread-1] before bulk [4376] [actions=1000] [bytes=115854] [concurrent requests=2]
[19:06:30,402][ERROR][BulkTransportClient      ][pool-2-thread-1] bulk [4376] error
org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available: []
    at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:305) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:200) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.client.transport.support.InternalTransportClient.execute(InternalTransportClient.java:106) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.client.transport.support.InternalTransportClient.execute(InternalTransportClient.java:97) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:162) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.client.transport.TransportClient.bulk(TransportClient.java:365) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.action.bulk.BulkProcessor.execute(BulkProcessor.java:314) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.action.bulk.BulkProcessor.executeIfNeeded(BulkProcessor.java:299) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.action.bulk.BulkProcessor.internalAdd(BulkProcessor.java:281) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:264) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:260) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:246) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.support.client.transport.BulkTransportClient.bulkIndex(BulkTransportClient.java:274) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.support.client.transport.BulkTransportClient.bulkIndex(BulkTransportClient.java:33) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSink.index(StandardSink.java:237) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.common.util.SinkKeyValueStreamListener.end(SinkKeyValueStreamListener.java:63) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.common.util.SinkKeyValueStreamListener.end(SinkKeyValueStreamListener.java:26) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.common.util.PlainKeyValueStreamListener.values(PlainKeyValueStreamListener.java:158) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.processRow(StandardSource.java:1124) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.nextRow(StandardSource.java:1072) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.merge(StandardSource.java:808) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.execute(StandardSource.java:697) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.fetch(StandardSource.java:605) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardContext.fetch(StandardContext.java:215) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardContext.execute(StandardContext.java:190) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.JDBCImporter.process(JDBCImporter.java:118) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.Importer.newRequest(Importer.java:241) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.Importer.newRequest(Importer.java:57) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.pipeline.AbstractPipeline.call(AbstractPipeline.java:86) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.pipeline.AbstractPipeline.call(AbstractPipeline.java:17) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.7.0_80]
    at java.lang.Thread.run(Unknown Source) [?:1.7.0_80]
[19:06:31,527][ERROR][importer.jdbc.context.standard][pool-2-thread-1] org.elasticsearch.ElasticsearchIllegalStateException: client is closed
java.io.IOException: org.elasticsearch.ElasticsearchIllegalStateException: client is closed
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.fetch(StandardSource.java:639) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardContext.fetch(StandardContext.java:215) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardContext.execute(StandardContext.java:190) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.JDBCImporter.process(JDBCImporter.java:118) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.Importer.newRequest(Importer.java:241) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.Importer.newRequest(Importer.java:57) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.pipeline.AbstractPipeline.call(AbstractPipeline.java:86) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.pipeline.AbstractPipeline.call(AbstractPipeline.java:17) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.7.0_80]
    at java.lang.Thread.run(Unknown Source) [?:1.7.0_80]
Caused by: org.elasticsearch.ElasticsearchIllegalStateException: client is closed
    at org.xbib.elasticsearch.support.client.transport.BulkTransportClient.bulkIndex(BulkTransportClient.java:270) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.support.client.transport.BulkTransportClient.bulkIndex(BulkTransportClient.java:33) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSink.index(StandardSink.java:237) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.common.util.SinkKeyValueStreamListener.end(SinkKeyValueStreamListener.java:63) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.common.util.SinkKeyValueStreamListener.end(SinkKeyValueStreamListener.java:26) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.common.util.PlainKeyValueStreamListener.values(PlainKeyValueStreamListener.java:158) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.processRow(StandardSource.java:1124) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.nextRow(StandardSource.java:1072) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.merge(StandardSource.java:808) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.execute(StandardSource.java:697) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSource.fetch(StandardSource.java:605) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    ... 11 more
[19:06:31,527][DEBUG][importer.jdbc.context.standard][pool-2-thread-1] after fetch
[19:06:31,542][DEBUG][importer.jdbc.sink.standard][pool-2-thread-1] afterFetch: flush ingest
[19:06:31,542][ERROR][importer.jdbc.context.standard][pool-2-thread-1] client is closed
org.elasticsearch.ElasticsearchIllegalStateException: client is closed
    at org.xbib.elasticsearch.support.client.transport.BulkTransportClient.flushIngest(BulkTransportClient.java:360) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.support.client.transport.BulkTransportClient.flushIngest(BulkTransportClient.java:33) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSink.flushIngest(StandardSink.java:313) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardSink.afterFetch(StandardSink.java:129) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardContext.afterFetch(StandardContext.java:233) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.elasticsearch.jdbc.strategy.standard.StandardContext.execute(StandardContext.java:193) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.JDBCImporter.process(JDBCImporter.java:118) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.Importer.newRequest(Importer.java:241) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.tools.Importer.newRequest(Importer.java:57) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.pipeline.AbstractPipeline.call(AbstractPipeline.java:86) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at org.xbib.pipeline.AbstractPipeline.call(AbstractPipeline.java:17) [elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.7.0_80]
    at java.lang.Thread.run(Unknown Source) [?:1.7.0_80]
[19:06:31,542][DEBUG][importer                 ][pool-2-thread-1] close (no op)
[19:06:31,542][DEBUG][importer                 ][main] execution completed
[19:06:31,542][DEBUG][importer                 ][main] cleanup (no op)
[19:06:31,542][WARN ][BulkTransportClient      ][Thread-1] no client

jprante commented 8 years ago

You have a single node, and after indexing millions of docs you push ES to the point where it is busy for more than 5 seconds, so the client's node info request times out. Check your cluster stats and decide whether to increase the cluster's capacity by giving more resources to indexing (segment merging) or by adding nodes (which is far easier).

LxiaoGirl commented 8 years ago

I have the same problem. I used to use _river and it worked perfectly. Now I use the importer and get this error:

[19:06:30,261][INFO ][org.elasticsearch.client.transport][elasticsearch[importer][generic][T#2]] [importer] failed to get node info for [#transport#-1][DTBNG-ElasticSearch][inet[localhost/127.0.0.1:10300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:10300]][cluster:monitor/nodes/info] request_id [4597] timed out after [5016ms]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]

anandp504 commented 7 years ago

@jprante I am using elasticsearch-jdbc version 2.2.0.0 and Elasticsearch version 2.2.0. When I index a large dataset from a remote machine into the Elasticsearch node (single node), it works fine. But if I run the jdbc importer on the same machine where Elasticsearch is running, I get the error mentioned above. How do we get past this error?

jprante commented 7 years ago

@anandp504 this issue was about timeouts caused by the Elasticsearch 1.x cluster, not by the JDBC importer. If you encounter issues with 2.x, you should open a new issue. The only solution is to examine the Elasticsearch cluster logs and find the cause of the trouble. I cannot see an issue with the JDBC importer here. It feels like a misconfiguration. Possibly the bulk parameters are set too high.
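
As a rough sketch of what lowering the bulk parameters could mean, the fragment below reuses the parameter names from the settings in the original post (importer 1.7.2.1). The values are hypothetical starting points chosen only to be smaller than the ones posted there. They are not values recommended by the maintainer, and whether the same names apply unchanged to the 2.x importer is not confirmed in this thread.

```json
{
  "max_bulk_actions": 500,
  "max_bulk_volume": "5m",
  "max_concurrent_bulk_requests": 0,
  "flush_interval": "5s"
}
```

Smaller bulk requests put less work on the single node per request, at the cost of more round trips. The cluster logs remain the place to confirm whether bulk size is actually the cause.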