Open arunkoshi opened 8 years ago
You have a single node, and ES has reached the point where it is busy for more than 5 seconds at a time after having indexed millions of docs. Check your cluster stats and decide whether to increase the cluster capacity by giving more resources to indexing (segment merging) or by adding nodes (which is by far the easier option).
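As a rough illustration of "check your cluster stats", here is a minimal Python sketch that reads the node stats API and pulls out the bulk thread pool queue, the rejected count, and heap usage per node. The HTTP address `http://localhost:9200` is an assumption (the thread only shows a transport port), and the helper names `fetch_node_stats` and `bulk_pressure` are made up for this example. A growing `bulk_rejected` count or a heap that stays close to full would suggest the node is overloaded by indexing.

```python
import json
from urllib.request import urlopen


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch the node stats document over HTTP.

    base_url is an assumption; point it at your node's HTTP port.
    """
    with urlopen(base_url + "/_nodes/stats", timeout=10) as resp:
        return json.load(resp)


def bulk_pressure(stats):
    """Summarize bulk thread pool and heap usage for each node.

    `stats` is the parsed JSON body of GET /_nodes/stats.
    Missing fields fall back to 0 (counters) or None (heap).
    """
    report = {}
    for node_id, node in stats.get("nodes", {}).items():
        bulk = node.get("thread_pool", {}).get("bulk", {})
        heap = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        report[node.get("name", node_id)] = {
            "bulk_queue": bulk.get("queue", 0),
            "bulk_rejected": bulk.get("rejected", 0),
            "heap_used_percent": heap,
        }
    return report
```

Against a running node this would be called as `bulk_pressure(fetch_node_stats())`, ideally several times during the import so you can see whether the numbers climb shortly before the timeout appears.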
I have the same problem! I used to use the river and it worked perfectly. Now that I use the importer, I get this error: [19:06:30,261][INFO ][org.elasticsearch.client.transport][elasticsearch[importer][generic][T#2]] [importer] failed to get node info for [#transport#-1][DTBNG-ElasticSearch][inet[localhost/127.0.0.1:10300]], disconnecting... org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:10300]][cluster:monitor/nodes/info] request_id [4597] timed out after [5016ms] at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529) ~[elasticsearch-jdbc-1.7.2.1-uberjar.jar:?]
@jprante I am using elasticsearch-jdbc version 2.2.0.0 and elasticsearch version 2.2.0. When I index a large dataset from a remote machine onto the elasticsearch node (single node), it works fine. But, if I run the jdbc importer on the same machine where elasticsearch is running, I get the error mentioned above. How do we get past this error?
@anandp504 this issue was about timeouts caused by an Elasticsearch 1.x cluster, not by the JDBC importer. If you encounter issues with 2.x, you should open a new issue. The only solution is to examine the Elasticsearch cluster logs and find the cause of the trouble. I cannot see an issue with the JDBC importer here. It feels like a misconfiguration. Possibly the bulk parameters are set too high.
Hi,
I was working on upgrading our current setup, which uses the jdbc river with Elasticsearch 0.90.9, to the most recent versions of the products (Elasticsearch 1.7.2 and jdbc importer 1.7.2.1). I am running this on
The box doesn't run any other service which could hog resources.
I tried out the importer with the following settings (listing only the relevant settings here)
I also tried max_bulk_actions of 2000 and 5000, and a flush_interval of 10s.
This works fine for a very small dataset (I have about 10 queries in the sql section, each loading data into a different type in the index). However, when I run this for a roughly 2 GB dataset, it runs for about 20 minutes, loading almost 90% of the data, and then I get the following exception.
Note that I am able to load without any issues in my previous ES 0.90.9 + jdbc river setup on the same box, with the exact same dataset.
So I am wondering if there is any other setting I should be tweaking. I guess I could bump up the elasticsearch.timeout (i.e. client.transport.ping_timeout) to something like 30s. But I don't know if I am missing something else here.
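To make the idea concrete, here is a small Python sketch that builds an importer definition with a longer timeout and more conservative bulk settings and prints it as JSON. Only the names `max_bulk_actions`, `flush_interval` and `elasticsearch.timeout` come from this post. The nesting of `timeout` under an `elasticsearch` object, the JDBC URL, the credentials, the SQL statement and the concrete values are all placeholders and assumptions, so check them against the importer documentation for your version.

```python
import json


def build_importer_definition(timeout="30s", max_bulk_actions=1000,
                              flush_interval="5s"):
    """Return a JDBC importer definition as a plain dict.

    All connection details below are hypothetical placeholders.
    """
    return {
        "type": "jdbc",
        "jdbc": {
            "url": "jdbc:mysql://localhost:3306/exampledb",  # placeholder
            "user": "example_user",                          # placeholder
            "password": "example_password",                  # placeholder
            "sql": "select * from example_table",            # placeholder
            # Smaller bulk requests put less pressure on a single node.
            "max_bulk_actions": max_bulk_actions,
            "flush_interval": flush_interval,
            # Assumed nesting for the elasticsearch.timeout setting,
            # which the post equates with client.transport.ping_timeout.
            "elasticsearch": {"timeout": timeout},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_importer_definition(), indent=2))
```

Raising the timeout only hides the symptom if the node is genuinely overloaded, so it is probably worth lowering `max_bulk_actions` first and seeing whether the timeout still occurs.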