neo4j-contrib / neo4j-elasticsearch

Neo4j ElasticSearch Integration
Apache License 2.0

Neo4j and AWS ElasticSearch Service integration failed occasionally #24

Open Kunal-Dethe opened 7 years ago

Kunal-Dethe commented 7 years ago

Hello All,

I have been using this module to insert data into ElasticSearch from Neo4j. It works fine on the local, development, and staging servers, where the ElasticSearch service runs on the same server as Neo4j.

But when the Amazon AWS ElasticSearch service is used and data is added to the Neo4j database, the data is sometimes not inserted into ElasticSearch.

No error or exception is thrown during the transaction between Neo4j and ElasticSearch. I checked the log files at /var/log/neo4j/console.log and /var/log/neo4j/http.log.

Since the data is inserted some of the time, and always when ElasticSearch is on the same server, the settings do not seem to be the issue.

So it is getting difficult to debug why this is happening.

Any ideas are appreciated.

jexp commented 7 years ago

Hi Kunal,

we can add some more logging if that helps you. Can you check your network setup so that the neo4j box sees the ES box and vice versa?

Which version do you use? The one for Neo4j 3.0? The insertion happens asynchronously as fire & forget, but we can check the responses and log them if that helps.
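
To illustrate the difference between fire & forget and checking responses, here is a minimal Python sketch. It is not the plugin's actual code: `send_to_es`, `index_async`, and the `failures` list are hypothetical names, and the HTTP call is replaced by a stand-in function so the example runs on its own. The idea is only that a completion callback can log a failed request instead of dropping it silently.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("es-sync")

# Documents whose indexing request failed (hypothetical bookkeeping).
failures = []


def send_to_es(doc):
    """Hypothetical stand-in for the HTTP index request.

    Raises for documents without an id to simulate a rejected request.
    """
    if "id" not in doc:
        raise ValueError("document has no id")
    return {"status": 201, "id": doc["id"]}


def log_outcome(doc):
    """Build a callback that records and logs the result of one request."""
    def callback(future):
        err = future.exception()
        if err is not None:
            failures.append(doc)
            log.error("indexing failed for %r: %s", doc, err)
        else:
            log.debug("indexed %r -> %s", doc, future.result())
    return callback


def index_async(executor, doc):
    """Submit the request without blocking, but keep track of its outcome."""
    future = executor.submit(send_to_es, doc)
    future.add_done_callback(log_outcome(doc))
    return future


# Leaving the with-block waits for both requests and their callbacks.
with ThreadPoolExecutor(max_workers=2) as pool:
    index_async(pool, {"id": 1, "name": "Alice"})
    index_async(pool, {"name": "no id"})
```

The caller still does not wait for ElasticSearch, so the transaction path stays asynchronous; only the failure is made visible in the log.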

Kunal-Dethe commented 7 years ago

Hello @jexp

Thanks for the reply.

As for the network setup, Neo4j is installed on an EC2 instance, and the ElasticSearch service in question is the AWS ElasticSearch Service. Since it does work some of the time, I do not see how the network could be the issue.

Neo4j version: 2.3.6
ElasticSearch version: 2.3.2

Again, to point out: this only happens when the AWS ElasticSearch Service is connected, not with ElasticSearch running on the EC2 instance itself.

It would be a great help to know whether there is any way to log the transactions between the Neo4j and ElasticSearch services.

Below is the content of the log file /var/log/neo4j/console.log:

2016-09-02 12:27:47.494+0000 INFO  Remote interface ready and available at http://0.0.0.0:7474/
12:28:42.520 [NodeChecker RUNNING] ERROR i.s.c.config.discovery.NodeChecker - Error executing NodesInfo!
io.searchbox.client.config.exception.NoServerConfiguredException: No Server is assigned to client to connect
        at io.searchbox.client.AbstractJestClient$ServerPool.getNextServer(AbstractJestClient.java:132) ~[jest-common-2.0.2.jar:na]
        at io.searchbox.client.AbstractJestClient.getNextServer(AbstractJestClient.java:81) ~[jest-common-2.0.2.jar:na]
        at io.searchbox.client.http.JestHttpClient.prepareRequest(JestHttpClient.java:80) ~[jest-2.0.2.jar:na]
        at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:46) ~[jest-2.0.2.jar:na]
        at io.searchbox.client.config.discovery.NodeChecker.runOneIteration(NodeChecker.java:65) ~[jest-common-2.0.2.jar:na]
        at com.google.common.util.concurrent.AbstractScheduledService$ServiceDelegate$Task.run(AbstractScheduledService.java:189) [guava-19.0.jar:na]
        at com.google.common.util.concurrent.Callables$3.run(Callables.java:100) [guava-19.0.jar:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
12:28:42.540 [NodeChecker RUNNING] INFO  i.s.client.AbstractJestClient - Setting server pool to a list of 1 servers: [ELASTICSEARCH_URL]
12:29:42.541 [NodeChecker RUNNING] DEBUG i.s.client.http.JestHttpClient - GET method created based on client request
12:29:42.541 [NodeChecker RUNNING] DEBUG i.s.client.http.JestHttpClient - Request method=GET url=ELASTICSEARCH_URL/_nodes/_all/http
12:29:42.553 [NodeChecker RUNNING] DEBUG io.searchbox.action.AbstractAction - Request and operation succeeded
12:29:42.553 [NodeChecker RUNNING] DEBUG i.s.c.config.discovery.NodeChecker - Discovered 0 HTTP hosts:
12:29:42.553 [NodeChecker RUNNING] INFO  i.s.client.AbstractJestClient - Setting server pool to a list of 0 servers: []
12:29:42.553 [NodeChecker RUNNING] WARN  i.s.client.AbstractJestClient - No servers are currently available to connect.

The response from the API: ELASTICSEARCH_URL/_nodes/_all/http

EC2 instance:

{"cluster_name":"elasticsearch","nodes":{"X9zagEOlSK-h3l9dSG08PA":{"name":"Her","transport_address":"172.31.50.210:9300","host":"172.31.50.210","ip":"172.31.50.210","version":"2.3.0","build":"8371be8","http_address":"172.31.50.210:9200","http":{"bound_address":["[::]:9200"],"publish_address":"172.31.50.210:9200","max_content_length_in_bytes":104857600}}}}

AWS ElasticSearch instance:

{"cluster_name":"102372860153:ES_DONAIN_NAME","nodes":{"kXO7l2ZyRgaDq44Ohx4qCA":{"name":"Cassie Lang","version":"2.3.2","build":"0944b4b"}}}
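
The two responses differ in one relevant way: the EC2 node publishes an `http_address`, while the AWS node does not. The log above ("Discovered 0 HTTP hosts", then "Setting server pool to a list of 0 servers") is consistent with node discovery depending on that field. The Python sketch below is a simplified illustration of that reading, not Jest's actual implementation; `discover_http_hosts` is a hypothetical helper, and the EC2 response is trimmed to the fields that matter here.

```python
import json

# The EC2 response from above, trimmed to the relevant fields.
EC2_RESPONSE = (
    '{"cluster_name":"elasticsearch","nodes":{"X9zagEOlSK-h3l9dSG08PA":'
    '{"name":"Her","version":"2.3.0","http_address":"172.31.50.210:9200"}}}'
)

# The AWS ElasticSearch response from above, unchanged.
AWS_RESPONSE = (
    '{"cluster_name":"102372860153:ES_DONAIN_NAME","nodes":'
    '{"kXO7l2ZyRgaDq44Ohx4qCA":{"name":"Cassie Lang","version":"2.3.2",'
    '"build":"0944b4b"}}}'
)


def discover_http_hosts(nodes_info_json):
    """Collect http_address values, skipping nodes that do not publish one."""
    nodes = json.loads(nodes_info_json).get("nodes", {})
    return [node["http_address"] for node in nodes.values()
            if "http_address" in node]


print(discover_http_hosts(EC2_RESPONSE))  # ['172.31.50.210:9200']
print(discover_http_hosts(AWS_RESPONSE))  # []
```

If discovery works this way, each discovery run against the AWS endpoint would replace the configured server with an empty list, which would match writes succeeding only some of the time.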

nishadk123 commented 7 years ago

Hello @jexp,

We would appreciate it if you could provide some further insights on the issue.

We got in touch with AWS about this, and here is what they had to say:

jexp commented 7 years ago

https://github.com/searchbox-io/Jest/issues/382

We probably have to add support for the AWS signing interceptor mentioned there. Do you have capacity to test that out?

I have no AWS ES to test it.
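
For context on what such a signing interceptor has to do: AWS Signature Version 4 derives a signing key from the secret key through a chain of HMAC-SHA256 steps and uses it to sign a "string to sign" built from the request. The Python sketch below shows only that key derivation and the final signature. It is not the interceptor from the linked issue, the credentials, date, and region are made-up placeholders, and the string to sign is a dummy value rather than a real canonical request.

```python
import hashlib
import hmac


def _hmac_sha256(key, msg):
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date_stamp, region, service="es"):
    """Derive the SigV4 signing key: date -> region -> service -> aws4_request."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key, string_to_sign):
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Placeholder inputs for illustration only (not real credentials).
key = derive_signing_key("EXAMPLE-SECRET-KEY", "20160902", "us-east-1")
signature = sign(key, "PLACEHOLDER-STRING-TO-SIGN")
```

A real interceptor would also build the canonical request and the string to sign from the method, path, headers, and payload hash of each outgoing request.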

vvavepacket commented 7 years ago

@jexp Can I give you access to an AWS ES instance so you can test it?