graphaware / neo4j-to-elasticsearch

GraphAware Framework Module for Integrating Neo4j with Elasticsearch

Neo4j to Elasticsearch bulk operation failing via the GraphAware plugin #136

Closed gjerin594 closed 5 years ago

gjerin594 commented 6 years ago

In our scenario, we have data similar to syslog / monitoring metrics that are continuously collected and posted to Neo4j. This data is then pushed to ES using the graphaware plugin. Over the last few days, we have started noticing an error with the bulk operations from Neo4j while pushing to ES. Could you please help in resolving the issue?

In the JVM options file, we set the Java heap size to 40GB. However, the issue persists even after increasing the heap size.

I have attached the Neo4j configuration file, the JVM options file, and a snippet of the error we are receiving: jvm_options.txt, neo4j_conf.txt, neo4j_error.txt

We are using:

  1. Neo4j 3.3.0
  2. Elasticsearch 5.6.4
  3. GraphAware plugins:
     a. graphaware-neo4j-to-elasticsearch-3.3.0.51
     b. graphaware-server-community-all-3.3.0.51
     c. graphaware-uuid-3.3.0.51

gjerin594 commented 6 years ago

Hi @ikwattro, could you please help with this issue? We are losing precious data as well as time, as the issue still persists. Thanks in advance!

gjerin594 commented 6 years ago

I went through https://github.com/graphaware/neo4j-to-elasticsearch/issues/42 and tried the steps mentioned there. We increased the ES heap size and decreased the Neo4j queue size from 10000 to 1000. The error still persists.

gjerin594 commented 6 years ago

ES JVM heap size: 30GB
Neo4j heap size: 20GB
Neo4j page cache size: 30GB

ikwattro commented 6 years ago

Based on your errors, the plugin cannot connect to your ES instance. Can you make a curl request to the ES cluster?
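
For anyone who prefers to script the connectivity check rather than run curl by hand, here is a minimal sketch in Python. It assumes Elasticsearch listens on `http://localhost:9200` (the default HTTP port; the thread does not state the actual address), and the function name `cluster_health` and its `opener` parameter are made up for illustration. It queries the standard `_cluster/health` endpoint and returns `None` if the instance cannot be reached.

```python
import json
import urllib.request
from urllib.error import URLError


def cluster_health(base_url="http://localhost:9200",
                   opener=urllib.request.urlopen, timeout=5):
    """Return the parsed _cluster/health response, or None if ES is unreachable.

    base_url is an assumption (default ES HTTP port); adjust it to your setup.
    opener is injectable so the function can be exercised without a live cluster.
    """
    url = base_url.rstrip("/") + "/_cluster/health"
    try:
        with opener(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (URLError, OSError) as exc:
        # Connection refused, DNS failure, timeout, and similar errors land here.
        print("Elasticsearch not reachable at %s: %s" % (url, exc))
        return None
```

Calling `cluster_health()` on the Neo4j host and printing the `status` field of the result shows whether the cluster reports green, yellow, or red. A `None` result from that machine would be consistent with the connection errors in the log.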

gjerin594 commented 6 years ago

@ikwattro Yes, I am able to curl the ES instance. It is hosted on the same machine. We have already pushed more than 90,000 records from Neo4j to ES successfully. Over the last two weeks, however, we have had issues with the push from Neo4j to ES. The same error occurred when we deleted data from Neo4j: while the changes were being mapped, the error was thrown, and many records remained in ES even though they had been deleted in Neo4j.

ikwattro commented 6 years ago

Try disabling bulk; you will get more logging about the issue.

gjerin594 commented 6 years ago

@ikwattro Will do that and get back with the results. Many thanks!

gjerin594 commented 6 years ago

Hi @ikwattro, we are still testing and still seeing problems. Bulk mapping is now disabled. Is there any way to do a refresh operation? For example, data is present in Neo4j but absent from Elasticsearch because of the bulk push failure. Is there a way to refresh the data so that the missing data is mapped via the plugin?

ikwattro commented 6 years ago

@gjerin594 Export is triggered when nodes are updated, so update a property (for example, a fake timestamp) on a node that should be replicated and that you want to inspect.
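
The "touch a property" suggestion above can be sketched as a batched Cypher statement. This is an illustration only, not something from the plugin's documentation. The helper name `build_touch_query`, the label `Metric` used in the usage note, the marker property `lastResyncedAt`, and the parameters `$runStamp` and `$limit` are all hypothetical. The Python function just builds the statement text; it can then be run from the Neo4j Browser or any driver.

```python
def build_touch_query(label, property_name="lastResyncedAt"):
    """Build a Cypher statement that rewrites a marker property on a batch of nodes.

    label and property_name are hypothetical; use a label that your module
    configuration actually replicates. Each run touches at most $limit nodes
    whose marker is missing or older than $runStamp, so repeating the statement
    with the same $runStamp walks through all nodes without touching one twice.
    """
    for name in (label, property_name):
        # Labels and property keys cannot be passed as Cypher parameters,
        # so restrict them to plain identifiers before interpolating.
        if not name.isidentifier():
            raise ValueError("unsafe identifier: %r" % name)
    return (
        "MATCH (n:`{label}`) "
        "WHERE NOT exists(n.`{prop}`) OR n.`{prop}` < $runStamp "
        "WITH n LIMIT $limit "
        "SET n.`{prop}` = $runStamp "
        "RETURN count(n) AS touched"
    ).format(label=label, prop=property_name)
```

A possible way to use it: generate the statement once with `build_touch_query("Metric")`, then run it repeatedly with a fixed `$runStamp` (for instance the current epoch milliseconds) and a modest `$limit` until `touched` comes back as 0. Keeping the batches small should avoid putting the same pressure on the queue that the original bulk push did.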

gjerin594 commented 6 years ago

@ikwattro Thanks for the prompt response! Will check it out.

ikwattro commented 5 years ago

@gjerin594 any update?