philippkueng / node-neo4j

Neo4j REST API wrapper for Node.js
MIT License

insertNode() ERR with very large batch operations #33

Closed: lazaruslarue closed this issue 10 years ago

lazaruslarue commented 10 years ago

Trying to batch-insert a lot of data with versions of this (a), or a timeout-wrapped version (b):

var A_batchInsert = function(arrObject) {
  for (var i = 0; i < arrObject.length; i++) {
    db.insertNode(arrObject[i],['LabelName'], cb);
  }
};
var B_batchInsert = function(arrObject) {
  for (var i = 0; i < arrObject.length; i++) {
    (function(entry) {
      // defer the call itself; passing db.insertNode(...) directly would run it immediately
      setTimeout(function() { db.insertNode(entry, ['LabelName'], cb); }, 10);
    })(arrObject[i]);
  }
};

My file has something like 7000 lines, and the insert throws the error below. It works fine when I include only 50 or so lines of the file.

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: connect EMFILE
    at errnoException (net.js:901:11)
    at connect (net.js:764:19)
    at net.js:842:9
    at asyncCallback (dns.js:68:16)
    at Object.onanswer [as oncomplete] (dns.js:121:9)

File format is like this:

var arrObject = [
  {"val":"salt","description":"salt_desc","term":"salt_term"},
  {"val":"sugar","description":"sugar_desc","term":"sugar_term"},
  {"val":"butter","description":"butter_desc","term":"butter_term"},
  {"val":"onion","description":"onion_desc","term":"onion_term"}
];
philippkueng commented 10 years ago

Hi @lazaruslarue, based on your issue and without having run it myself yet, I think you're DoS-ing your Neo4j instance. I suggest you use a batch query for that type of thing.

The library itself has no queuing built in, so all calls you make go straight to Neo4j, which probably just can't keep up with it.
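
To illustrate what that queuing could look like (just a sketch, reusing db, cb and the label from your snippet; queuedInsert is only an illustrative name): chain the calls so only one request is in flight at a time.

var queuedInsert = function(arrObject) {
  var i = 0;
  (function next() {
    if (i >= arrObject.length) return;
    // start the next insert only after the previous request has finished,
    // so a single connection is open at any time
    db.insertNode(arrObject[i++], ['LabelName'], function(err, node) {
      if (err) return cb(err);
      next();
    });
  })();
};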

If it's not related to the number of connections, please let me know. I'll look into it after my exams, which end on 27 January.

Cheers, Phil

lazaruslarue commented 10 years ago

thanks for the quick response, @philippkueng. i think the DoS explanation is probably correct. for the time being, i've worked around the issue by creating a cypher query for each entry. this seems to do the trick. if i learn anything else about the issue i'll let you know. good luck with exams :)
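
for reference, the per-entry query has roughly this shape (cypherQuery taking a params map is an assumption about the library version i'm on; the calls still need pacing or chunking so they don't all open connections at once):

// one parameterised CREATE per entry; the label and property names match the file format above
db.cypherQuery('CREATE (n:LabelName {props}) RETURN n',
  { props: { val: 'salt', description: 'salt_desc', term: 'salt_term' } },
  cb);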

philippkueng commented 10 years ago

Hi @lazaruslarue,

I still couldn't reproduce your scenario. However, going with the DoS theory, I think Neo4j prefers you to use a batch query for that kind of task: http://docs.neo4j.org/chunked/stable/rest-api-batch-ops.html
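
To give a rough idea of the batch endpoint, one request can carry all of the node creations. This is only a sketch pieced together from the linked docs; the request module, the localhost URL and the exact payload are assumptions, not something I've run against your data.

var request = require('request');

// one job per entry; the batch endpoint executes them all in a single HTTP request
var jobs = arrObject.map(function(entry, i) {
  return { method: 'POST', to: '/node', body: entry, id: i };
});
// per the linked docs, labels would need a follow-up job per node (POST to '{i}/labels')

request.post({
  url: 'http://localhost:7474/db/data/batch',
  json: jobs
}, function(err, res, body) {
  if (err) return console.error(err);
  console.log('created', body.length, 'nodes in one request');
});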

Let me know if this still bugs you. I'll close the issue in the meantime (spring-cleaning).

Best, Phil