neo4j-contrib / neo4j-apoc-procedures

Awesome Procedures On Cypher for Neo4j - codenamed "apoc"
https://neo4j.com/labs/apoc
Apache License 2.0

Improve thread pool management to prevent blocking and waiting when using a combination of apoc components that use thread pools #562

Open bradnussbaum opened 7 years ago

jexp commented 7 years ago

Do you have some more details here, especially thread dumps or other logs that show the blocking?

We've seen some of this in the graph-algos library: when a large number of tasks got scheduled, the thread queue filled up and the tasks had to be rescheduled after a short wait.
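To make the pattern described above concrete, here is a minimal Python sketch of a bounded task queue where a submission that finds the queue full is retried after a short wait. It is not the graph-algos or APOC implementation. The names `submit_with_retry`, `worker`, and `run_demo`, and all sizes and wait times, are hypothetical and chosen only for illustration.

```python
import queue
import threading
import time


def submit_with_retry(task_queue, task, wait_seconds=0.01, max_attempts=1000):
    """Enqueue a task; if the bounded queue is full, wait briefly and retry.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            task_queue.put_nowait(task)
            return attempt
        except queue.Full:
            # Queue is full: back off for a short time, then reschedule.
            time.sleep(wait_seconds)
    raise RuntimeError("queue stayed full for too long")


def worker(task_queue, results):
    """Run tasks from the queue until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            break
        results.append(task())


def run_demo(num_tasks=20, queue_size=2, num_workers=2):
    """Push more tasks than the queue can hold and count the retries."""
    task_queue = queue.Queue(maxsize=queue_size)
    results = []
    workers = [
        threading.Thread(target=worker, args=(task_queue, results))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()

    def make_task(n):
        def task():
            time.sleep(0.005)  # simulated work
            return n
        return task

    retries = 0
    for n in range(num_tasks):
        retries += submit_with_retry(task_queue, make_task(n)) - 1

    # One sentinel per worker so every worker thread exits.
    for _ in workers:
        submit_with_retry(task_queue, None)
    for w in workers:
        w.join()
    return sorted(results), retries


if __name__ == "__main__":
    done, retries = run_demo()
    print(len(done), "tasks completed,", retries, "retries after a full queue")
```

The submitting thread spends its time sleeping whenever the queue is full, which is the kind of waiting this issue is about. If several components share one small pool, that wait can grow quickly.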

bradnussbaum commented 7 years ago

@jexp I was just talking to Alex, and it sounds like you have reached a consensus about what is going to be added. Can you provide an update on what will get in?

alex-price commented 7 years ago

@jexp Can you confirm that parallel is indeed bugged and deprecated? Or am I wrong to read this result as incorrect?

with timestamp() as start
call apoc.cypher.parallel('call apoc.util.sleep(1000) return timestamp() as end', { x: range(1, 100) }, 'x') yield value
return collect(value.end - start) as execution_time

[1016, 2023, 3024, 4028, 5030, 6032, 7038, 8044, 9050, 10051, 11053, 12054, 13057, 14062, 15067, 16072, 17073, 18078, 19082, 20085, 21086, 22088, 23091, 24095, 25100, 26104, 27106, 28107, 29108, 30114, 31115, 32119, 33121, 34126, 35129, 36135, 37138, 38144, 39145, 40148, 41154, 42160, 43164, 44170, 45171, 46176, 47179, 48184, 49186, 50191, 51197, 52200, 53204, 54210, 55212, 56215, 57217, 58222, 59223, 60227, 61229, 62233, 63240, 64244, 65250, 66253, 67258, 68261, 69266, 70270, 71275, 72277, 73282, 74286, 75292, 76294, 77298, 78298, 79303, 80309, 81311, 82315, 83320, 84321, 85323, 86329, 87333, 88335, 89338, 90344, 91349, 92351, 93355, 94356, 95361, 96367, 97367, 98369, 99373, 100374]
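One way to read the result above is to look at the gaps between consecutive end times. The Python sketch below uses the first ten values from the list. The helper names and the 50 ms tolerance are hypothetical, chosen only for illustration. If the 100 sleeps had overlapped, the gaps would be far smaller than the 1000 ms sleep. Gaps of about 1000 ms each suggest the calls ran one after another.

```python
# First ten values from the result list above (milliseconds since query start).
execution_times = [1016, 2023, 3024, 4028, 5030, 6032, 7038, 8044, 9050, 10051]


def consecutive_gaps(times):
    """Difference between each end time and the one before it."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def looks_serial(times, sleep_ms=1000, tolerance_ms=50):
    """True if every gap is roughly one full sleep, i.e. no overlap is visible."""
    return all(
        abs(gap - sleep_ms) <= tolerance_ms for gap in consecutive_gaps(times)
    )


gaps = consecutive_gaps(execution_times)
print(gaps)                           # [1007, 1001, 1004, 1002, 1002, 1006, 1006, 1006, 1001]
print(looks_serial(execution_times))  # True
```

The final value in the full list, 100374 ms, is also close to 100 times the 1000 ms sleep, which is consistent with the same reading.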

bradnussbaum commented 7 years ago

Looks like parallel and parallel2 have slightly different behavior when it comes to projecting the partition objects. See below:

Working parallel:

with timestamp() as start, range(1, 100) AS range
unwind range as num
with collect({num:num, start:start}) AS partitions, start
with { partitions:partitions , start:start } as params, start
call apoc.cypher.parallel('with {partitions} as partition call apoc.util.sleep(100) return timestamp() - partition.start AS execution_time', params, 'partitions') yield value
return collect(value.execution_time) as execution_times

Fails with parallel2:

with timestamp() as start, range(1, 100) AS range
unwind range as num
with collect({num:num, start:start}) AS partitions, start
with { partitions:partitions , start:start } as params, start
call apoc.cypher.parallel2('with {partitions} as partition call apoc.util.sleep(100) return timestamp() - partition.start AS execution_time', params, 'partitions') yield value
return collect(value.execution_time) as execution_times

with exception:

Failed to call procedure apoc.cypher.parallel2(fragment :: STRING?, params :: MAP?, parallelizeOn :: STRING?) :: (value :: MAP?): Error executing in parallel WITH {start} AS start,{direction} AS direction UNWIND {partitions} AS partitionswith {partitions} as partition call apoc.util.sleep(100) return timestamp() - partition.start AS execution_time

Working parallel2:

with timestamp() as start, range(1, 100) AS range
unwind range as num
with collect({num:num, start:start}) AS partitions, start
with { partitions:partitions , start:start } as params, start
call apoc.cypher.parallel2('with {partitions}[0] as partition call apoc.util.sleep(100) return timestamp() - partition.start AS execution_time', params, 'partitions') yield value
return collect(value.execution_time) as execution_times

pradeepponduri commented 6 years ago

Is parallel2 deprecated in 3.4.4?

nadja-muller commented 1 year ago

Reopened since the procedures are in extended.