neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Fix getFJPoolWithConcurrency issue #138

Closed chozo99 closed 2 years ago

chozo99 commented 2 years ago

I found a problem where the number of threads keeps increasing every time gds.pageRank.stream is called.

ps -L -p 106821
...
106821 109402 ?        00:00:00 neo4j.BoltNetwo
106821 109404 ?        00:00:00 neo4j.BoltNetwo
106821 109413 ?        00:00:00 gds-forkjoin-1
106821 109414 ?        00:00:00 gds-forkjoin-2
106821 109415 ?        00:00:00 gds-forkjoin-3
106821 109416 ?        00:00:00 gds-forkjoin-0
106821 109513 ?        00:00:00 gds-forkjoin-1
106821 109514 ?        00:00:00 gds-forkjoin-2
106821 109515 ?        00:00:00 gds-forkjoin-3
106821 109516 ?        00:00:00 gds-forkjoin-0
106821 109581 ?        00:00:00 neo4j.BoltNetwo
106821 109582 ?        00:00:00 neo4j.BoltNetwo
106821 109591 ?        00:00:00 gds-forkjoin-1
106821 109592 ?        00:00:00 gds-forkjoin-2
106821 109593 ?        00:00:00 gds-forkjoin-3
106821 109594 ?        00:00:00 gds-forkjoin-0
106821 109688 ?        00:00:00 gds-forkjoin-1
106821 109689 ?        00:00:00 gds-forkjoin-2
106821 109690 ?        00:00:00 gds-forkjoin-3
106821 109691 ?        00:00:00 gds-forkjoin-0
...

I confirmed that more than 100,000 gds-forkjoin-x threads are created.

Two call sites of getFJPoolWithConcurrency do not call shutdown on the returned pool:
https://github.com/neo4j/graph-data-science/blob/5dc18a00857ae3f917560e448b562d86083ce5d1/pregel/src/main/java/org/neo4j/gds/beta/pregel/Pregel.java#L152
https://github.com/neo4j/graph-data-science/blob/5dc18a00857ae3f917560e448b562d86083ce5d1/core/src/main/java/org/neo4j/gds/core/utils/paged/HugeMergeSort.java#L34

But one call site of getFJPoolWithConcurrency does call shutdown:
https://github.com/neo4j/graph-data-science/blob/5dc18a00857ae3f917560e448b562d86083ce5d1/core/src/main/java/org/neo4j/gds/core/concurrency/ParallelUtil.java#L79
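For readers unfamiliar with the pattern, here is a minimal sketch of what "calling shutdown" looks like for a pool that is created per call. It is not the GDS code: the class and method names (PoolShutdownSketch, sumInOwnPool) are hypothetical, and only the standard java.util.concurrent API is used. Without the finally block, each call would leave idle worker threads behind until the JVM reclaims them, which is consistent with the growing gds-forkjoin-x list above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

// Hypothetical illustration, not taken from the GDS code base.
public class PoolShutdownSketch {

    // Sums 1..n inside a dedicated pool and always shuts the pool down afterwards.
    public static long sumInOwnPool(int concurrency, long n) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(concurrency);
        try {
            // A parallel stream started from inside a ForkJoinPool task runs in that pool.
            Callable<Long> task = () -> LongStream.rangeClosed(1, n).parallel().sum();
            return pool.submit(task).get();
        } finally {
            // Releases the worker threads once already-submitted tasks have finished.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumInOwnPool(4, 1000));
    }
}
```

The try/finally shape matters more than the workload: the pool is released even if the task throws.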

I separated these two cases into two methods: getFJPoolWithConcurrency and newFJPoolWithConcurrency.
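The thread does not show the bodies of the two methods, so the following is only one plausible reading of such a split, written as a sketch with hypothetical names (PoolsSketch, getSharedPool, newPool): a "get" accessor hands out a long-lived pool that callers must not shut down, while a "new" factory creates a pool that the caller owns and must shut down. The fixed parallelism of 4 for the shared pool is an assumption chosen purely for illustration.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical illustration of a get/new split; not the actual GDS implementation.
public final class PoolsSketch {

    // One long-lived pool reused by every caller. Assumed size of 4 for the example.
    private static final ForkJoinPool SHARED = new ForkJoinPool(4);

    private PoolsSketch() {}

    // Returns the shared pool. Callers must NOT call shutdown on it.
    public static ForkJoinPool getSharedPool() {
        return SHARED;
    }

    // Creates a fresh pool with the requested parallelism.
    // The caller owns it and is responsible for calling shutdown.
    public static ForkJoinPool newPool(int concurrency) {
        return new ForkJoinPool(concurrency);
    }

    public static void main(String[] args) {
        ForkJoinPool own = newPool(2);
        try {
            own.submit(() -> System.out.println("running in a caller-owned pool")).join();
        } finally {
            own.shutdown();
        }
    }
}
```

A shared pool has a fixed parallelism, so a design like this trades per-call concurrency settings for thread reuse; a caller that needs a specific concurrency would use the factory and shut the pool down itself.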

Note: this issue has occurred since version 1.6.0.

s1ck commented 2 years ago

Hi @chozo99

Thanks for catching this. I'll review your changes. Before we can merge them, you need to sign the CLA. You can find the instructions here: https://neo4j.com/developer/cla/ Just ping me on this PR once you have signed and submitted it. We'll then close the PR and merge it internally; you'll still be the commit author.

s1ck commented 2 years ago

I had to make some more changes in order to respect the configured concurrency, but your commits will still show up. It should be synced within the next 24 hours.