neo4j-contrib / neo4j-graph-algorithms

Efficient Graph Algorithms for Neo4j
https://github.com/neo4j/graph-data-science/
GNU General Public License v3.0
770 stars, 194 forks

[WIP] Huge Graph: Parallel Scan of Node and Relationship Store #771

Closed. knutwalker closed this 5 years ago

knutwalker commented 5 years ago

Note: Includes #738 and #766

This PR applies the loading scheme from #738 to nodes as well. It is still WIP because I would like to see some testing on larger graphs first. The unmapping (reverse mapping), where we map from our graph ID back to the Neo4j node ID, is still single-threaded, but that can be improved later.
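For readers unfamiliar with the idea, a batched parallel scan that also fills the reverse mapping could be sketched roughly as follows. This is a generic illustration and not the code in this PR: the `boolean[] inUse` array is a hypothetical stand-in for the node store, and `ParallelScanSketch`, `scan`, and `graphToNeo` are names invented for the example. It uses a count pass, a prefix sum, and a fill pass so that each batch writes to its own disjoint slice of the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only; not the implementation from this PR.
final class ParallelScanSketch {

    // inUse[id] == true means a (hypothetical) node record with that id exists.
    // Returns graphToNeo, where graphToNeo[graphId] is the original record id.
    static long[] scan(boolean[] inUse, int batchSize, int threads) {
        int batches = (inUse.length + batchSize - 1) / batchSize;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Pass 1: count the in-use records of every batch in parallel.
            List<Future<Integer>> counts = new ArrayList<>();
            for (int b = 0; b < batches; b++) {
                final int start = b * batchSize;
                final int end = Math.min(start + batchSize, inUse.length);
                counts.add(pool.submit(() -> {
                    int c = 0;
                    for (int i = start; i < end; i++) {
                        if (inUse[i]) c++;
                    }
                    return c;
                }));
            }

            // Prefix sum: offsets[b] is the first graph id assigned to batch b.
            final int[] offsets = new int[batches + 1];
            for (int b = 0; b < batches; b++) {
                offsets[b + 1] = offsets[b] + counts.get(b).get();
            }

            // Pass 2: each batch fills its own slice, so no locking is needed.
            final long[] graphToNeo = new long[offsets[batches]];
            List<Future<?>> fills = new ArrayList<>();
            for (int b = 0; b < batches; b++) {
                final int batch = b;
                final int start = b * batchSize;
                final int end = Math.min(start + batchSize, inUse.length);
                fills.add(pool.submit(() -> {
                    int next = offsets[batch];
                    for (int i = start; i < end; i++) {
                        if (inUse[i]) graphToNeo[next++] = i;
                    }
                }));
            }
            for (Future<?> f : fills) {
                f.get(); // wait for completion and surface any failure
            }
            return graphToNeo;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("scan failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] store = {true, false, true, true, false, true};
        System.out.println(Arrays.toString(scan(store, 2, 3)));
    }
}
```

The two-pass layout trades a second read of the store for lock-free writes. Whether that trade pays off on a real store of this size is exactly the kind of thing the testing on larger graphs would have to show.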

jexp commented 5 years ago

@knutwalker @mneedham I tested it on mattis with the store that has 3bn nodes and 18bn relationships.

It works with Azul Zing. With G1, a lot of time is spent in GC pauses. I also think there might be some memory that we don't account for in the allocation tracker.

But I'm in favor of merging it in.

jexp commented 5 years ago

@knutwalker do you want to do a separate PR for 3.5, or just cherry-pick this one over? I guess some of the kernel APIs might have changed, or perhaps not.

knutwalker commented 5 years ago

@jexp I need to integrate the changes from #842 as well (node property support). I suggest getting this all done on 3.4 and then making one separate 3.5 version with everything already included.