Closed Selevaniuk closed 4 years ago
Hello @Selevaniuk and thanks for reporting this. This is most likely an issue with the alpha algorithm.
Unfortunately, GDS does not integrate with Neo4j's default memrec feature, but we have our own .estimate procedures, which estimate the memory requirements of algorithms and graph catalog operations based on the configured workload. These modes exist for all procedures in the production-ready tier, but are missing for most of the alpha procedures. They help you understand how much heap is necessary to ensure your workload does not run into the above problem.
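As a rough sketch of what such an estimate call can look like, the query below asks GDS how much memory a projected graph of the size reported in this issue (3.3 million nodes, 26 million relationships) would need. It assumes a GDS 1.x installation, where the procedure is named gds.graph.create.estimate (later versions renamed it to gds.graph.project.estimate), and it uses the fictitious-graph form, so nothing is actually loaded. It only covers the graph projection; it says nothing about the alpha shortest path algorithm itself, which may not have an .estimate mode.

```cypher
// Estimate the heap needed to project a graph of the reported size.
// nodeCount and relationshipCount describe a fictitious graph, so the
// '*' projections are not read from the database.
CALL gds.graph.create.estimate('*', '*', {
  nodeCount: 3300000,
  relationshipCount: 26000000
})
YIELD requiredMemory, bytesMin, bytesMax, nodeCount, relationshipCount
RETURN requiredMemory, bytesMin, bytesMax, nodeCount, relationshipCount
```

Comparing requiredMemory against the configured heap gives a first indication of how much room is left for the algorithm after the graph is loaded.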
The general advice I can give is: allocate more heap. If necessary, you can take a few gigabytes away from the page cache (it doesn't help GDS much) as well as from dbms.tx_state.max_off_heap_memory, as GDS does not accumulate much transaction state. However, this could have the adverse effect of reducing standard Neo4j performance.
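To make that concrete, here is one possible rebalancing of the settings from the report. The numbers are purely illustrative, not a tested recommendation: they keep the same 24g total as the original configuration (12g heap + 8g off-heap + 4g page cache) and simply shift memory towards the heap, on the assumption that a 2g page cache is still enough for a 2G database.

```properties
# Illustrative values only: same 24g total as before, with more heap for GDS.
dbms.memory.heap.initial_size=18g
dbms.memory.heap.max_size=18g
# Reduced, since GDS does not accumulate much transaction state.
dbms.tx_state.max_off_heap_memory=4g
# Reduced, since the page cache does not help GDS much.
dbms.memory.pagecache.size=2g
```

Whether this split is acceptable depends on how much regular Cypher and transactional traffic the same instance has to serve.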
I forgot to include a link to the manual where memory settings are discussed: https://neo4j.com/docs/graph-data-science/current/installation/#System-requirements
In general, we unfortunately cannot guarantee that you will never hit an OutOfMemoryError. We have a memory guard feature enabled for all algorithms for which we have implemented .estimate procedures, which is only guaranteed for the production-ready tier of the library.
More on memory guard: https://neo4j.com/docs/graph-data-science/current/common-usage/memory-estimation/#estimate-heap-control
More on GDS tiers: https://neo4j.com/docs/graph-data-science/current/algorithms/
I will close this issue now. Please feel free to reach out again if you have further questions.
Hi!
I have a graph with 3.3 million nodes and 26 million relationships. I run gds.alpha.shortestPath.stream. If I run it from node id "0" to node id "5" (or "100" or "1000"), everything works quickly. If I run it from node id "0" to node id "3,000,000" (or up to "50,000"), I get an error: "Failed to invoke procedure gds.alpha.shortestPath.stream: Caused by: java.lang.OutOfMemoryError: Java heap space".
My conf settings:
dbms.memory.heap.initial_size=12g
dbms.memory.heap.max_size=12g
dbms.tx_state.max_off_heap_memory=8g
dbms.memory.pagecache.size=4g
There are indexes (on id). Database size = 2G.
FROM MEMREC:
shortestPath from Cypher (not from GDS) on this database doesn't produce errors and runs in a few seconds (from any node to any node).
Is this a problem in the implementation of the algorithm (gds.alpha.shortestPath.stream)?
log:
Thanks