neo4j-contrib / neo4j-graph-algorithms

Efficient Graph Algorithms for Neo4j
https://github.com/neo4j/graph-data-science/
GNU General Public License v3.0
769 stars 195 forks source link

Neo4j crashes after repeated PageRank calls #860

Closed jindrazak closed 5 years ago

jindrazak commented 5 years ago

Using Java driver I want to run Pagerank repeatedly in a loop. However, after certain number of iterations neo4j crashes.

Steps to reproduce:

  1. create single node
  2. run PageRank repeatedly in an infinite loop
  3. after certain number of iterations neo4j crashes

Here is my Java class:

import org.neo4j.driver.v1.*;

public class NeoTest implements AutoCloseable {

    private final Driver driver;

    public NeoTest( String uri, String user, String password ) {
        driver = GraphDatabase.driver( uri, AuthTokens.basic( user, password ) );
    }

    public static void main(String[] args) throws Exception {
        try (NeoTest finder = new NeoTest("bolt://localhost:7687", "neo4j", "password")) {
            finder.prepareData();
            finder.performPageRanks();
        }
    }

    public void prepareData()
    {
        try ( Session session = driver.session() )
        {
            String deleteQuery = "MATCH (n) DETACH DELETE n";
            session.run(deleteQuery);
            String createQuery = "MERGE (n:MyNode)";
            session.run(createQuery);
        }
    }

    public void performPageRanks()
    {
        try ( Session session = driver.session() )
        {
            String query =  "MATCH (subject:MyNode) " +
                            "CALL algo.pageRank.stream(" +
                                "'MyNode'," +
                                "'LINKS'," +
                                "{iterations:40, dampingFactor:0.85, sourceNodes: [subject]}" +
                            ") " +
                            "YIELD nodeId, score " +
                            "RETURN algo.getNodeById(nodeId) as res, score " +
                            "ORDER BY score DESC";

            int total = 0;
            while(true){
                session.run(query);
                total++;
                if(total % 50 == 0){
                    System.out.println("Number of iterations: " + total);
                }
            }
        }
    }

    @Override
    public void close() throws Exception {
        driver.close();
    }
}

Java log:

Mar 28, 2019 12:27:03 PM org.neo4j.driver.internal.logging.JULogger info
INFO: Direct driver instance 410495873 created for server address localhost:7687
Number of iterations: 50
Number of iterations: 100
Number of iterations: 150
Number of iterations: 200
Number of iterations: 250
Number of iterations: 300
Number of iterations: 350
Number of iterations: 400
Number of iterations: 450
Number of iterations: 500
Number of iterations: 550
Number of iterations: 600
Number of iterations: 650
Number of iterations: 700
Number of iterations: 750
Number of iterations: 800
Number of iterations: 850
Number of iterations: 900
Number of iterations: 950
Number of iterations: 1000
Mar 28, 2019 12:27:17 PM org.neo4j.driver.internal.logging.JULogger info
INFO: Closing driver instance 410495873
Mar 28, 2019 12:27:17 PM org.neo4j.driver.internal.logging.JULogger info
INFO: Closing connection pool towards localhost:7687
Exception in thread "main" org.neo4j.driver.v1.exceptions.ServiceUnavailableException: Connection to the database terminated. This can happen due to network instabilities, or due to restarts of the database
    at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:122)
    at org.neo4j.driver.internal.NetworkSession.run(NetworkSession.java:94)
    at org.neo4j.driver.internal.NetworkSession.run(NetworkSession.java:76)
    at org.neo4j.driver.internal.AbstractStatementRunner.run(AbstractStatementRunner.java:41)
    at org.neo4j.driver.internal.AbstractStatementRunner.run(AbstractStatementRunner.java:77)
    at NeoTest.performPageRanks(NeoTest.java:45)
    at NeoTest.main(NeoTest.java:14)
    Suppressed: org.neo4j.driver.internal.util.ErrorUtil$InternalExceptionCause
        at org.neo4j.driver.internal.util.ErrorUtil.newConnectionTerminatedError(ErrorUtil.java:50)
        at org.neo4j.driver.internal.util.ErrorUtil.newConnectionTerminatedError(ErrorUtil.java:43)
        at org.neo4j.driver.internal.async.inbound.ChannelErrorHandler.channelInactive(ChannelErrorHandler.java:76)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
        at org.neo4j.driver.internal.shaded.io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
        at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
        at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
        at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
        at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
        at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
        at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
        at org.neo4j.driver.internal.shaded.io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1010)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
        at org.neo4j.driver.internal.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1409)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
        at org.neo4j.driver.internal.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:927)
        at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
        at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
        at org.neo4j.driver.internal.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
        at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
        at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:834)

Process finished with exit code 1

Neo4j logs:

debug.log: https://pastebin.com/5EeZwKep neo4j.log: https://pastebin.com/DwAYnw6P

Mats-SX commented 5 years ago

Hello @jindrazak and thank you for reporting this issue.

I agree that Neo4j crashing like this is undesirable behaviour, and would probably label this as a bug. However, the usage pattern that you're exemplifying is not one that we're optimising for. The library is mainly intended to be used for large graphs where executing the algorithm is a substantial effort, and something typically is done only a very small number of times.

If I'm reading your setup correctly, it seems as though the algorithms was successfully executed at least 1000 times before crashing, which I would label as good enough. We don't expect users to run the algorithm that many times in lockstep. If it would crash at 10 or 50 subsequent runs, then that would be worse, but after 1000 runs I think this doesn't reach highest priority.

Are you seeing this behaviour for much fewer runs if your graph is larger?

jindrazak commented 5 years ago

Ok, so I generated more nodes and relationships to my graph using this simple algorithm:

int nodes = 10;
//create nodes
for (int i = 0; i < nodes; i++) {
    String createQuery = "CREATE (x:MyNode{id:" + i + "})";
    session.run(createQuery);
}
//create relationships
for (int i = 0; i < nodes - 1; i++) {
    String createQuery = "MATCH (a {id:" + i + "}), (b {id:" + (i + 1) + "}) CREATE (a)-[:rel]->(b)";
    session.run(createQuery);
}

It creates a graph that looks like this: (a)--(b)--(c)-- ... --(y)--(z)

Then, I made a couple of measurements about how many iterations it can process before it crashes. Here is the data:

Number of nodes Number of PR iterations before crash
1 1040
2 521
3 351
4 266
5 213
6 175
7 150
8 132
9 117
10 105
jjaderberg commented 5 years ago

@jindrazak Where do you see that Neo4j crashes? The log shows that your connections are being terminated, but that is likely because of how you use the driver. You run a bunch of queries but you never consume the results. At the very least you should call StatementResult#consume(). See the results section in the driver manual. Also, you should use transaction functions instead of auto-commit transactions, to get automatic retries for example.

What are you trying to accomplish anyway? I don't see what the use case is for running pagerank hundreds or thousands of times for 1-10 node graphs.