neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Using GDS in Neo4j community edition embedded server mode #165

Closed TheTeethOfTheHydra closed 2 years ago

TheTeethOfTheHydra commented 2 years ago

I am attempting to use Neo4j 4.4.3 in embedded mode with GDS 1.8.2 and am encountering a problem. Namely, a simple test call to 'gds.graph.create' does not appear to create a named graph to which I can subsequently apply algorithms.

I am using JDK 11.0.14 on CentOS 7. Here is the relevant portion of the POM for my Neo4j server application:

        <dependency>
            <groupId>org.neo4j</groupId>
            <artifactId>neo4j</artifactId>
            <version>4.4.3</version>
        </dependency>
        <dependency>
            <groupId>org.neo4j.gds</groupId>
            <artifactId>proc-common</artifactId>
            <version>1.8.2</version>
        </dependency>
        <dependency>
            <groupId>org.neo4j.gds</groupId>
            <artifactId>algo-common</artifactId>
            <version>1.8.2</version>
        </dependency>
        <dependency>
            <groupId>org.neo4j.gds</groupId>
            <artifactId>core</artifactId>
            <version>1.8.2</version>
        </dependency>

Here is the code that I believe initializes a new neo4j database, registers GDS and performs basic tests of a non-GDS transaction and then two GDS calls.

        final DatabaseManagementService dbms;
        final GraphDatabaseService graphDb;
        final String embeddedGraphDbFilepath = "/db/live-db/graph.db";

        File graphDbFile = new File(embeddedGraphDbFilepath);

        Map<String, String> settings = new HashMap<>();
        settings.put("dbms.logs.debug.level", "DEBUG");
        settings.put("dbms.tx_log.rotation.retention_policy", "false");
        settings.put("dbms.security.procedures.unrestricted", "jwt.security.*,gds.*,apoc.*");
        settings.put("dbms.security.procedures.allowlist", "gds.*");
        settings.put("dbms.databases.writable", "neo4j");
        dbms = new DatabaseManagementServiceBuilder(graphDbFile.toPath())
                .setConfigRaw(settings)
                .build();
        graphDb = dbms.database("neo4j");

        final GlobalProcedures proceduresRegistry = ((GraphDatabaseAPI)graphDb).getDependencyResolver().resolveDependency(GlobalProcedures.class);

        final Set<Class<? extends BaseProc>> procedures = new Reflections("org.neo4j.graphalgo").getSubTypesOf(BaseProc.class);
        procedures.addAll(new Reflections("org.neo4j.gds.embeddings").getSubTypesOf(BaseProc.class));
        procedures.addAll(new Reflections("org.neo4j.gds.paths").getSubTypesOf(BaseProc.class));
        procedures.add(GraphDropProc.class);
        procedures.add(GraphListProc.class);
        procedures.add(GraphCreateProc.class);

        for(Class<? extends BaseProc> procedureClass : procedures)
            proceduresRegistry.registerProcedure(procedureClass, true);

        final Class[] functionsToRegister =
        {
            AsNodeFunc.class,
            NodePropertyFunc.class,
            VersionFunc.class
        };

        for(Class f : functionsToRegister)
            proceduresRegistry.registerFunction(f, true);

        try(Transaction tx = graphDb.beginTx())
        {
                tx.execute("CREATE (a:TestNode {testNumber: 1})");
                tx.commit();
        }

        try(Transaction tx = graphDb.beginTx())
        {
                Result resultSet = tx.execute("CALL gds.graph.create(\"dvGraph\", [\"WikipediaArticle\"], [\"References\"])");
                System.out.println("Execution type: " + resultSet.getQueryExecutionType().toString());
                tx.commit();
        }

        try(Transaction tx = graphDb.beginTx())
        {
                Result resultSet = tx.execute("CALL gds.graph.list() YIELD graphName");
                while(resultSet.hasNext())
                        System.out.println("Found graph: " + resultSet.next().get("graphName"));
        }

        dbms.shutdown();

Running the above code produces no errors, but it also lists no named graph (I expected dvGraph, the name passed to the create call). The query execution type reported for gds.graph.create is "READ_ONLY", which seems unusual. The gds.graph.create call returns in only a few milliseconds, whereas running the same command in the built-in server mode of Neo4j Community Edition 4.4.3 against the same graph database takes perhaps 20 seconds and results in a named graph that appears in the gds.graph.list output and is usable by GDS algorithms.

Since there doesn’t seem to be any documentation on the procedure to enable GDS in Neo4j embedded mode, can you either provide the procedure or review the above and suggest what might be going wrong?

vnickolov commented 2 years ago

@TheTeethOfTheHydra thank you for reporting this.

Please check this closed issue: https://github.com/neo4j/graph-data-science/issues/91. It seems to cover what you are trying to do.

TheTeethOfTheHydra commented 2 years ago

I agree. Not only did I review that issue, I also applied what I learned from it in my own attempt to get GDS working. But as I've reported, the versions of Neo4j and GDS I am using do not throw any exceptions, yet gds.graph.create does not appear to do anything. Any thoughts on why that would be?

Mats-SX commented 2 years ago

@TheTeethOfTheHydra If it seems to be doing nothing at all, it could be that you need to pull the results from the result stream before you commit the transaction. Try doing it like this:

        try(Transaction tx = graphDb.beginTx())
        {
                Result resultSet = tx.execute("CALL gds.graph.create(\"dvGraph\", [\"WikipediaArticle\"], [\"References\"])");
                System.out.println("Execution type: " + resultSet.getQueryExecutionType().toString());
                System.out.println(resultSet.resultAsString());
                tx.commit();
        }

Maybe that makes a difference? At the very least, could you provide the output that this extra print gives you?
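
If printing the whole result is not wanted, the same "pull everything before commit" idea can be sketched as a small helper that drains any row iterator. This is only a sketch: `ResultDrain` and `drain` are hypothetical names, and the demo uses stand-in data rather than a real database. In the Neo4j 4.x Java API, `Result` is, as far as I know, an iterator over `Map<String, Object>` rows, so it should be accepted as-is, but please verify that against your version.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Neo4j or GDS.
final class ResultDrain {
    private ResultDrain() {}

    /** Pulls every remaining row from the iterator and returns the rows in order. */
    static List<Map<String, Object>> drain(Iterator<Map<String, Object>> rows) {
        List<Map<String, Object>> collected = new ArrayList<>();
        while (rows.hasNext()) {
            collected.add(rows.next());
        }
        return collected;
    }

    public static void main(String[] args) {
        // Stand-in rows; in embedded Neo4j this would be the Result
        // returned by tx.execute(...), drained before tx.commit().
        List<Map<String, Object>> fake = List.of(
                Map.<String, Object>of("graphName", "dvGraph", "nodeCount", 3L));
        List<Map<String, Object>> rows = drain(fake.iterator());
        System.out.println(rows.size() + " row(s), graphName = " + rows.get(0).get("graphName"));
    }
}
```

Inside the transaction this would be called as `ResultDrain.drain(resultSet)` on the line before `tx.commit()`.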

TheTeethOfTheHydra commented 2 years ago

I'm happy to report that this addition worked; the query took approximately 35 seconds to run. From this, I infer that the problem was that I did not include a YIELD clause on the CALL and/or iterate through the results, which I gather has the effect of deferring execution (and nothing in that transaction scope subsequently triggered it). Much appreciated. This would be a great detail to add to any embedded-mode documentation: I guess the browser-based app bundled with server mode must consume the results under the covers, since the same gds.graph.create CALL with no YIELD still builds the named graph there. Thanks so much!
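
The deferral described in this comment can be modelled in plain Java. The sketch below is an analogy only: `DeferredRows` is a hypothetical class, not Neo4j's implementation, and it simply assumes that the expensive work is postponed until the first row is requested. It shows why creating a result and then committing without touching it would leave the work undone.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Hypothetical model of a lazily evaluated result; not Neo4j's implementation. */
final class DeferredRows<T> implements Iterator<T> {
    private final Supplier<List<T>> work;
    private Iterator<T> delegate; // stays null until the work has actually run

    DeferredRows(Supplier<List<T>> work) {
        this.work = work;
    }

    /** True once the supplied work has been executed. */
    boolean hasRun() {
        return delegate != null;
    }

    @Override
    public boolean hasNext() {
        if (delegate == null) {
            delegate = work.get().iterator(); // the expensive step happens here, on first pull
        }
        return delegate.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return delegate.next();
    }

    public static void main(String[] args) {
        DeferredRows<String> rows = new DeferredRows<>(() -> List.of("dvGraph"));
        System.out.println("ran before iteration: " + rows.hasRun());
        while (rows.hasNext()) {
            System.out.println("row: " + rows.next());
        }
        System.out.println("ran after iteration: " + rows.hasRun());
    }
}
```

In this model, constructing the object is cheap and does nothing observable, which matches the few-millisecond "create" seen in the original report; only iterating triggers the work.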

vnickolov commented 2 years ago

@TheTeethOfTheHydra thank you for the update, I am closing this issue now.

Mats-SX commented 2 years ago

@TheTeethOfTheHydra Please consult the official Neo4j documentation resources to learn more on this topic. For example, this is a good page to read: https://neo4j.com/docs/java-reference/current/java-embedded/cypher-java/