neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Link prediction pipeline for heterogeneous graphs #215

Closed meg261995 closed 1 year ago

meg261995 commented 1 year ago

Does the link prediction machine learning pipeline work for heterogeneous graphs?

My use case is to predict relationships between nodes of different types. My graph looks something like this: (:session)-[:contains]->(:order), (:customer)-[:has]->(:session), (:order)-[:has]->(:product), (:order)-[:to]->(:relation). Many customers have placed orders. Some orders specify for whom the order was intended (relation), e.g., mother or father, and some do not. For the latter, I want to predict for whom the order was likely intended.

So does the link prediction machine learning pipeline of Neo4j GDS work for heterogeneous graphs?

Mats-SX commented 1 year ago

Hello @meg261995 and thank you for reaching out to us!

The LinkPrediction algorithm itself doesn't have any specific support for this kind of use case, in being able to tell different relationship types apart from each other. The algorithm sees any node pair as a potential link that it should consider for prediction.

However, you can encode differences for different relationship types within a node embedding. The GraphSAGE algorithm for example has support for this. In this way, the embeddings for the node pairs could hold information that helps the LP algorithm tell different nodes apart. Still, it could be the case that the embedding is insufficient, and LP still would suggest "the wrong" kind of links.

This applies to our latest version, 2.1. But we are hard at work on this very topic, and will offer enhanced LinkPrediction control features in our upcoming version 2.2. There, we plan to offer you control over what type of relationship the LP algorithm will learn to predict, while still allowing node property steps in the pipeline, such as embeddings, to take a richer model into account. You can check out our alpha releases if you want early access and leave feedback on the in-development features. You can download our latest alpha release 2.2.0-alpha02 via this link.
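To illustrate the difference between treating every node pair as a potential link and restricting prediction to one relationship type, here is a minimal Python sketch. It is not how GDS implements link prediction. The toy graph (borrowed from the schema in the question) and the helper names `all_candidate_pairs` and `filtered_candidate_pairs` are made up for illustration only.

```python
from itertools import combinations

# Hypothetical toy graph: node id -> label, mirroring the schema in the question.
nodes = {
    "c1": "customer",
    "s1": "session",
    "o1": "order",
    "o2": "order",
    "r_mother": "relation",
    "r_father": "relation",
}

# Existing relationships as (source, type, target) triples.
relationships = [
    ("c1", "has", "s1"),
    ("s1", "contains", "o1"),
    ("s1", "contains", "o2"),
    ("o1", "to", "r_mother"),
]


def all_candidate_pairs(nodes, relationships):
    """Every unordered node pair that is not already connected, ignoring labels."""
    linked = {frozenset((s, t)) for s, _, t in relationships}
    return [p for p in combinations(sorted(nodes), 2) if frozenset(p) not in linked]


def filtered_candidate_pairs(nodes, relationships, rel_type, source_label, target_label):
    """Only (source_label, target_label) pairs not already linked by rel_type."""
    linked = {(s, t) for s, r, t in relationships if r == rel_type}
    return [
        (s, t)
        for s in sorted(nodes) if nodes[s] == source_label
        for t in sorted(nodes) if nodes[t] == target_label
        if (s, t) not in linked
    ]


unfiltered = all_candidate_pairs(nodes, relationships)
filtered = filtered_candidate_pairs(nodes, relationships, "to", "order", "relation")
print(len(unfiltered), "unfiltered candidates")
print(filtered)
```

Without a filter, pairs such as customer-product or session-relation are also candidates, which is how "the wrong" kind of links can end up being suggested. With the filter, only order-to-relation pairs remain.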

All the best Mats

meg261995 commented 1 year ago

Thanks a lot. I will surely get back to you.

meg261995 commented 1 year ago

Is version 2.2.0 compatible with Neo4j 4.4.2? After I placed the jar file in the plugins folder and restarted Neo4j, the status says it's running, but for some reason it is not. I cannot access Neo4j. It was running fine before this.

Edit: It was a compatibility issue and I had to roll back to the previous version of GDS. Could you please let me know the version compatibility for GDS 2.2.0-alpha02?

adamnsch commented 1 year ago

Hi @meg261995,

There shouldn't be a compatibility issue between GDS 2.2.0-alpha02 and Neo4j 4.4.2: https://neo4j.com/docs/graph-data-science/2.2-preview/installation/supported-neo4j-versions/

What do you see in the database logs?

meg261995 commented 1 year ago

The logs are filled with a lot of information, and I was not sure what to search for. If you could tell me what to look for, I will check again. However, I found this: ERROR [o.n.b.t.p.ProtocolHandshaker] Fatal error occurred during protocol handshaking:

adamnsch commented 1 year ago

It's a bit hard to make any conclusions without more context. Would you be able to provide more of the log?

meg261995 commented 1 year ago

Exactly, but there is nothing eye-catching other than this :( I'm now trying to install and work on Neo4j Desktop. I will get back to you if it doesn't work there either. Thank you so much for the answers. Much appreciated :)

Mats-SX commented 1 year ago

@meg261995 While I agree with Adam that 4.4.2 "should" work fine, it is a pretty old patch (8-9 months old). I recommend that you upgrade to Neo4j version 4.4.10, which contains many bug fixes for the database. In the compatibility testing we do with GDS and Neo4j versions, we try to always upgrade to the latest version, so we may be unaware of incompatibilities with older patches. It is generally a good idea to keep up with patches of the database for security reasons, but also to avoid hitting particular bugs that may already be fixed.

meg261995 commented 1 year ago

Yes sir, while the server was 4.4.1, my Desktop version is 4.4.9 and it is working fine with GDS 2.2.0 as of now. Thank you:)

Mats-SX commented 1 year ago

@meg261995 That's great! If you want to try out the steaming fresh heterogeneity support for Link Prediction pipelines, you can find our preview documentation for 2.2 here: https://neo4j.com/docs/graph-data-science/2.2-preview/machine-learning/linkprediction-pipelines/training/#linkprediction-pipeline-examples-train-filtering

If you have any feedback on things that don't work, or that do work, or that you think should work differently, please just reach back here. We may close this issue, but you can still reply to it, or you can open new issues.

Thanks for using GDS! Regards Mats

meg261995 commented 1 year ago

Yes definitely. Thank you so much

adamnsch commented 1 year ago

Great! Will close this then :)

meg261995 commented 1 year ago

Hello again,

I have started working with the pipeline, following the procedures described in the LP pipeline documentation. All the steps are neat and clear except for the 'Configuring the relationship split' part. I am having real difficulty understanding how it works without an illustration on an example graph. Example code is given, but there is no explanation. It would be really helpful if you could give an example or a reference.
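For readers who land here with the same question, the rough idea behind the split can be sketched in plain Python. This is an illustration under stated assumptions, not the GDS implementation: it assumes that a test fraction is taken from all relationships first, that a train fraction is then taken from what remains, and that the rest is kept as the feature-input set used for computing node properties. Negative sampling and validation folds are left out, and the function name `split_relationships` and the toy links are hypothetical.

```python
import random


def split_relationships(relationships, test_fraction, train_fraction, seed=0):
    """Split positive relationships into test, train and feature-input sets.

    Assumption: test_fraction applies to all relationships, and train_fraction
    applies to what is left after the test set has been removed.
    """
    shuffled = list(relationships)
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    rest = shuffled[n_test:]

    n_train = int(len(rest) * train_fraction)
    train = rest[:n_train]
    feature_input = rest[n_train:]
    return test, train, feature_input


# 100 hypothetical (order, relation) links.
links = [(f"o{i}", f"r{i % 4}") for i in range(100)]
test, train, feature_input = split_relationships(
    links, test_fraction=0.2, train_fraction=0.5
)
print(len(test), len(train), len(feature_input))
```

With these numbers, 20 links are held out for testing, 40 are used for training, and the remaining 40 stay in the graph for feature computation. The three sets do not overlap, so the model is never evaluated on a link it could already see as a feature.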

adamnsch commented 1 year ago

Hey! Could you open a new issue with this request since it's not related to the original issue that was resolved (so that it's easier to find for other users with similar problems)?

Thanks, Adam

meg261995 commented 1 year ago

Hi,

Yes, I have opened a new issue for this. Thanks