neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Support source and target node type specification for link prediction #192

Closed devineyfajr closed 1 year ago

devineyfajr commented 2 years ago

Is your feature request related to a problem? Please describe.

Sometimes we need to predict links between two different node types, say between :Identity nodes and :Organization nodes. Who belongs to an organization? Currently we can create a graph that has all the Identity and Organization nodes, and all the (i:Identity)-[r:BELONGS_TO]->(o:Organization) relationships, but it appears that when negative edges are sampled, they include (O,I), (O,O), and (I,I) pairs, when we really only want (I,O) pairs.

Also, in the prediction phase, it appears predictions are made for all four combinations, instead of just (I,O) pairs. This unnecessarily increases runtime.
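
To give a rough sense of the runtime concern, here is a small back-of-the-envelope sketch. It assumes a predictor that scores every ordered pair of distinct nodes, which may not match what GDS actually does internally. The function name and the node counts are made up for illustration.

```python
def candidate_pair_counts(num_sources: int, num_targets: int) -> dict:
    """Compare candidate pair counts with and without type restriction.

    Assumption (not taken from GDS): the unrestricted predictor scores
    every ordered pair of distinct nodes in the graph.
    """
    total_nodes = num_sources + num_targets
    # Every ordered pair (a, b) with a != b, regardless of node type.
    all_ordered_pairs = total_nodes * (total_nodes - 1)
    # Only (source, target) pairs, e.g. (Identity, Organization).
    typed_pairs = num_sources * num_targets
    return {"all_ordered_pairs": all_ordered_pairs, "typed_pairs": typed_pairs}


# Hypothetical graph: 1000 Identity nodes and 50 Organization nodes.
counts = candidate_pair_counts(1000, 50)
print(counts)
```

Under that assumption, restricting to (I,O) pairs cuts the candidate set from about 1.1 million pairs to 50 thousand in this made-up example.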

Describe the solution you would like

Add sourceNodeTypes and targetNodeTypes variables somewhere in the pipeline configuration. They could default to all node types if not specified. Then modify the negative sampling and prediction routines to use them.
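
As a minimal sketch of the idea, not of how GDS implements it, type-restricted negative sampling could look roughly like this. All names here (`sample_negative_pairs`, `nodes_by_label`, and so on) are hypothetical, and enumerating every candidate pair is only practical for small graphs.

```python
import random


def sample_negative_pairs(nodes_by_label, positive_edges, source_label,
                          target_label, num_samples, seed=42):
    """Sample (source, target) pairs that are not existing relationships.

    Illustrative only: sources come from `source_label` and targets from
    `target_label`, so (O,I), (O,O) and (I,I) pairs are never produced.
    """
    positives = set(positive_edges)
    # Enumerate every typed pair that is not already a positive example.
    candidates = [
        (s, t)
        for s in nodes_by_label[source_label]
        for t in nodes_by_label[target_label]
        if s != t and (s, t) not in positives
    ]
    rng = random.Random(seed)
    # Never ask for more negatives than there are candidates.
    return rng.sample(candidates, min(num_samples, len(candidates)))


# Hypothetical toy graph.
nodes_by_label = {
    "Identity": ["i1", "i2", "i3"],
    "Organization": ["o1", "o2"],
}
positive_edges = [("i1", "o1"), ("i2", "o2")]

negatives = sample_negative_pairs(
    nodes_by_label, positive_edges, "Identity", "Organization", num_samples=3
)
print(negatives)
```

Defaulting both labels to "all node types" would reduce this to the current behaviour, which is why the defaults suggested above seem like a safe choice.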

Describe alternatives you have considered

Additional context

adamnsch commented 2 years ago

Hi @devineyfajr,

This is a nice feature request, thank you. We have discussed this internally too, and it's in our backlog. Hopefully we can get to it soon! We will keep you posted.

Adam

adamnsch commented 2 years ago

Hi @devineyfajr,

Just to give you an update, this feature will be included in the upcoming GDS 2.2 release.

Adam

devineyfajr commented 2 years ago

Cool



gurugecl commented 2 years ago

Awesome @adamnsch. I could really use this feature right now as well. Is there an approximate date for when the GDS 2.2 update will take place?

FlorentinD commented 2 years ago

@gurugecl The current estimate is the end of September.

FlorentinD commented 1 year ago

@gurugecl 2.2.0 is now released :)