neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Add a property to an in-memory graph #53

Open davidmakovoz opened 4 years ago

davidmakovoz commented 4 years ago

I have an in-memory graph created with the gds.graph.create() procedure, and I want to add a property to some of its nodes. Setting the property on those nodes in the database after the graph has been created doesn't work: the in-memory graph does not pick it up, so I would need to delete the graph and create it again to include the new property.

I see that properties can be added to an existing graph using the mutate flavor of various algorithms, and that a property can be removed from an in-memory graph using the gds.graph.removeNodeProperties() procedure. Wouldn't it be helpful to have a gds.graph.addNodeProperties() procedure as well?

Here is one example of where it would be useful. I run community detection. I create a graph from the existing nodes and their properties, then run mutate several times to add different communityIds, e.g. for louvain, wcc, scc. Then I decide to run labelPropagation and add a new property to my nodes to use as the seedProperty. However, this new property is not in the in-memory graph and cannot be used as the seedProperty. So I can either run on an anonymous graph or create a new graph, but the new graph will not have the communityIds created previously.

It's definitely not a show-stopper, but it would be useful to have.
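To make the sequence concrete, here is a minimal sketch of the scenario as a list of parameterised Cypher statements, built in Python so they can be handed to any driver. The graph name `communities`, the label `Person`, the relationship type `KNOWS`, the `name` lookup property, and the property names `louvainId`, `wccId`, `seed` and `lpaId` are all assumptions made for illustration; only the procedure names and the `mutateProperty` / `seedProperty` options come from the description above.

```python
from typing import Any, Dict, List, Tuple

# A Cypher statement plus its parameters.
Statement = Tuple[str, Dict[str, Any]]


def scenario(graph: str = "communities") -> List[Statement]:
    """Return the statements that lead to the situation described above.

    All names (graph, label, relationship type, properties) are hypothetical.
    """
    return [
        # 1. Project the stored graph into the in-memory graph catalog.
        (
            "CALL gds.graph.create($graph, $label, $relType)",
            {"graph": graph, "label": "Person", "relType": "KNOWS"},
        ),
        # 2. Add community ids to the in-memory graph only (mutate mode).
        (
            "CALL gds.louvain.mutate($graph, {mutateProperty: $prop})",
            {"graph": graph, "prop": "louvainId"},
        ),
        (
            "CALL gds.wcc.mutate($graph, {mutateProperty: $prop})",
            {"graph": graph, "prop": "wccId"},
        ),
        # 3. Set a seed value on some nodes in the database, after projection.
        (
            "MATCH (n:Person) WHERE n.name IN $names SET n.seed = $seed",
            {"names": ["Alice", "Bob"], "seed": 1},
        ),
        # 4. The in-memory graph was projected before step 3, so it has no
        #    'seed' property and cannot use it as the seedProperty.
        (
            "CALL gds.labelPropagation.mutate($graph, "
            "{seedProperty: $seedProp, mutateProperty: $prop})",
            {"graph": graph, "seedProp": "seed", "prop": "lpaId"},
        ),
    ]
```

Each `(cypher, params)` pair could be passed to a driver session, for example `session.run(cypher, params)`; the last statement is the one that cannot find `seed` in the in-memory graph.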

Mats-SX commented 4 years ago

@davidmakovoz Thanks for requesting this feature, and especially for describing the use case. It is very helpful for us to understand the use case when building something like this out.

I agree that the idea makes very good sense. Currently, our API requires specifying up front everything to be read from the Neo4j database; only then can one use the GDS library features. This detaches the in-memory graph from the database, and we want to minimise that detachment as much as possible, as it leads to the sort of situations that you describe. It isn't very dynamic.

I think what is missing is effectively the inverse of gds.graph.writeNodeProperties(). More generally, this is part of a more fluent way of moving things on and off the GDS workbench (or graph catalog) for efficient algorithm access.
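Since the inverse procedure does not exist, one possible round trip is to go through the database: write the mutated properties back with gds.graph.writeNodeProperties(), drop the in-memory graph, and create it again with both the written-back and the newly added properties. The sketch below is an assumption about how that could be scripted, not a workflow confirmed in this thread. The function `rebuild_plan` and the example names are hypothetical, and the procedure names follow the GDS 1.x naming used here (later releases renamed gds.graph.create to gds.graph.project).

```python
from typing import Any, Dict, List, Tuple

# A Cypher statement plus its parameters.
Statement = Tuple[str, Dict[str, Any]]


def rebuild_plan(
    graph: str,
    label: str,
    rel_type: str,
    mutated: List[str],
    added: List[str],
) -> List[Statement]:
    """Plan a write-back, drop and re-create of an in-memory graph.

    mutated: properties that exist only in the in-memory graph
             (e.g. results of earlier mutate calls).
    added:   properties set in the database after the graph was created.
    """
    plan: List[Statement] = []
    if mutated:
        # Persist in-memory results so they survive dropping the graph.
        plan.append(
            (
                "CALL gds.graph.writeNodeProperties($graph, $props)",
                {"graph": graph, "props": list(mutated)},
            )
        )
    # Remove the stale projection from the graph catalog.
    plan.append(("CALL gds.graph.drop($graph)", {"graph": graph}))
    # Re-create it, loading the old and the new properties together.
    plan.append(
        (
            "CALL gds.graph.create($graph, $label, $relType, "
            "{nodeProperties: $props})",
            {
                "graph": graph,
                "label": label,
                "relType": rel_type,
                "props": list(mutated) + list(added),
            },
        )
    )
    return plan
```

For the community detection example, `rebuild_plan("communities", "Person", "KNOWS", ["louvainId", "wccId"], ["seed"])` yields three statements to run in order. The cost is a full re-projection, which is exactly the recomputation the requested procedure would avoid.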

Mats-SX commented 4 years ago

And perhaps I should add that I think the current situation is inconvenient, but acceptable. I'm imagining that in situations such as in the example you gave, one would first prototype and use small data scales where recomputation is not an issue, and once one starts finalising and building everything together at full scale there would usually not be a lot of new surprises that one has to go back to the start to reconfigure and recompute for. This isn't a great argument, of course, but it allows users to succeed (and I understand that you would agree).

austin-InDro commented 2 years ago

Hello, I'm currently facing an issue like this with Neo4j version 4.4.9 (community).

Will this feature be implemented anytime soon? Rather than adding a single property, I was hoping for a "re-update graph" feature for cases where multiple nodes and relationships have been added since the graph was last used in an algorithm.