neo4j / graph-data-science-client

A Python client for the Neo4j Graph Data Science (GDS) library
https://neo4j.com/product/graph-data-science/
Apache License 2.0

Get a dataframe of nodeId and nodeLabel in absence of any other node properties #666

Open Hossein-Tohidi opened 1 month ago

Hossein-Tohidi commented 1 month ago

I'm looking for a function that generates a dataframe consisting of nodeId and nodeLabel. When the graph does not have any node properties (only nodeId and nodeLabels are present), I cannot find a way to get the nodeLabels back: nodeProperties.stream returns an empty dataframe. (It works fine when the graph does have node properties.)

nodes_df = gds.graph.nodeProperties.stream(G, list(node_props), listNodeLabels=True)

Additionally, G.node_labels() only produces a list of labels without mapping them to the corresponding nodeIds. I checked the G._graph_info dictionary as well, but the map does not seem to be stored there.

For edges, we have gds.graph.relationships.stream and gds.graph.relationshipProperties.stream, which support retrieving edges with or without properties. However, I couldn't find similar functionalities for nodes.

adamnsch commented 1 month ago

Hi @Hossein-Tohidi,

Thank you for the feature request.

We are aware of this limitation, and there will likely be a new feature that will let you do what you're asking before too long. Until then, would you perhaps be able to project your graph with a dummy node property that has a default value? If all nodes use the default value for a property, virtually no extra memory will be used for that node property, so it should not noticeably affect performance (except slightly when running gds.graph.nodeProperties.stream).

Hope this was helpful, Adam

Hossein-Tohidi commented 1 month ago

Thanks @adamnsch. Unfortunately, that workaround might not work in our wrapper function (a utility function that handles communication with Neo4j/GDS), for two reasons:
1. We don't know in advance whether the user of the wrapper function will request any node properties to be returned (the graph might have been constructed by using GDS functionality directly, not through the wrapper function).
2. Modifying the graph does not seem to be possible, i.e. adding a node property to an already created GDS graph object without recreating it.

Hossein-Tohidi commented 1 month ago

I was able to run one algorithm on the graph like louvain to mutate the graph by adding a node property and at the end drop that property. It might not be ideal but it works.

adamnsch commented 1 month ago

@Hossein-Tohidi I see

> I was able to run one algorithm on the graph like louvain to mutate the graph by adding a node property and at the end drop that property. It might not be ideal but it works.

That's an interesting workaround. If you find that Louvain takes a while to run, you could use a linear-complexity algorithm like degree centrality instead. Anyway, it sounds like you're unblocked for now, which is good. We'll let you know when we launch the aforementioned feature, which will allow you to skip the workaround.

Thanks, Adam
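
For readers landing on this thread, the workaround discussed above could be sketched roughly as follows. This is only an illustration, not an official API: the helper name stream_node_ids_and_labels and the temporary property name __tmp_node_labels are made up for this example, gds is assumed to be a connected graphdatascience.GraphDataScience client, and G an already projected graph object. It uses degree centrality in mutate mode, as suggested, and assumes a GDS version that supports listNodeLabels.

```python
def stream_node_ids_and_labels(gds, G, tmp_property="__tmp_node_labels"):
    """Stream node ids and labels from a graph that has no node properties.

    Hypothetical helper: `gds` is assumed to be a connected GraphDataScience
    client and `G` a projected graph object; `tmp_property` is an arbitrary
    name that must not clash with an existing node property.
    """
    # Add a throwaway node property to the in-memory graph. Degree centrality
    # is cheap, and mutate mode does not write anything to the database.
    gds.degree.mutate(G, mutateProperty=tmp_property)
    try:
        # With at least one property present, the stream is no longer empty,
        # and listNodeLabels=True asks for each node's labels as well.
        return gds.graph.nodeProperties.stream(
            G, [tmp_property], listNodeLabels=True
        )
    finally:
        # Always remove the temporary property so the graph is left unchanged.
        gds.graph.nodeProperties.drop(G, [tmp_property])
```

The result should contain the node ids and labels alongside the throwaway property values, so something like `stream_node_ids_and_labels(gds, G)[["nodeId", "nodeLabels"]]` would keep only the mapping, assuming those are the column names returned by your GDS version. The try/finally is there so the temporary property is dropped even if streaming fails.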