neo4j / graph-data-science-client

A Python client for the Neo4j Graph Data Science (GDS) library
https://neo4j.com/product/graph-data-science/
Apache License 2.0

Idempotent pageRank #721

Open Mintactus opened 2 months ago

Mintactus commented 2 months ago

The pageRank algo from the Python client is not idempotent.

I got this error: neo4j.exceptions.ClientError: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure gds.pageRank.mutate: Caused by: java.lang.UnsupportedOperationException: Node property pageRank already exists}

Could we have a replace option?

Also

knutwalker commented 2 months ago

Hi @Mintactus,

None of our mutate algorithm procedures have an option to overwrite existing properties, and that is by design. New property values, even when produced by the same algorithm on the same data, are not necessarily the same, either by value or semantically. So we can't assume that overwriting is safe, and we deliberately make you jump through a hoop to get that outcome.

In order to overwrite a property, you have to first drop it and then run the mutate algorithm again:

CALL gds.graph.nodeProperties.drop('g', 'property')
CALL gds.<algo>.mutate('g', {'mutateProperty': 'property', ...})
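
For anyone hitting this from Python, here is a minimal sketch of the same drop-then-mutate sequence wrapped in one function. `rerun_mutate` is a hypothetical helper and not part of the client. It only assumes you can hand it a callable that takes a Cypher string and a parameter dict (the client's `gds.run_cypher` should fit, but treat that as an assumption and check it against your version). The `drop_first` flag is also just an illustration: since the drop has no failIfMissing option, it is expected to fail when the property does not exist yet.

```python
import re
from typing import Any, Callable, Dict, Optional


# Hypothetical helper, not part of the GDS Python client API.
def rerun_mutate(
    run_cypher: Callable[[str, Dict[str, Any]], Any],
    graph_name: str,
    algo: str,
    mutate_property: str,
    config: Optional[Dict[str, Any]] = None,
    drop_first: bool = True,
) -> Any:
    # Procedure names cannot be passed as Cypher parameters, so validate
    # the algorithm name before interpolating it into the query text.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9.]*", algo):
        raise ValueError(f"unexpected algorithm name: {algo!r}")

    if drop_first:
        # Step 1: remove the property left behind by the previous run.
        # Assumption: this fails if the property is missing, so pass
        # drop_first=False on the very first run.
        run_cypher(
            "CALL gds.graph.nodeProperties.drop($graph, $property)",
            {"graph": graph_name, "property": mutate_property},
        )

    # Step 2: run the mutate procedure again with the same property name.
    mutate_config = dict(config or {}, mutateProperty=mutate_property)
    return run_cypher(
        f"CALL gds.{algo}.mutate($graph, $config)",
        {"graph": graph_name, "config": mutate_config},
    )
```

With a connected client object named `gds` (an assumption here), a call would look like `rerun_mutate(gds.run_cypher, "g", "pageRank", "pageRank")`. Keeping the drop explicit matches the intent above: overwriting stays a deliberate step rather than a silent default.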

I'm gonna leave this issue open for now, since we could add a failIfMissing option to the config for nodeProperties.drop, to align it with graph.drop.