neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/
Other
630 stars 161 forks source link

Cypher projection ignores orientation? #113

Closed maorzsh closed 1 year ago

maorzsh commented 3 years ago

Background It seems that a cypher-projected undirected graph is treated as if it was directed. I'm new to Neo4j so I can't tell for sure if it's a bug or I'm doing something wrong.

Environment

Reproduction

  1. Create graph (star graph centered at o1)
CREATE
  (o1:Order {name: 'o1'}),
  (o2:Order {name: 'o2'}),
  (o3:Order {name: 'o3'}),
  (o4:Order {name: 'o4'}),
  (o5:Order {name: 'o5'}),

  (o1)-[:LINKED]->(o2),
  (o3)-[:LINKED]->(o1),
  (o4)-[:LINKED]->(o1),
  (o5)-[:LINKED]->(o1);
MATCH (n)-[r]-(m)
RETURN *;

image

  1. Project undirected using native.
CALL gds.graph.create(
    'native-projection',
    'Order',
    {LINKED: {orientation: 'UNDIRECTED'}}
);
  1. Project undirected using cypher according to the documentation.
CALL gds.graph.create.cypher(
    'cypher-projection',
    'MATCH (orders:Order)
     RETURN id(orders) AS id, labels(orders) AS labels',
    'MATCH (order1:Order)-[r:LINKED]-(order2:Order)
     RETURN id(order1) AS source, id(order2) AS target, type(r) AS type'
);
  1. Compute betweenness centrality, it should be 6 for o1 node and 0 for the rest. The native projection seems to be ok:
CALL gds.betweenness.stream(
    'native-projection',
    {nodeLabels:        ['Order'],
     relationshipTypes: ['LINKED']
    })
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score;

+------+-------+
| name | score |
+------+-------+
| "o1" | 6.0   |
+------+-------+
| "o2" | 0.0   |
+------+-------+
| "o3" | 0.0   |
+------+-------+
| "o4" | 0.0   |
+------+-------+
| "o5" | 0.0   |
+------+-------+

However, the cypher projection returns a different result, more specifically, it seems it treats the edges as directed edges, e.g. o2->o1->o3 and o3->o1->o2 are counted as two distinct paths even though I asked for an undirected graph.

CALL gds.betweenness.stream(
    'cypher-projection',
    {nodeLabels:        ['Order'],
     relationshipTypes: ['LINKED']
    })
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score;

+------+-------+
| name | score |
+------+-------+
| "o1" | 12.0  |
+------+-------+
| "o2" | 0.0   |
+------+-------+
| "o3" | 0.0   |
+------+-------+
| "o4" | 0.0   |
+------+-------+
| "o5" | 0.0   |
+------+-------+
Mats-SX commented 3 years ago

@RiskyMaor You are right that a Cypher-projected graph is always directed. We do not inspect the provided queries (as these may be arbitrarily complex) but we only consume the result of the queries and build a graph based off that. As such we cannot guarantee that the graph is or is not UNDIRECTED but we simply take the relationships we get as standard directed relationships.

You are also right that we claim otherwise in the documentation through

The cypher projection can achieve the same feature by adjusting the MATCH clause of the relationship query.

Indeed, it is not quite the same because the GDS algorithm does not have access to the metadata that the projected graph is UNDIRECTED (the metadata says it's NATURAL, like all Cypher-projected graphs). So any and all algos that react to this metadata will not behave the same way.

It is still accurate that by removing the direction from a (simple) Cypher query will achieve the effect of UNDIRECTED by way of projecting twice as many relationships.

I will take this to the team and we'll make some kind of correction here to make this difference more clear.

Thanks a lot for raising this issue!

All the best Mats

maorzsh commented 3 years ago

Thanks for the prompted and detailed response!

FlorentinD commented 1 year ago

With 2.3 we introduced a undirectedRelationshipTypes parameter which allows you to load undirected graphs (https://neo4j.com/docs/graph-data-science/current/management-ops/projections/graph-project-cypher-aggregation/#_undirected_relationships). However, this is only supported for the newer Cypher Aggregation and Native Projection.

Alternatively, you can still use the old Cypher projection and convert the relationships to undirected (see https://neo4j.com/docs/graph-data-science/current/graph-catalog-relationship-ops/#catalog-graph-relationship-to-undirected-example).