neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Get relationship IDs out of a projection #287

Open guangchen811 opened 9 months ago

guangchen811 commented 9 months ago

**Is your feature request related to a problem? Please describe.**
After running a random-walk sampling algorithm such as `gds.graph.sample.rwr`, I want to find out which relationships were visited. The following example clarifies what I want.

  1. Create an example graph:

    CREATE
    (alice:Buyer {name: 'Alice'}),
    (instrumentSeller:Seller {name: 'Instrument Seller'}),
    (bob:Buyer {name: 'Bob'}),
    (carol:Buyer {name: 'Carol'}),
    (alice)-[:PAYS {amount: 1.0}]->(instrumentSeller),
    (alice)-[:PAYS {amount: 2.0}]->(instrumentSeller),
    (alice)-[:PAYS {amount: 3.0}]->(instrumentSeller),
    (alice)-[:PAYS {amount: 4.0}]->(instrumentSeller),
    (alice)-[:PAYS {amount: 5.0}]->(instrumentSeller),
    (alice)-[:PAYS {amount: 6.0}]->(instrumentSeller),
    (bob)-[:PAYS {amount: 3.0}]->(instrumentSeller),
    (bob)-[:PAYS {amount: 4.0}]->(instrumentSeller),
    (carol)-[:PAYS {amount: 5.0}]->(bob),
    (carol)-[:PAYS {amount: 6.0}]->(bob)

  2. Project it:

    MATCH (source)
    OPTIONAL MATCH (source)-[r]->(target)
    WITH gds.graph.project(
      'graph_0',
      source,
      target,
      {
        sourceNodeLabels: labels(source),
        targetNodeLabels: labels(target),
        relationshipType: type(r)
      }
    ) AS g
    RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels

    // Returns:
    // graph      nodes  rels
    // "graph_0"  4      10

  3. Run a random walk sampling algorithm on the example graph:

    MATCH (start:Buyer {name: 'Alice'})
    CALL gds.graph.sample.rwr('mySample', 'graph_0', {
      samplingRatio: 0.66,
      startNodes: [id(start)]
    })
    YIELD nodeCount, relationshipCount
    RETURN nodeCount, relationshipCount

    // Returns:
    // nodeCount  relationshipCount
    // 3          8


  4. In this example, there are multiple relationships of the same type between two nodes, e.g. `PAYS` between `Alice` and `Instrument Seller`. I want to know which relationships (identified by relationship ID) were sampled. However, the currently available operations only reveal which relationship types were sampled, not the exact relationship IDs.

**Describe the solution you would like**
Existing relationship operations such as `gds.beta.graph.relationships` and `gds.graph.relationshipProperties` expose the relationship types and properties in a projection, but not the original relationship IDs. I know there are scenarios in which relationship IDs cannot be provided (e.g., new relationships created within a projection), but could the IDs of relationships that exist in the original Neo4j graph be exposed?

Mats-SX commented 9 months ago

Hello @guangchen811 and thank you for reaching out.

I just want to confirm that this feature does not exist currently, but we are happy to receive your request. The best that I can offer you is to manage the disambiguation yourself, by projecting your own key via the relationship projection.

To exemplify what I mean using your example, you could do the following projection:

    MATCH (source)
    OPTIONAL MATCH (source)-[r]->(target)
    WITH gds.graph.project(
      'graph_0',
      source,
      target,
      {
        sourceNodeLabels: labels(source),
        targetNodeLabels: labels(target),
        relationshipType: type(r),
        relationshipProperties: {key: id(r)}
      }
    ) AS g
    RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels

The sampled graph produced by `gds.graph.sample.rwr` should then also include these keys, allowing you to map its relationships back to the original ones.
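
A minimal sketch of that disambiguation step, assuming the sample `mySample` was created from a `graph_0` projected with the `key` property as above, and that the installed GDS version provides `gds.graph.relationshipProperty.stream` (older releases name this procedure differently). The returned column names and the `amount` lookup are only for illustration:

```cypher
// Stream the 'key' property (the original relationship ID) of every
// relationship that ended up in the sampled graph.
CALL gds.graph.relationshipProperty.stream('mySample', 'key')
YIELD sourceNodeId, targetNodeId, relationshipType, propertyValue
// GDS keeps relationship properties as floating-point numbers,
// so cast the value back to an integer before comparing it with id(r).
WITH toInteger(propertyValue) AS relId
MATCH ()-[r]->()
WHERE id(r) = relId
RETURN id(r) AS relationshipId, type(r) AS type, r.amount AS amount
ORDER BY relationshipId
```

Because the ID is carried as a floating-point value, this round trip is only exact for IDs small enough to be represented without loss, which is worth checking on very large databases.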

I hope this helps in lieu of your actual request being supported!

All the best
Mats