knowledge-graphs-tutorial / examples

Examples from knowledge graphs tutorial paper
Creative Commons Zero v1.0 Universal

figure10b.ttl: Should we remove extra string literals from example? #5

Closed lschmelzeisen closed 3 years ago

lschmelzeisen commented 3 years ago

Figure 10b of the paper (2.4.3) depicts an example of representing temporal context using property graphs:

[image: property-graph]

The example currently adds additional string literals to both nodes that are not part of the figure:

https://github.com/knowledge-graphs-tutorial/examples/blob/37cc7fda85450d9caec87a946acac98d46e74b5b/Section2/2_4_3_Higher_arity_representation/figure10b.cypher#L4

Should we remove those for clarity?

aidhog commented 3 years ago

I added these because I think that Santiago here just acts as a local variable in the CREATE command, rather than an ID persisted in the store. Arguably, what should be removed are the variables, since we never reference the same node twice:

CREATE ({ name: 'Santiago' })-[:flight { validFrom: 1956 }]->({ name: 'Arica' })

I am not 100% sure about this though; it would be good to double check.

lschmelzeisen commented 3 years ago

It has been years since I last did Cypher. But with my limited knowledge, the syntax

CREATE ({ name: 'Santiago' })-[:flight { validFrom: 1956 }]->({ name: 'Arica' }) 

reads to me as:

  1. Create a node, add property name with value 'Santiago' to it
  2. Create a node, add property name with value 'Arica' to it
  3. Create an edge between them, add property validFrom with value 1956 to it.

Instead, intuitively, I would have expected something like this:

 CREATE (Santiago)-[:flight { validFrom: 1956 }]->(Arica) 

But I have no idea whether this is valid/sensible Cypher.
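One way to check this, as a rough sketch: run the variable-only form against an empty scratch database (an assumption; any existing nodes would also show up in the result) and then list what was actually stored. If Santiago and Arica are only variables, both nodes should come back with no labels and no properties.

```cypher
// Assumes an empty scratch database.
// Santiago and Arica are used here as bare variables, with no labels or properties.
CREATE (Santiago)-[:flight { validFrom: 1956 }]->(Arica);

// Separate statement: list every stored node with its labels and properties.
MATCH (n)
RETURN labels(n) AS labels, properties(n) AS props;
```

The edge property can be inspected the same way with `MATCH ()-[f:flight]->() RETURN properties(f)`.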

aidhog commented 3 years ago

I think the issue is that in your latter command, Santiago and Arica act as variables. They are not actually stored as the IDs of the nodes in Neo4j/Cypher. Your latter command would be equivalent to:

CREATE ()-[:flight { validFrom: 1956 }]->()

The reason Cypher has these variables, I guess, is to facilitate expressing cycles like:

CREATE (a)-[:p]->(b)-[:q]->(c)-[:r]->(a)

I don't think there's a way to set node IDs in Neo4j as they are used internally for indexing. See also: https://stackoverflow.com/questions/9051442/node-identifiers-in-neo4j
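To illustrate the practical consequence, here is a minimal sketch (the variable names s, a, src, dst and f are arbitrary, chosen only for this example): since variables are scoped to the statement that binds them, a later query has to find the nodes through stored properties such as name.

```cypher
// s and a are variables, bound only for the duration of this statement.
CREATE (s { name: 'Santiago' })-[:flight { validFrom: 1956 }]->(a { name: 'Arica' });

// New statement: s and a are no longer bound, so the nodes are
// located by their stored name property instead.
MATCH (src { name: 'Santiago' })-[f:flight]->(dst { name: 'Arica' })
RETURN src.name, f.validFrom, dst.name;
```

This is why keeping the name property makes the example usable in practice: without it, there would be nothing stored on the nodes to match on.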

lschmelzeisen commented 3 years ago

Ok, I now see what the semantic difference between RDF and Cypher is.

But technically, we never talk about identifiers in the paper (at least in the CSUR version), so the example would fit :)

Still, I concur that the example should stay practical.