opencypher / openCypher

Specification of the Cypher property graph query language
http://www.opencypher.org
Apache License 2.0
853 stars 150 forks source link

Support "acyclic" constraint on relationships #172

Open IanRogers opened 7 years ago

IanRogers commented 7 years ago

CIR-2017-172

Related to #173

I want to be able to model a DAG (Directed Acyclic Graph) in cypher. So I need to be able to ensure that when a relationship is created then it doesn't create a loop.

I guess it could be done by remembering to code all the creates in a transaction, something like

# start transaction
match (a:T {key:'val1'}), (b:T {key:'val2'}) create (a)-[:R]->(b)
match p = (a:T {key:'val1'})-[:R*]-(a) return p limit 1
# abort transaction if there's now a loop

But that feels really clumsy and prone to developer errors. NB. see also https://github.com/neo4j/neo4j/issues/8667

It would be much nicer to be able to say something like

create constraint on ()-[r:R]-() assert acyclic(r)

and have match (a:T {key:'val1'}), (b:T {key:'val2'}) create (a)-[:R]->(b) throw an error automatically if it would make a loop.

thobe commented 7 years ago

That is an interesting idea. I would say that being acyclic is a property of a path rather than of a relationship, so I think I would formulate the constraint as (see #166):

CREATE CONSTRAINT acyclic_R FOR p=()-[:R*]-() REQUIRE acyclic(p)
IanRogers-LShift commented 7 years ago

Sure, acyclic is relative to the path. But is it beneficial to get hung up on making the constraint syntax "path like"? I'd say that only the relationship label is the interesting/relevant bit, so it could be simplified to

CREATE CONSTRAINT [:R] REQUIRE ACYCLIC

square brackets denoting a relationship. It might be easily confused with a node constraint but

CREATE CONSTRAINT (:T) REQUIRE ACYCLIC

doesn't make any sense so could be written to return a helpful error message "...did you mean [:T]?" or some such.

(sorry, I just realised I have two identities depending on if I'm at work or at home :-) this request is relevant to both for me though...)

Mats-SX commented 7 years ago

It is beneficial to reuse the standard Cypher pattern syntax in the constraint definition body, I think. The shortened version [:R] would require specialised knowledge to understand that this actually represents a path of any number of relationships of exactly that type. Tobias' suggestion of p = ()-[:R*]-() tells a person who knows Cypher exactly that directly. That way the constraint expression becomes self-explanatory, which is a nice feature.

Another nice feature of putting it that way is that the constraint may be modelled with cycle length predicates as well:

CREATE CONSTRAINT no_5_length_cycles 
FOR p = ()-[:R*5]-() 
REQUIRE acyclic(p)

The above constraint would allow some long cycles.

This is a bit longer to type, but typically constraint definitions are not typed down very often, so I wouldn't see this as that large a drawback.

IanRogers-LShift commented 7 years ago

That would be fine. I just want to be able to enforce a DAG with this and #173

CREATE CONSTRAINT enforce_dag_acyclic_for_R_links 
FOR p = ()-[:R*]-() 
REQUIRE acyclic(p)
boggle commented 7 years ago

Another very similar idea would be request a relationship type to be only used in a linear way, i.e. for organizing nodes in a list. Is that a common use scenario?

petraselmer commented 7 years ago

Updated the original comment with the name of the CIR