Open pushkarnagpal opened 5 years ago
Hello @pushkarnagpal,
Are you sure you've provided a valid Cypher query?
It does not compile in Cypher for Gremlin and Neo4j gives error about multiple WHERE
cases.
Hello @pushkarnagpal,
Are you sure you've provided a valid Cypher query?
It does not compile in Cypher for Gremlin and Neo4j gives error about multiple
WHERE
cases.
Hi @dwitry
I've updated the query now. (had accidentally written not
instead of none
in the query.
match p = (base:party{node_id:'245793137123'})<-[trail:ownership*..]-(leaves:party) where none( rel in relationships(p) where startNode(rel) = endNode(rel) ) return count(p)
Please look into the issue now. Thanks!
Variable length path in loops is a complex case, which is tricky to implement in Gremlin to fully cover all the edge cases. I've invested a lot of time trying to properly implement it (1, 2) but the solution still not universal. Unfortunately can not provide any estimates when this will be improved.
Variable length path in loops is a complex case, which is tricky to implement in Gremlin to fully cover all the edge cases. I've invested a lot of time trying to properly implement it (1, 2) but the solution still not universal. Unfortunately can not provide any estimates when this will be improved.
Ahh okay. I was working on a fraud detection case.and got stuck here. I saw this query conversion from the cypher in the gremlin-server.log
file, and I see a .neq(' cypher.null')
in the query and i assume this step is occuring at the none
where clause.
I think this is the root cause of the bug.
g.V().as('base')
.hasLabel('party').has('node_id', eq('245793137123'))
.repeat(__
.inE('ownership')
.as('trail')
.aggregate(' cypher.path.edge.p')
.outV()
)
.emit()
.until(__
.path()
.from('base')
.count(local)
.is(gte(21))
)
.as('leaves')
.hasLabel('party')
.path()
.from('base')
.as('p')
.where(__
.not(__.V().hasLabel('party').outE('ownership').inV().as(' GENERATED4').where(__.select(' GENERATED4').where(eq('leaves'))))
)
.optional(__
.select(all, 'trail').as('trail')
)
.select('p')
.is(neq(' cypher.null'))
.count()
.project('count(p)')
.by(__.identity())
Hope I could help somehow with this. Thanks!
.neq(' cypher.null')
is guard to handle cypher.null
which represents null in Cypher for Gremlin.
We may call "root cause" the following snippet:
.repeat(__
.inE('ownership')
.as('trail')
.aggregate(' cypher.path.edge.p')
.outV()
)
.emit()
.until(__
.path()
.from('base')
.count(local)
.is(gte(21))
)
This is how variable length path <-[trail:ownership*..]-
is translated. Basically, it is an imperative way to traverse relationships (21
is limit to avoid infinite loops). If there is a loop, the path will be traversed repeatedly (what happens in the query you've provided).
So we need to find a better implementation of this in Gremlin, but as I've said in the previous message this is a rather complicated task, as there are other edge cases that need to be considered, as well as performance.
Trying the following query to get a path using a match query to filter out paths that contain self edges (node with edges to itself to avoid increased length of path:
But this query is unable to filter out the data and I get the entire set.