Open aalampally opened 6 years ago
Can you check how many records are returned by the node and relationship projection queries? If you replace the last lines with:
Return count (*)
That should do it
On Mon, Jun 11, 2018, 17:49 Anirudh Alampally notifications@github.com wrote:
I am getting a weird error when using shortestPath algorithms in graph-algorithms. In the graph data I have Person node with primaryEmail as once of the attributes. This contains emails of multiple domains. e.g., a@xyz.com, x@abc.com etc..
When I run the following query MATCH (source:Person{primaryEmail:'a@xyz.com'}), ( target:Person{primaryEmail:'b@xyz.com'}) CALL algo.shortestPath.stream(source, target, "cost",{ nodeQuery:'MATCH(n:Person) RETURN id(n) as id', relationshipQuery:'MATCH(n:Person)-[r:connected_to]-(m:Person) RETURN id(n) as source, id(m) as target, (1.0/(r.weight+1.0)) as weight', graph:'cypher'}) YIELD nodeId, cost MATCH (other:Person) WHERE id(other) = nodeId RETURN other.name, cost
It is giving me the following error: Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure 'algo.shortestPath.stream': Caused by: java.lang.ArrayIndexOutOfBoundsException: -1073739909
Just as a tweak, I tried to limit the nodes to a particular domain of email and it started running.The updated query is: MATCH (source:Person{primaryEmail:'a@xyz.com'}), ( target:Person{primaryEmail:'b@xyz.com'}) CALL algo.shortestPath.stream(source, target, "cost",{ nodeQuery:'MATCH(n:Person) where n.primaryEmail ends with @xyz.com RETURN id(n) as id', relationshipQuery:'MATCH(n:Person)-[r:connected_to]-(m:Person) RETURN id(n) as source, id(m) as target, (1.0/(r.weight+1.0)) as weight', graph:'cypher'}) YIELD nodeId, cost MATCH (other:Person) WHERE id(other) = nodeId RETURN other.name, cost
Is the error because the virtual graph which is created by filtering the relationships, creating a graph which is very big and exceeding the memory? Any help is much appreciated!!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/neo4j-contrib/neo4j-graph-algorithms/issues/659, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAzpMqtc1g9I203SV0rDQP8rTzXyB9kks5t7p-lgaJpZM4UjCiG .
Hey @mneedham The no.of nodes after node filtering are: 77974 and the no.of relationships after relationship filtering are: 726748.
I am assuming that creating a virtual graph is the issue here. So, to simplify the query, I created a relationship property "sssp_distance", which is essentially (1.0/(r.weight+1.0))
. So, I could now just mention the weightProperty and relationship in which I have to look for.
My updated query would look like this:
MATCH (start:Person{primaryEmail:'a@xyz.com'}), (end:Person{primaryEmail:'b@xyz.com'}) CALL algo.shortestPath.stream(start, end, "sssp_distance", {relationshipQuery: "connected_to"}) YIELD nodeId, cost MATCH (other) WHERE id(other) = nodeId RETURN other, cost
Even after this simplification, I get the following error Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure 'algo.shortestPath.stream': Caused by: java.lang.ArrayIndexOutOfBoundsException: -2147482401
So, I tried APOC's dijkstra algorithm and it was able to give the shortest path using the following query:
MATCH (start:Person{primaryEmail: 'a@xyz.com'}), (end:Person {primaryEmail: 'b@xyz.com'}) CALL apoc.algo.dijkstra(start, end, "connected_to", "sssp_distance", 1.0, 1) YIELD path, weight RETURN path, weight
The main reason, I chose graph-algorithms over APOC was that, I could create a virtual graph, this way I wouldn't have to create a new relationship property sssp_distance
.
Please Let me know if I still have any workaround for this
@aalampally seeing as you got the error when using the non Cypher projection approach it sounds like the error is happening somewhere else.
Are you able to narrow the error down on a smaller dataset that you'd be able to share with us so we can reproduce it and figure out what's going on?
I have tried the same on a smaller dummy dataset
CREATE (a:Loc{name:'A'}), (b:Loc{name:'B'}), (c:Loc{name:'C'}), (d:Loc{name:'D'}), (e:Loc{name:'E'}), (f:Loc{name:'F'}), (a)-[:ROAD {cost:50}]->(b), (a)-[:ROAD {cost:50}]->(c), (a)-[:ROAD {cost:100}]->(d), (a)-[:RAIL {cost:50}]->(d), (b)-[:ROAD {cost:40}]->(d), (c)-[:ROAD {cost:40}]->(d), (c)-[:ROAD {cost:80}]->(e), (d)-[:ROAD {cost:30}]->(e), (d)-[:ROAD {cost:80}]->(f), (e)-[:ROAD {cost:40}]->(f), (e)-[:RAIL {cost:20}]->(f);
It works fine both with Cypher projection and non-Cypher projection. I am assuming that if similar data is multiplied to ~0.7 million edges and 70k nodes (data size similar to what I have), the error could be reproduced.
I am getting a weird error when using shortestPath algorithms in
graph-algorithms
. In the graph data I have Person node with primaryEmail as once of the attributes. This contains emails of multiple domains. e.g., a@xyz.com, x@abc.com etc..When I run the following query
MATCH (source:Person{primaryEmail:'a@xyz.com'}), (target:Person{primaryEmail:'b@xyz.com'}) CALL algo.shortestPath.stream(source, target, "cost",{ nodeQuery:'MATCH(n:Person) RETURN id(n) as id', relationshipQuery:'MATCH(n:Person)-[r:connected_to]-(m:Person) RETURN id(n) as source, id(m) as target, (1.0/(r.weight+1.0)) as weight', graph:'cypher'}) YIELD nodeId, cost MATCH (other:Person) WHERE id(other) = nodeId RETURN other.name, cost
It is giving me the following error:
Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure 'algo.shortestPath.stream': Caused by: java.lang.ArrayIndexOutOfBoundsException: -1073739909
Just as a tweak, I tried to limit the nodes to a particular domain of email and it started running.The updated query is:
MATCH (source:Person{primaryEmail:'a@xyz.com'}), (target:Person{primaryEmail:'b@xyz.com'}) CALL algo.shortestPath.stream(source, target, "cost",{ nodeQuery:'MATCH(n:Person) where n.primaryEmail ends with @xyz.com RETURN id(n) as id', relationshipQuery:'MATCH(n:Person)-[r:connected_to]-(m:Person) RETURN id(n) as source, id(m) as target, (1.0/(r.weight+1.0)) as weight', graph:'cypher'}) YIELD nodeId, cost MATCH (other:Person) WHERE id(other) = nodeId RETURN other.name, cost
Is the error because the virtual graph which is created by filtering the relationships, creating a graph which is very big and exceeding the memory? Any help is much appreciated!!