Open kcrimson opened 6 years ago
Hi,
The paths are collected in a list before being streamed so I wonder if that's causing an issue. The int[] that you're seeing would likely be the collection of nodeIds that get returned.
To narrow down the problem, could you try running the non-streaming version and see if that works?
MATCH (start:Airport{IATA:'KRK'}), (end:Airport{IATA:'DFW'})
CALL algo.kShortestPaths(start, end, 4, 'distance' ,{})
YIELD index, nodeIds, path, costs
RETURN count(*)
Also, are you using a public dataset, or can you share it with me so I can debug further?
I ran the non-streaming version and it ended with the same OutOfMemory error.
I have attached the datasets: airports.txt and routes.txt.
Here are the queries I used to import these datasets:
LOAD CSV FROM "file:///airports.dat" as airport
CREATE (:Airport {
AirportID : airport[0],
Name : airport[1],
City : airport[2],
Country : airport[3],
IATA: airport[4],
ICAO: airport[5],
Latitude: toFloat(airport[6]),
Longitude: toFloat(airport[7]),
Altitude: toInteger(airport[8]),
Timezone: airport[9],
DST: airport[10],
TZ: airport[11],
Type: airport[12],
Source: airport[13]
});
CREATE CONSTRAINT ON (a:Airport) ASSERT a.IATA IS UNIQUE;
LOAD CSV FROM "file:///routes.dat" as route
MATCH (s:Airport {AirportID: route[3]})
MATCH (d:Airport {AirportID: route[5]})
CREATE (s)-[:Route {Airline: route[1], Codeshare: route[6]}]->(d);
MATCH (a:Airport)-[r:Route]->(b:Airport)
SET r.distance=distance(point({longitude: a.Longitude, latitude : a.Latitude}),point({longitude: b.Longitude, latitude : b.Latitude}))
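For readers wondering what the `distance` property on each Route holds: as far as I know, Neo4j's distance() on two WGS-84 points returns the great-circle distance in meters, based on the haversine formula. Below is a minimal Java sketch of that calculation. The class name, method name, and the Earth radius of 6,371 km are assumptions for illustration; the constant Neo4j uses internally may differ, so the values need not match to the meter.

```java
/** Hypothetical helper, not part of Neo4j: haversine distance between two lat/lon points. */
final class HaversineSketch {

    // Assumption: mean Earth radius in meters; Neo4j's internal constant may differ.
    static final double EARTH_RADIUS_METERS = 6_371_000.0;

    /** Great-circle distance in meters; all arguments are in degrees. */
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);

        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);

        // Haversine term: squared half-chord length between the two points.
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;

        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
    }
}
```

Calling `HaversineSketch.distanceMeters(a.Latitude, a.Longitude, b.Latitude, b.Longitude)` for two airports gives the weight that kShortestPaths later sums in `costs`.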
Quick update: the problem is related to the number of routes between airports. When I limit the traversal depth, the query runs, but the result is empty, and it shouldn't be. Let me give you an example:
match p=shortestPath((:Airport {IATA : "KRK"})-[:Route*..4]->(:Airport {IATA : "CFU"})) return p
It returns a path through Frankfurt airport. When I run the equivalent query with kShortestPaths:
MATCH (start:Airport{IATA:'KRK'}), (end:Airport{IATA:'CFU'})
CALL algo.kShortestPaths.stream(start, end, 3, 'distance' ,{maxDepth: 4})
YIELD index, nodeIds, path, costs
RETURN [node in algo.getNodesById(nodeIds) | node.City] AS places,
costs,
reduce(acc = 0.0, cost in costs | acc + cost) AS totalCost
it returns an empty result set. Am I missing something about kShortestPaths?
Hi,
I'm able to reproduce it, although I had to tweak the loading instructions:
CREATE CONSTRAINT ON (a:Airport) ASSERT a.IATA IS UNIQUE;
LOAD CSV FROM "file:///airports.txt" as airport
MERGE (a:Airport {IATA: airport[4]})
set
a.AirportID = airport[0],
a.Name = airport[1],
a.City = airport[2],
a.Country = airport[3],
a.ICAO = airport[5],
a.Latitude = toFloat(airport[6]),
a.Longitude = toFloat(airport[7]),
a.Altitude = toInteger(airport[8]),
a.Timezone = airport[9],
a.DST= airport[10],
a.TZ = airport[11],
a.Type = airport[12],
a.Source = airport[13]
;
create index on :Airport(AirportID);
LOAD CSV FROM "file:///routes.txt" as route
MATCH (s:Airport {AirportID: route[3]})
MATCH (d:Airport {AirportID: route[5]})
CREATE (s)-[:Route {Airline: route[1], Codeshare: route[6]}]->(d);
MATCH (a:Airport)-[r:Route]->(b:Airport)
SET r.distance=distance(point({longitude: a.Longitude, latitude : a.Latitude}),point({longitude: b.Longitude, latitude : b.Latitude}))
I'll look at how to fix it now.
Hi @kcrimson,
Sorry for the delay in replying again. We have a fix for the OOM in this PR: https://github.com/neo4j-contrib/neo4j-graph-algorithms/pull/712. However, when we set maxDepth it returns no results, so we're still investigating that.
Will let you know when we've got both fixes merged in.
Cheers, Mark
Hi, I ran into this on release 3.5: I had to set maxDepth=10 to get paths of only 5 nodes. I think the issue is the order of the checks in org.neo4j.graphalgo.impl.yens.Dijkstra:
if (d >= maxDepth) {
    continue;
}
if (node == target) {
    return true;
}
After changing the order so that the target check comes before the depth check, it returns the right results.
Please confirm.
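The effect of the check order quoted above can be reproduced in isolation. The sketch below is not the library's code: it uses a plain depth-limited breadth-first search instead of Dijkstra, and the class name, method name, and the `depthCheckFirst` flag are hypothetical. It only illustrates that testing `d >= maxDepth` before `node == target` discards a target that sits exactly at depth `maxDepth`, while the swapped order still finds it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch, not the library's Dijkstra: shows how check order affects the result. */
final class DepthCheckOrderSketch {

    /**
     * Depth-limited breadth-first search over an adjacency list.
     * depthCheckFirst = true mirrors the order quoted in the comment above;
     * depthCheckFirst = false tests for the target before applying the depth limit.
     */
    static boolean reaches(List<List<Integer>> graph, int source, int target,
                           int maxDepth, boolean depthCheckFirst) {
        Deque<int[]> queue = new ArrayDeque<>(); // entries are {node, depth in hops}
        boolean[] visited = new boolean[graph.size()];
        queue.add(new int[] {source, 0});
        visited[source] = true;

        while (!queue.isEmpty()) {
            int[] entry = queue.poll();
            int node = entry[0];
            int d = entry[1];

            if (depthCheckFirst) {
                if (d >= maxDepth) {
                    continue; // a target at depth == maxDepth is dropped here
                }
                if (node == target) {
                    return true;
                }
            } else {
                if (node == target) {
                    return true; // target is accepted even at depth == maxDepth
                }
                if (d >= maxDepth) {
                    continue; // the limit only stops further expansion
                }
            }

            for (int next : graph.get(node)) {
                if (!visited[next]) {
                    visited[next] = true;
                    queue.add(new int[] {next, d + 1});
                }
            }
        }
        return false;
    }
}
```

On a three-node chain 0 -> 1 -> 2 with maxDepth = 2, `reaches(graph, 0, 2, 2, true)` finds nothing, while `reaches(graph, 0, 2, 2, false)` finds the target. That matches the symptom of having to raise maxDepth beyond the real path length, though whether it accounts for the full gap reported above would need checking against the actual implementation.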
Hi, I was recently playing with kShortestPaths.stream (built from master) on a rather small graph of 7184 nodes and 66067 relationships (airports and routes imported from here).
I am running the following query (copy-pasted from the docs, with minor modifications):
It always ends with an OutOfMemory error, even with a 12G max heap size. I managed to dump the heap before a full GC and found a local variable of type int[] with a huge size. Call stack of this thread:
Any clues as to what went wrong? I am running Neo4j 3.4.5, graph algo built from master at c92bc290ecf9944e2cdf93e65bc1b161cf504876, and Java 1.8.0_171.