neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/
Other
641 stars 161 forks source link

Limit max path length for Yens path finding algorithm #334

Open lvijnck opened 1 month ago

lvijnck commented 1 month ago

Is your feature request related to a problem? Please describe.

Yes, we're aiming to compute x shortest paths in a huge graph. Yens' algorithm seems to run quite long for our use-case and produces paths that are "too long" in certain cases too.

Describe the solution you would like

It would be nice to be able to cap the max length of the algorithm, and not return any paths if the shortest path exceeds more than x hops.

Describe alternatives you have considered

Neo4Js ExpandConfig statement, but this crashes our Neo4J instance.

IoannisPanagiotas commented 1 month ago

Hi @lvijnck,

We will have a look at your request, but I am not sure if we will be able on it directly unfortunately due to other tasks having higher priority currently.

Might I ask the value of concurrency that you are running the algorithm with and the lengths of the found paths?

If the paths are sufficiently large, allocating more cores could potentially help as the work in spur nodes is divided among cores.

Best regards, Ioannis.

lvijnck commented 1 month ago

Hi @IoannisPanagiotas,

Let's say the max. path length I want to consider is 3, then finding paths of length 5 takes waaay to long due to graph size (5+M nodes and 20+M edges).

We're currently running concurrency 4 due to Neo4J licensing

IoannisPanagiotas commented 1 month ago

Hi again @lvijnck ,

Thank you for your input! As I said we will put it on the backlog, but I cannot guarantee yet when or whether it will be picked up.

Since 3 is relatively small, have you tried tackling it with cypher? I would look into this document for suggestions, namely the quantified patterns section, or shortest.

Best, Ioannis.

IoannisPanagiotas commented 1 month ago

Also, perhaps you could try some some heuristics, if you can think about pre-processing your graph and replacing the weight of all edges, as follows: for edge e with weight w, replace w with w + LARGE_VALUE where LARGE_VALUE is a sufficient large value such that any path with length 5 ends up having a worst cost than any path with length 3.

Of course this has the effect that it will probably find all length-1 paths first, then all length-2, then all length-3, so you might need a larger k. But for each fixed length, the discovered paths should be ordered from best to worst.