ecohealthalliance / yenpathy

Yen's K Shortest Paths in R, Quickly
Other
7 stars 3 forks source link

Graphs directed by default? #10

Open scottishaccent opened 4 years ago

scottishaccent commented 4 years ago

For me, at least, k_shortest_paths appears to assume that graphs are directed by default. Thus, for a very simple graph:

start end 1 2 2 3

The following works as expected: k_shortest_paths(graph, from = 1, to = 3, k = 1) [1] 1 2 3

However, the reverse returns a null set: k_shortest_paths(graph, from = 3, to = 1, k = 1) list()

I did not see a note regarding directed vs. undirected graphs in the documentation. Is this the expected behavior of k_shortest_paths?

scottishaccent commented 4 years ago

Addendum:

I have attempted a workaround in order to use yenpathy with undirected graphs by creating a new graph with reverse links added (i.e. if a link from 1 to 2 was present in the original graph, add a link from 2 to 1 with the same weight). This unfortunately does not work consistently, for reasons that are unclear.

For example, consider the following graph: df <- data.frame( start = c(1, 4, 5, 1, 8, 1), end = c(4, 5, 6, 8, 6, 6), weight = c(1, 1, 1.5, 1.5, 2.5, 5) ) k_shortest_paths(df, from = 1, to = 6, k=3)

The output, as expected: [[1]] [1] 1 4 5 6

[[2]] [1] 1 8 6

[[3]] [1] 1 6

Now, consider a new graph that contains the links from the first graph, with reverse links added to the second half of the data frame: rows <- nrow(df) newdf <- data.frame(start = numeric(rows*2), end = numeric(rows*2), weight = numeric(rows*2)) for(i in 1:rows) { newdf$start[i] <- df$start[i] newdf$end[i] <- df$end[i] newdf$weight[i] <- df$weight[i] newdf$start[i+rows] <- df$end[i] newdf$end[i+rows] <- df$start[i] newdf$weight[i+rows] <- df$weight[i] } k_shortest_paths(newdf, from = 1, to = 6, k=3)

Output (incorrect): [[1]] [1] 1 8 6

[[2]] [1] 1 6

[[3]] [1] 1 4 5 6

I wonder if this could be related to the multigraph issue raised by @szhorvat in issue #9 .

Oddly, when changing the names of the nodes in the above graph to a sequential series without changing the graph structure or weights, the output is correct. For example, this graph is identical to the graph above, except the node names now run sequentially from 1 to 5:

df <- data.frame( start = c(1, 2, 3, 1, 5, 1), end = c(2, 3, 4, 5, 4, 4), weight = c(1, 1, 1.5, 1.5, 2.5, 5) ) k_shortest_paths(newdf, from = 1, to = 4, k=3)

Output (correct): [[1]] [1] 1 2 3 4

[[2]] [1] 1 5 4

[[3]] [1] 1 4

@toph-allen , do you have any thoughts on what could be causing this behavior?

toph-allen commented 4 years ago

Hi @scottishaccent; sorry for my slow reply.

It's possible that this is a bug or assumption in the underlying C++ implementation, which I didn't write, but can be found at https://github.com/yan-qi/k-shortest-paths-cpp-version.

The original problem I wrote the R wrapper to tackle involved a graph which was directed by had many bidirectional edges, with different weights, so I would have assumed that adding a second edge would have worked. Indeed, I think used this technique to treat a directed graph as bidirectional in the past and didn't see any anomalies, but the graph I was using had sequentially-named nodes (or rather, used a character vector for node names, which the function replaces with sequential integers).

I stepped through the wrapper function and there were no issues with, for instance, the example graph being transformed incorrectly before being passed to the underlying C++ code.

I don't know enough C++ to adequately debug that part of the code, unfortunately. If relabeling nodes to be sequential and then reverting them at the end (like the k_shortest_paths function currently treats nodes with character names) works, that could be a reasonable patch.

It also seems like some of the underlying C++ code could be rewritten to solve this but that's beyond my capabilities.

scottishaccent commented 4 years ago

@toph-allen, thanks for the reply. I've developed a workaround using another implementation for the time being, but I will continue to follow this thread in case other updates are made.