Implementing the Yen's algorithm and its improved versions

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

This ticket aims at implementing the Yen's algorithm and its improved version for k shortest simple path enumeration between the source and the target. The implementation will be compared to the original yen's algorithm.

CC: @dcoudert

Component: graph theory

Keywords: gsoc19

Author: Rajat Mittal

Branch/Commit: fcfa3e6

Reviewer: David Coudert

Issue created by migration from https://trac.sagemath.org/ticket/27859

dcoudert commented 5 years ago

Changed keywords from none to gsoc19

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

Author: Rajat Mittal

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

Branch: u/gh-rajat1433/27859_yen_algorithm_improved

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`2afbaef`	`yen's algorithm implementation`
`ede0d00`	`improved`
`ac4ff86`	`revert`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Commit: ac4ff86

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:4

This is the implementation of original yen's algorithm for k shortest paths which the method will return as an iterator. This will be compared with the improved yen's(Feng) algorithm which will be implemented shortly.

shortest_path methods are also enhanced by adding exclude_vertices and exclude_edges parameter which are required for Yen's or improved Yen's algorithm. If some improvements are possible please let me know so I can improve the above method.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from ac4ff86 to b31f288

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b31f288 added one line

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9da4482 removed all_paths

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from b31f288 to 9da4482

dcoudert commented 5 years ago

comment:8

I'm a bit reluctant to the changes you did in shortest_path and bidirectional_dijkstra as it will slowdown these methods when exclude_edges and exclude_vertices are false. A significant effort has been put in optimizing these methods, so slowing them is not clever.

You must try to have the minimum possible impact on them. For instance do:

if not exclude_edges and not exclude_vertices:
    neighbors = nbr
else:
    neighbors = []
    for w in nbr:
        ...

you could also create exclude_vertices_int and exclude_edges_int to avoid code like (self.vertex_label(u), self.vertex_label(w)) not in exclude_edges.

Is method yen_k_shortest_paths only for undirected graphs ?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 9da4482 to 3ca1945

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`b7a49b4`	`improved the shortest_paths`
`3ca1945`	`removed xtra line`

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:10

I have made the changes in shortest_paths methods to have a minimal effect of exclude_edges and exclude_vertices.

yen_k_shortest_paths is for both the directed as well as undirected graphs as evident from the examples in the method.

            sage: g = DiGraph([(1, 2, 20), (1, 3, 10), (1, 4, 30), (2, 5, 20), (3, 5, 10), (4, 5, 30)])
            sage: list(g.yen_k_shortest_paths(1, 5, by_weight=True))
            [[1, 3, 5], [1, 2, 5], [1, 4, 5]]
            sage: list(g.yen_k_shortest_paths(1, 5))
            [[1, 2, 5], [1, 3, 5], [1, 4, 5]]
            sage: list(g.yen_k_shortest_paths(1, 1))
            [[1]]

            sage: g = Graph([(1, 2, 20), (1, 3, 10), (1, 4, 30), (2, 5, 20), (3, 5, 10), (4, 5, 30), (1, 6, 100), (5, 6, 5)])
            sage: list(g.yen_k_shortest_paths(1, 6, by_weight = True))
            [[1, 3, 5, 6], [1, 2, 5, 6], [1, 4, 5, 6], [1, 6]]
            sage: list(g.yen_k_shortest_paths(1, 6))
            [[1, 6], [1, 2, 5, 6], [1, 3, 5, 6], [1, 4, 5, 6]]

dcoudert commented 5 years ago

comment:11

Replying to @rajat1433:

I have made the changes in shortest_paths methods to have a minimal effect of exclude_edges and exclude_vertices.

you were too fast. You still have if not exclude_edges and not exclude_vertices: inside the for w in nbr:. Also, check if the order of the other tests could be improved.

yen_k_shortest_paths is for both the directed as well as undirected graphs as evident from the examples in the method.

The first lines of description are not clear. That's why I asked.

Plus it would be better to have a single line of description.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

70e86c1 improved

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 3ca1945 to 70e86c1

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:13

Oops! I have made the description more clear and also concised the checks and removed "if not exclude_edges and not exclude_vertices:" these from wherever unnecessary.

If there is further any improvement possible please let me know.

dcoudert commented 5 years ago

comment:14

What about:

-                        if exclude_edges and ((out == 1 and (u, w) not in exclude_edges_int) or (out == -1 and (w, u) not in exclude_edges_int)):
-                            if not exclude_vertices or (exclude_vertices and w not in exclude_vertices_int):
-                                neighbors.append(w)
-                        elif not exclude_edges and exclude_vertices and w not in exclude_vertices_int:
-                            neighbors.append(w)
+                        if exclude_vertices and w not in exclude_vertices_int:
+                            neighbors.append(w)
+                        elif (exclude_edges and 
+                              ((out == 1 and (u, w) not in exclude_edges_int) or 
+                               (out == -1 and (w, u) not in exclude_edges_int))):
+                            neighbors.append(w)

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:15

I think there may be a problem in that, consider that w is not in exclude_vertices(there are other vertices though) but (u,w) is present in exclude_edges then w will be appended as a neighbor, but it should be not appended.

dcoudert commented 5 years ago

comment:16

+                        if exclude_vertices and w in exclude_vertices_int:
+                            continue
+                        if (exclude_edges and 
+                              ((out == 1 and (u, w) in exclude_edges_int) or 
+                               (out == -1 and (w, u) in exclude_edges_int))):
+                            continue
+                        neighbors.append(w)

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 70e86c1 to 481ea41

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

481ea41 improvedcode

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 481ea41 to 5b14b84

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

5b14b84 improved method name and description

dcoudert commented 5 years ago

comment:19

In fact, exclude_vertices and exclude_edges could be any kind of iterable containers since we turn them directly to something else.
You can do:

-        if exclude_vertices:
-            exclude_vertices_int = set()
-            for u in exclude_vertices:
-                exclude_vertices_int.add(self.get_vertex(u))
+        cdef set exclude_vertices_int = {self.get_vertex(u) for u in exclude_vertices}
+        cdef set exclude_edges_int = {(self.get_vertex(u), self.get_vertex(v)) for u, v in exclude_edges}

You may also define cdef bint exclude = exclude_vertice or exclude_edges and then do a simpler test if not exclude...
Note that the set of vertices and edges that you exclude is sometimes called forbidden.
pep8

-            raise LookupError("No path between %s and %s" % (x, y))
+            raise LookupError("no path between %s and %s" % (x, y))

in fact the error message could be no path from/to an excluded vertex or something like that.

in Yen's algorithm, could you document the code
I think that case if not prev_path: should be done before the while loop and that the while loop should be while heap_paths:.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 5b14b84 to 5164f34

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

5164f34 changes in c_graph.pyx

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:21

I have made the container description as an iterable and made the improvements in c_graph file in the above commit.

I think the positioning of if not prev_path: is correct still I will check it, I will be documenting the Yen's code soon in my next commit.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 5164f34 to d22322c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d22322c documenting the yen's algorithm

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from d22322c to 55c187c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

55c187c documenting the yen's algorithm

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

beb67d6 documenting the yen's algorithm

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 55c187c to beb67d6

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:25

Replying to @dcoudert:

in Yen's algorithm, could you document the code

Done

If any improvement is possible let me know.

I think that case if not prev_path: should be done before the while loop and that the while loop should be while heap_paths:.

The thing is that it will enter this if statement only once and that too at the start. So it can be placed outside the while loop but placing it inside seems to give a kind of structure and avoid redundancy as the part after if else condition(line 16097) will be redundant if it is placed outside the while loop so I place it inside the while(True) loop.

dcoudert commented 5 years ago

comment:27

You must reorder the blocks like that:

# will enter here for the first time to compute the shortest path between source and target
if by_weight:
    path = shortest_path_func(source, target, weight_function=weight_function)
...
heap_paths.add(hash_path) # adding the path to the heap_paths set

while heap_paths:
    (cost, path1) = heappop(heap_sorted_paths) # extracting the next best path from the heap
    hash_path = tuple(path1)
    ...
    prev_path = path1

    exclude_vertices = []
    exclude_edges = []
    for i in range(1, len(prev_path)): # deviating from the previous path to find the candidate paths
        root = prev_path[:i] # root part of the previous path
        ....

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from beb67d6 to 353b3c3

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

353b3c3 reordered

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:29

For feng's algorithm should I implement it here in this ticket or create a new ticket ? Actually Feng's algorithm is only for directed graphs so this algorithm in this tkt is important for undirected graphs and will be used to compare timings with Feng's algorithm. So I guess its best to implement a new method.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0528edc added tests

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 353b3c3 to 0528edc

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

20515f9 revert back to set

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 0528edc to 20515f9

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:32

Replying to @sagetrac-git:

Branch pushed to git repo; I updated commit sha1. New commits:

20515f9 revert back to set

I hv reverted back to set here as there can be multiple instances of same edges and vertices in exclude_edges and exclude_vertices.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 20515f9 to f7abe84

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f7abe84 added more tests

dcoudert commented 5 years ago

comment:34

So I guess its best to implement a new method.

Yes, go for it.

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:35

I am planning to include some new parameters like

1.) include_vertices in shortest path functions

2.) reduced cost dictionary to shortest path functions

first one is for finding the all yellow node path in the graph as per the feng's algorithm since these number will be less as compared to exclude_vertices so this seemed a suitable parameter.

second one can be avoided but that will require to make a duplicate copy of the graph with the edge weights as reduced weights but not sure if it is good to make a copy of the graph is ok or not but using this parameter can help us in retrieving the edge weights.

Any suggestions on if anything above can be done in a better way will be helpful.

dcoudert commented 5 years ago

comment:36

I'm reluctant to add new parameters to the shortest path methods. The methods will become more complicated and difficult to maintain, and can be slow down.

An alternative is to create a new .pyx file dedicated to paths enumerations, and to put inside specialized shortest paths methods that are needed.

sagemath / sage

Implementing the Yen's algorithm and its improved versions #27859