sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.41k stars 475 forks source link

Implementing the Yen's algorithm and its improved versions #27859

Closed 2460b664-5ecd-43fc-9bbe-d0f333762988 closed 5 years ago

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

This ticket aims at implementing the Yen's algorithm and its improved version for k shortest simple path enumeration between the source and the target. The implementation will be compared to the original yen's algorithm.

CC: @dcoudert

Component: graph theory

Keywords: gsoc19

Author: Rajat Mittal

Branch/Commit: fcfa3e6

Reviewer: David Coudert

Issue created by migration from https://trac.sagemath.org/ticket/27859

dcoudert commented 5 years ago

Changed keywords from none to gsoc19

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

Author: Rajat Mittal

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

Branch: u/gh-rajat1433/27859_yen_algorithm_improved

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2afbaefyen's algorithm implementation
ede0d00improved
ac4ff86revert
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Commit: ac4ff86

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:4

This is the implementation of original yen's algorithm for k shortest paths which the method will return as an iterator. This will be compared with the improved yen's(Feng) algorithm which will be implemented shortly.

shortest_path methods are also enhanced by adding exclude_vertices and exclude_edges parameter which are required for Yen's or improved Yen's algorithm. If some improvements are possible please let me know so I can improve the above method.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from ac4ff86 to b31f288

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b31f288added one line
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9da4482removed all_paths
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from b31f288 to 9da4482

dcoudert commented 5 years ago
comment:8

I'm a bit reluctant to the changes you did in shortest_path and bidirectional_dijkstra as it will slowdown these methods when exclude_edges and exclude_vertices are false. A significant effort has been put in optimizing these methods, so slowing them is not clever.

You must try to have the minimum possible impact on them. For instance do:

if not exclude_edges and not exclude_vertices:
    neighbors = nbr
else:
    neighbors = []
    for w in nbr:
        ...

you could also create exclude_vertices_int and exclude_edges_int to avoid code like (self.vertex_label(u), self.vertex_label(w)) not in exclude_edges.

Is method yen_k_shortest_paths only for undirected graphs ?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 9da4482 to 3ca1945

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b7a49b4improved the shortest_paths
3ca1945removed xtra line
2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:10

I have made the changes in shortest_paths methods to have a minimal effect of exclude_edges and exclude_vertices.

yen_k_shortest_paths is for both the directed as well as undirected graphs as evident from the examples in the method.

            sage: g = DiGraph([(1, 2, 20), (1, 3, 10), (1, 4, 30), (2, 5, 20), (3, 5, 10), (4, 5, 30)])
            sage: list(g.yen_k_shortest_paths(1, 5, by_weight=True))
            [[1, 3, 5], [1, 2, 5], [1, 4, 5]]
            sage: list(g.yen_k_shortest_paths(1, 5))
            [[1, 2, 5], [1, 3, 5], [1, 4, 5]]
            sage: list(g.yen_k_shortest_paths(1, 1))
            [[1]]

            sage: g = Graph([(1, 2, 20), (1, 3, 10), (1, 4, 30), (2, 5, 20), (3, 5, 10), (4, 5, 30), (1, 6, 100), (5, 6, 5)])
            sage: list(g.yen_k_shortest_paths(1, 6, by_weight = True))
            [[1, 3, 5, 6], [1, 2, 5, 6], [1, 4, 5, 6], [1, 6]]
            sage: list(g.yen_k_shortest_paths(1, 6))
            [[1, 6], [1, 2, 5, 6], [1, 3, 5, 6], [1, 4, 5, 6]]
dcoudert commented 5 years ago
comment:11

Replying to @rajat1433:

I have made the changes in shortest_paths methods to have a minimal effect of exclude_edges and exclude_vertices.

you were too fast. You still have if not exclude_edges and not exclude_vertices: inside the for w in nbr:. Also, check if the order of the other tests could be improved.

yen_k_shortest_paths is for both the directed as well as undirected graphs as evident from the examples in the method.

The first lines of description are not clear. That's why I asked.

Plus it would be better to have a single line of description.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

70e86c1improved
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 3ca1945 to 70e86c1

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:13

Oops! I have made the description more clear and also concised the checks and removed "if not exclude_edges and not exclude_vertices:" these from wherever unnecessary.

If there is further any improvement possible please let me know.

dcoudert commented 5 years ago
comment:14

What about:

-                        if exclude_edges and ((out == 1 and (u, w) not in exclude_edges_int) or (out == -1 and (w, u) not in exclude_edges_int)):
-                            if not exclude_vertices or (exclude_vertices and w not in exclude_vertices_int):
-                                neighbors.append(w)
-                        elif not exclude_edges and exclude_vertices and w not in exclude_vertices_int:
-                            neighbors.append(w)
+                        if exclude_vertices and w not in exclude_vertices_int:
+                            neighbors.append(w)
+                        elif (exclude_edges and 
+                              ((out == 1 and (u, w) not in exclude_edges_int) or 
+                               (out == -1 and (w, u) not in exclude_edges_int))):
+                            neighbors.append(w)
2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:15

I think there may be a problem in that, consider that w is not in exclude_vertices(there are other vertices though) but (u,w) is present in exclude_edges then w will be appended as a neighbor, but it should be not appended.

dcoudert commented 5 years ago
comment:16
+                        if exclude_vertices and w in exclude_vertices_int:
+                            continue
+                        if (exclude_edges and 
+                              ((out == 1 and (u, w) in exclude_edges_int) or 
+                               (out == -1 and (w, u) in exclude_edges_int))):
+                            continue
+                        neighbors.append(w)
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 70e86c1 to 481ea41

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

481ea41improvedcode
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 481ea41 to 5b14b84

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

5b14b84improved method name and description
dcoudert commented 5 years ago
comment:19
-        if exclude_vertices:
-            exclude_vertices_int = set()
-            for u in exclude_vertices:
-                exclude_vertices_int.add(self.get_vertex(u))
+        cdef set exclude_vertices_int = {self.get_vertex(u) for u in exclude_vertices}
+        cdef set exclude_edges_int = {(self.get_vertex(u), self.get_vertex(v)) for u, v in exclude_edges}
-            raise LookupError("No path between %s and %s" % (x, y))
+            raise LookupError("no path between %s and %s" % (x, y))

in fact the error message could be no path from/to an excluded vertex or something like that.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 5b14b84 to 5164f34

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

5164f34changes in c_graph.pyx
2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:21

I have made the container description as an iterable and made the improvements in c_graph file in the above commit.

I think the positioning of if not prev_path: is correct still I will check it, I will be documenting the Yen's code soon in my next commit.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 5164f34 to d22322c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d22322cdocumenting the yen's algorithm
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from d22322c to 55c187c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

55c187cdocumenting the yen's algorithm
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

beb67d6documenting the yen's algorithm
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 55c187c to beb67d6

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:25

Replying to @dcoudert:

  • in Yen's algorithm, could you document the code

Done

If any improvement is possible let me know.

  • I think that case if not prev_path: should be done before the while loop and that the while loop should be while heap_paths:.

The thing is that it will enter this if statement only once and that too at the start. So it can be placed outside the while loop but placing it inside seems to give a kind of structure and avoid redundancy as the part after if else condition(line 16097) will be redundant if it is placed outside the while loop so I place it inside the while(True) loop.

dcoudert commented 5 years ago
comment:27

You must reorder the blocks like that:

# will enter here for the first time to compute the shortest path between source and target
if by_weight:
    path = shortest_path_func(source, target, weight_function=weight_function)
...
heap_paths.add(hash_path) # adding the path to the heap_paths set

while heap_paths:
    (cost, path1) = heappop(heap_sorted_paths) # extracting the next best path from the heap
    hash_path = tuple(path1)
    ...
    prev_path = path1

    exclude_vertices = []
    exclude_edges = []
    for i in range(1, len(prev_path)): # deviating from the previous path to find the candidate paths
        root = prev_path[:i] # root part of the previous path
        ....
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from beb67d6 to 353b3c3

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

353b3c3reordered
2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:29

For feng's algorithm should I implement it here in this ticket or create a new ticket ? Actually Feng's algorithm is only for directed graphs so this algorithm in this tkt is important for undirected graphs and will be used to compare timings with Feng's algorithm. So I guess its best to implement a new method.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0528edcadded tests
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 353b3c3 to 0528edc

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

20515f9revert back to set
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 0528edc to 20515f9

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:32

Replying to @sagetrac-git:

Branch pushed to git repo; I updated commit sha1. New commits:

20515f9revert back to set

I hv reverted back to set here as there can be multiple instances of same edges and vertices in exclude_edges and exclude_vertices.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 20515f9 to f7abe84

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f7abe84added more tests
dcoudert commented 5 years ago
comment:34

So I guess its best to implement a new method.

Yes, go for it.

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago
comment:35

I am planning to include some new parameters like

1.) include_vertices in shortest path functions

2.) reduced cost dictionary to shortest path functions

first one is for finding the all yellow node path in the graph as per the feng's algorithm since these number will be less as compared to exclude_vertices so this seemed a suitable parameter.

second one can be avoided but that will require to make a duplicate copy of the graph with the edge weights as reduced weights but not sure if it is good to make a copy of the graph is ok or not but using this parameter can help us in retrieving the edge weights.

Any suggestions on if anything above can be done in a better way will be helpful.

dcoudert commented 5 years ago
comment:36

I'm reluctant to add new parameters to the shortest path methods. The methods will become more complicated and difficult to maintain, and can be slow down.

An alternative is to create a new .pyx file dedicated to paths enumerations, and to put inside specialized shortest paths methods that are needed.