sagemath / sage

Main repository of SageMath. Now open for Issues and Pull Requests.
https://www.sagemath.org
Other
1.09k stars 394 forks source link

Triconnectivity linear time algorithm #25598

Closed meghanamreddy closed 5 years ago

meghanamreddy commented 5 years ago

Addition of the linear time algorithm for building triconnected components of a biconnected graph. This is a huge algorithm and we will be coding it one function at a time.

Depends on #22157

CC: @dcoudert @dimpase @sagetrac-saiharsh

Component: graph theory

Keywords: connectivity, decomposition, gsoc2018

Author: Meghana M Reddy, Sai Harsh Tondomker, David Coudert

Branch: 65f2da3

Reviewer: David Coudert, Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/25598

meghanamreddy commented 5 years ago
comment:1

I have added the basic class structure of the Triconnectivity module. As discussed, we will first code it in Python and then convert to Cython. Harsh and I will together add functions to the module.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from bb9c543 to c85b5e8

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

c85b5e8Split_multiple_edges function is added.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f1cfce2Added the function dfs1() which builds the palm tree.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from c85b5e8 to f1cfce2

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

bcc5aeeFixed a small error in split_multi_edges()
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from f1cfce2 to bcc5aee

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from bcc5aee to 078067e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

078067eModified dfs1() to check biconnectivity
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 078067e to 9875d50

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9875d50Added the function build_acceptable_adjacency_structure()
meghanamreddy commented 5 years ago
comment:7

For the function build_acceptable_adjacency_structure(), but we require the orientation of the edges of the palm tree. Hence, in the graph_copy we are storing, we iterate through every edge, and if the orientation in graph is different from the palm tree, we reverse it. But I discovered the following -

g = Graph()
g.add_edges([(2,1)])
g.edges()
[(1, 2, None)]
g = Graph()
g.add_edges([(1,2)])
g.edges()
[(1, 2, None)]

Irrespective of the order of source and target given in the parameter of add_edges(), the edge is stored as (1, 2, None). In such a situation, the other option of reversing an edge is possible if the graph is directed. But I do not want to convert the graph to a digraph, one of the reasons being that while converting, two edges in either direction are added to digraph corresponding to every edge in the original graph.

Hence, as of now, I have stored an additional dictionary named edge_reversed, and I'm storing all the edges as keys of the dictionary and the value as True or False to denote if the edge is in the reverse direction or not. Every place where the orientation of the edge matters, I have an if and else blocks.

I don't think this is the best method. I am still looking for other methods. Please advice if SageMath has some feature which can be used here.

Also, the function dfs1() has been thoroughly tested. I am still testing the function build_acceptable_adjacency_structure(), it is hard to test without the dfs2() function I think.

dcoudert commented 5 years ago
comment:8

I don't think this is the best method. I am still looking for other methods. Please advice if SageMath has some feature which can be used here.

Using dictionary here is a good option. Another solution could be to use a directed graph. I'll let you know if I think about a better solution.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 9875d50 to 49293bf

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

49293bfAdded adjacency list to make adjacency queries linear time.
meghanamreddy commented 5 years ago
comment:10

Instead of using the O(n) edges_incident() to get the incident edges of a vertex during dfs, added a list graph_copy_adjacency which stores the incident edges of vertex i in graph_copy_adjacency[i]. Harsh, please use this list while coding the function dfs2().

dcoudert commented 5 years ago
comment:11

Just a few comments on coding style

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4b79b27Some changes in coding style
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 49293bf to 4b79b27

meghanamreddy commented 5 years ago
comment:13

I have made the changes. We will follow the same style from now on.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9e69d10Merge to version 8.3beta7.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 4b79b27 to 9e69d10

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 9e69d10 to 4d7cbe5

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4d7cbe5Added DFS2 and Path Finder
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 4d7cbe5 to 84cc2cf

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

84cc2cfUpdated dfs2 and path finder. Added some helper functions for path_search().
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a1d1336Added pathsearch() function.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 84cc2cf to a1d1336

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from a1d1336 to c5a73e9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

c5a73e9Fixed a small error
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from c5a73e9 to 79b8cb8

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b222089Update to 8.3.rc0
6ff2b41Fixed a bug with pathsearch()
79b8cb8Removed extra print statements. Error with edge_status dictionary.
meghanamreddy commented 5 years ago
comment:20

@sagetrac-saiharsh, I was debugging and fixed some bugs with pathsearch() but I found an error with the function split_multi_edges() and with the edge_status dictionary.

Run the code for a cycle on 4 vertices with one extra edge (the multi-edge).

61453475-51f7-4d25-8062-038d85d714fa commented 5 years ago
comment:21

sorted_edges = sorted(self.graph_copy.multiple_edges(labels=True)), need to check.

I feel it won't work if all the edges have same labels.

The error in split_multi_edges() because we are unable to differentiate between multiple edges, so I thought to label each edge according to entry in G.edges.

        from sage.graphs.graph import Graph
        self.graph_copy = Graph(multiedges=True)
        edges = G.edges()
        # dict to map new edges with the old edges
        self.edge_label_dict = {}
        for i in range(len(edges)):
            newEdge = tuple([edges[i][0], edges[i][1], i])
            self.graph_copy.add_edge(newEdge)
            self.edge_label_dict[newEdge] = edges[i]
        self.graph_copy = self.graph_copy.copy(implementation='c_graph')

Now edge_status will have entry of all multiple edges.

The only one problem is while returning the final answer we need to map the graph edge labels to it's previous labels using edge_label_dict. which will increase the time with O(m).

edge_label_dict will take new edge and return it's previous form.

I have manually checked split_multiple_edges is working as expected.

Please suggest me if there is a better way to solve this problem.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d0b9631Updated graph_copy and split_multiple_edges function
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 79b8cb8 to d0b9631

meghanamreddy commented 5 years ago
comment:23

There is a bug in your approach. You are modifying the graph while building graph_copy. Since you are going through the edges and only adding the modified edges, all isolated vertices will be deleted, which is not desired.

split_multi_edges is still giving the wrong output. I tried the following example:

sage: G = Graph()
sage: G.allow_multiple_edges(True)
sage: G.add_edges([(1,2),(2,3),(3,4),(4,5),(1,5),(1,5)])
sage: tric = Triconnectivity(G)

The connected component corresponding to the multi edges is being done correctly. But both the edges corresponding to (1,5) edge - (0,4,1) and (0,4,2) are given a value of 3 after split_multi_edges. Only one edge must be deleted.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a5c4ea7Fixed all bugs in path_search.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from d0b9631 to a5c4ea7

meghanamreddy commented 5 years ago
comment:25

The path_search function is working for simple graphs, and the components are being computed correctly. It will work for multi-graphs also once the bug in split_multi_edges is fixed (since the graph becomes a simple graph after the preprocessing in split_multi_edges).

The function assemble_triconnected_components has to be coded now, which seems to be simple since the components are already computed and a few of them have to be merged in this function.

Once this function is done, we can thoroughly test the code by comparing the output with #22157 and OGDF. We can then work on cythonizing the code.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1ea9c1bFixed the bug in split_multiple_edges
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from a5c4ea7 to 1ea9c1b

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2375d00Fixed a minor bug related to multi-graphs
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 1ea9c1b to 2375d00

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

6b357c7Added `assemble_triconnected_components` function.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 2375d00 to 6b357c7

meghanamreddy commented 5 years ago
comment:29

The triconnectivity is completed. The triconnected components are now correctly constructed. The triconnected components are being printed when we call the Triconnectivity function. An example-

sage: from sage.graphs.connectivity import Triconnectivity
sage: G = Graph()
sage: G.add_edges([(1,2),(1,4),(1,8),(1,12),(1,13),(2,3),(2,13),(3,4)])
sage: G.add_edges([(3,13),(4,5),(4,7),(5,6),(5,7),(5,8),(6,7),(8,9),(8,11)])
sage: G.add_edges([(8,12),(9,10),(9,11),(9,12),(10,11),(10,12)])
sage: tric = Triconnectivity(G)

In the output, the current format of vertices is numbers starting from 0 to n. Hence the edges printed will use these numberings. I am currently formatting the output to change the vertex labels to original labels. Will finish that in some time.

I am also adding comments to the code to make it more readable. Please let me know if you find any discrepancies in the output.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 6b357c7 to b1d9355

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

7f14108Update to 8.3.rc2.
b1d9355Formatted output to change vertices/edges to original labels.
meghanamreddy commented 5 years ago
comment:31

The output is now in terms of original labels. The triconnected components are printed and also stored in the variables comp_list_new and comp_type. comp_list_new[i] contains the list of edges belonging to the ith component. comp_type[i] contains the type of ith component (0=bond, 1=polygon, 2=triconnected component). These can also be accessed as -

tric = Triconnectivity(G)
tric.comp_list_new
tric.comp_type

Also, in the triconnected components, all the original edges have labels as per the input graph G. However, virtual edges have a label starting with newVEdge followed by the number of the virtual edge, i.e., newVEdge5 or newVEdge14.