Triconnectivity linear time algorithm

meghanamreddy commented 5 years ago

Addition of the linear time algorithm for building triconnected components of a biconnected graph. This is a huge algorithm and we will be coding it one function at a time.

Depends on #22157

CC: @dcoudert @dimpase @sagetrac-saiharsh

Component: graph theory

Keywords: connectivity, decomposition, gsoc2018

Author: Meghana M Reddy, Sai Harsh Tondomker, David Coudert

Branch: 65f2da3

Reviewer: David Coudert, Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/25598

meghanamreddy commented 5 years ago

comment:1

I have added the basic class structure of the Triconnectivity module. As discussed, we will first code it in Python and then convert to Cython. Harsh and I will together add functions to the module.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from bb9c543 to c85b5e8

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

c85b5e8 Split_multiple_edges function is added.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f1cfce2 Added the function dfs1() which builds the palm tree.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from c85b5e8 to f1cfce2

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

bcc5aee Fixed a small error in split_multi_edges()

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from f1cfce2 to bcc5aee

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from bcc5aee to 078067e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

078067e Modified dfs1() to check biconnectivity

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 078067e to 9875d50

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9875d50 Added the function build_acceptable_adjacency_structure()

meghanamreddy commented 5 years ago

comment:7

For the function build_acceptable_adjacency_structure(), but we require the orientation of the edges of the palm tree. Hence, in the graph_copy we are storing, we iterate through every edge, and if the orientation in graph is different from the palm tree, we reverse it. But I discovered the following -

g = Graph()
g.add_edges([(2,1)])
g.edges()
[(1, 2, None)]

g = Graph()
g.add_edges([(1,2)])
g.edges()
[(1, 2, None)]

Irrespective of the order of source and target given in the parameter of add_edges(), the edge is stored as (1, 2, None). In such a situation, the other option of reversing an edge is possible if the graph is directed. But I do not want to convert the graph to a digraph, one of the reasons being that while converting, two edges in either direction are added to digraph corresponding to every edge in the original graph.

Hence, as of now, I have stored an additional dictionary named edge_reversed, and I'm storing all the edges as keys of the dictionary and the value as True or False to denote if the edge is in the reverse direction or not. Every place where the orientation of the edge matters, I have an if and else blocks.

I don't think this is the best method. I am still looking for other methods. Please advice if SageMath has some feature which can be used here.

Also, the function dfs1() has been thoroughly tested. I am still testing the function build_acceptable_adjacency_structure(), it is hard to test without the dfs2() function I think.

dcoudert commented 5 years ago

comment:8

I don't think this is the best method. I am still looking for other methods. Please advice if SageMath has some feature which can be used here.

Using dictionary here is a good option. Another solution could be to use a directed graph. I'll let you know if I think about a better solution.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 9875d50 to 49293bf

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

49293bf Added adjacency list to make adjacency queries linear time.

meghanamreddy commented 5 years ago

comment:10

Instead of using the O(n) edges_incident() to get the incident edges of a vertex during dfs, added a list graph_copy_adjacency which stores the incident edges of vertex i in graph_copy_adjacency[i]. Harsh, please use this list while coding the function dfs2().

dcoudert commented 5 years ago

comment:11

Just a few comments on coding style

if (self.dfs_number[w] < self.lowpt1[v]):. You don't need to add the brackets.
if check == True: -> if check:
if len(comp): -> if comp:. Indeed, if comp == [], the test is false, and if comp is not empty, the test will be true.
if self.edge_status[e] != 0 : -> if self.edge_status[e]:. It's False only if the status is 0.
first_son == None -> first_son is None is better I think.
use range and not xrange. In Python 3, range becomes an iterator.
edge_reverse. Instead of a dictionary, you could use a set and consider that an edge is reversed if it belongs to that set. It's not very important.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4b79b27 Some changes in coding style

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 49293bf to 4b79b27

meghanamreddy commented 5 years ago

comment:13

I have made the changes. We will follow the same style from now on.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9e69d10 Merge to version 8.3beta7.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 4b79b27 to 9e69d10

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 9e69d10 to 4d7cbe5

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4d7cbe5 Added DFS2 and Path Finder

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 4d7cbe5 to 84cc2cf

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

84cc2cf Updated dfs2 and path finder. Added some helper functions for path_search().

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a1d1336 Added pathsearch() function.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 84cc2cf to a1d1336

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from a1d1336 to c5a73e9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

c5a73e9 Fixed a small error

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from c5a73e9 to 79b8cb8

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`b222089`	`Update to 8.3.rc0`
`6ff2b41`	`Fixed a bug with pathsearch()`
`79b8cb8`	`Removed extra print statements. Error with edge_status dictionary.`

meghanamreddy commented 5 years ago

comment:20

@sagetrac-saiharsh, I was debugging and fixed some bugs with pathsearch() but I found an error with the function split_multi_edges() and with the edge_status dictionary.

In split_multi_edges(), sorted_edges = sorted(self.graph_copy.multiple_edges(labels=False)) removes the edge labels, however in edge_status dictionary, we store the edges along with the labels. Hence, if an edge (3,4,None) is removed and updated in split_multi_edges(), a new item (3,4) is added to edge_status with label 3 (i.e., denoting removed), whereas the item (3,4,None) remains with old label. In the rest of the algorithm, (3,4,None) is used, which leads to an error. For now I have changed that line to sorted_edges = sorted(self.graph_copy.multiple_edges(labels=True)), need to check.
A more fundamental error is with the dictionary edge_status itself. If we have multiedges (1,2) and (1,2) in the input graph, in split_multi_edges() we intend to change the status of one of the edges to removed and the other remains as a palm tree arc. But since the two edges are identical, one over-writes the other in the dictionary edge_status. The dictionary edge_status has only one copy of the edge for every set of multi-edges. I had assumed a dictionary would be a good way to store status of the edges but did not think about this error. Please suggest a way in which we can resolve this error. I will also update if I find some fix.

Run the code for a cycle on 4 vertices with one extra edge (the multi-edge).

61453475-51f7-4d25-8062-038d85d714fa commented 5 years ago

comment:21

sorted_edges = sorted(self.graph_copy.multiple_edges(labels=True)), need to check.

I feel it won't work if all the edges have same labels.

The error in split_multi_edges() because we are unable to differentiate between multiple edges, so I thought to label each edge according to entry in G.edges.

        from sage.graphs.graph import Graph
        self.graph_copy = Graph(multiedges=True)
        edges = G.edges()
        # dict to map new edges with the old edges
        self.edge_label_dict = {}
        for i in range(len(edges)):
            newEdge = tuple([edges[i][0], edges[i][1], i])
            self.graph_copy.add_edge(newEdge)
            self.edge_label_dict[newEdge] = edges[i]
        self.graph_copy = self.graph_copy.copy(implementation='c_graph')

Now edge_status will have entry of all multiple edges.

The only one problem is while returning the final answer we need to map the graph edge labels to it's previous labels using edge_label_dict. which will increase the time with O(m).

edge_label_dict will take new edge and return it's previous form.

I have manually checked split_multiple_edges is working as expected.

Please suggest me if there is a better way to solve this problem.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d0b9631 Updated graph_copy and split_multiple_edges function

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 79b8cb8 to d0b9631

meghanamreddy commented 5 years ago

comment:23

There is a bug in your approach. You are modifying the graph while building graph_copy. Since you are going through the edges and only adding the modified edges, all isolated vertices will be deleted, which is not desired.

split_multi_edges is still giving the wrong output. I tried the following example:

sage: G = Graph()
sage: G.allow_multiple_edges(True)
sage: G.add_edges([(1,2),(2,3),(3,4),(4,5),(1,5),(1,5)])
sage: tric = Triconnectivity(G)

The connected component corresponding to the multi edges is being done correctly. But both the edges corresponding to (1,5) edge - (0,4,1) and (0,4,2) are given a value of 3 after split_multi_edges. Only one edge must be deleted.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a5c4ea7 Fixed all bugs in path_search.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from d0b9631 to a5c4ea7

meghanamreddy commented 5 years ago

comment:25

The path_search function is working for simple graphs, and the components are being computed correctly. It will work for multi-graphs also once the bug in split_multi_edges is fixed (since the graph becomes a simple graph after the preprocessing in split_multi_edges).

The function assemble_triconnected_components has to be coded now, which seems to be simple since the components are already computed and a few of them have to be merged in this function.

Once this function is done, we can thoroughly test the code by comparing the output with #22157 and OGDF. We can then work on cythonizing the code.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1ea9c1b Fixed the bug in split_multiple_edges

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from a5c4ea7 to 1ea9c1b

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2375d00 Fixed a minor bug related to multi-graphs

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 1ea9c1b to 2375d00

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

6b357c7 Added `assemble_triconnected_components` function.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 2375d00 to 6b357c7

meghanamreddy commented 5 years ago

comment:29

The triconnectivity is completed. The triconnected components are now correctly constructed. The triconnected components are being printed when we call the Triconnectivity function. An example-

sage: from sage.graphs.connectivity import Triconnectivity
sage: G = Graph()
sage: G.add_edges([(1,2),(1,4),(1,8),(1,12),(1,13),(2,3),(2,13),(3,4)])
sage: G.add_edges([(3,13),(4,5),(4,7),(5,6),(5,7),(5,8),(6,7),(8,9),(8,11)])
sage: G.add_edges([(8,12),(9,10),(9,11),(9,12),(10,11),(10,12)])
sage: tric = Triconnectivity(G)

In the output, the current format of vertices is numbers starting from 0 to n. Hence the edges printed will use these numberings. I am currently formatting the output to change the vertex labels to original labels. Will finish that in some time.

I am also adding comments to the code to make it more readable. Please let me know if you find any discrepancies in the output.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 6b357c7 to b1d9355

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`7f14108`	`Update to 8.3.rc2.`
`b1d9355`	`Formatted output to change vertices/edges to original labels.`

meghanamreddy commented 5 years ago

comment:31

The output is now in terms of original labels. The triconnected components are printed and also stored in the variables comp_list_new and comp_type. comp_list_new[i] contains the list of edges belonging to the ith component. comp_type[i] contains the type of ith component (0=bond, 1=polygon, 2=triconnected component). These can also be accessed as -

tric = Triconnectivity(G)
tric.comp_list_new
tric.comp_type

Also, in the triconnected components, all the original edges have labels as per the input graph G. However, virtual edges have a label starting with newVEdge followed by the number of the virtual edge, i.e., newVEdge5 or newVEdge14.

sagemath / sage

Triconnectivity linear time algorithm #25598