Implement algorithms for tree specifically which have better time complexity than for general graph

abhayptp commented 5 years ago

Currently, finding diameter and center of tree,etc. is only possible using method implemented for general graphs(having way higher time complexity than the ones which are specifically for trees).

I am proposing to implement a class for tree and implement tree specific algorithms there. And whenever we encounter a graph which is tree and there is some better algorithm to compute the asked property for tree than general graph, we convert that graph into instance of Tree and call the relevant algorithm.

There are a large number of algorithms for tree which have significantly less time complexity than algorithms for general graph. So, this can benefit a lot.

If the idea is agreed upon, I will be happy to send a diff.

CC: @dcoudert

Component: graph theory

Branch/Commit: u/gh-abhayptp/implement_algorithms_for_tree_specifically_which_have_better_time_complexity_than_for_general_graph @ 0119f28

Issue created by migration from https://trac.sagemath.org/ticket/27564

dcoudert commented 5 years ago

comment:2

You propose to make a class Tree with a specific backend like we have for bipartite graphs, or just to implement a collection of algorithms optimized for trees ?

abhayptp commented 5 years ago

comment:3

Till now, I was only thinking to implement a collection of algorithms optimized for tree at one place to prevent code clutter in the current algorithms. I can not think right now how having a specific backend for trees will prove to be helpful because data structure of Tree is same as Graph unlike Bipartite graph.

abhayptp commented 5 years ago

comment:4

For example, another possible use case I can think of is this:

Finding if two trees are isomorphic is possible in O(V) time, which is way better than general algorithm for graph isomorphism(no polynomial time solution for graph isomorphism yet found).

Reference:

http://wwwmayr.in.tum.de/konferenzen/Jass08/courses/1/smal/Smal_Paper.pdf

But implementing this algorithm in is_isomorphic() function will introduce lot of code clutter.

2460b664-5ecd-43fc-9bbe-d0f333762988 commented 5 years ago

comment:5

I think that the tree needs to be rooted for checking the isomorphism in O(V) time. But in graph module rooted trees are not defined. But I think in combinat module rooted trees are defined.

dcoudert commented 5 years ago

comment:6

I do not recommend to use the rooted trees of combinat here. If you need a root, you can for instance find the center of the tree in linear time.

abhayptp commented 5 years ago

comment:7

Yes, the paper describes the same trick for checking isomorphism for unrooted trees.

We do not need to define a separate class for tree I think. Just a separate file will suffice which will contain all the tree related algorithms. @dcoudert opinions?

dcoudert commented 5 years ago

comment:8

Feel free to initialize.

abhayptp commented 5 years ago

Branch: u/gh-abhayptp/implement_algorithms_for_tree_specifically_which_have_better_time_complexity_than_for_general_graph

abhayptp commented 5 years ago

Commit: 68dadad

abhayptp commented 5 years ago

comment:10

Tests and documentation is yet to be added and some changes are yet to be done. I will update status to 'Needs review' once it's done.

New commits:

`2b0d55b`	`Add algorithms(center, diameter, isomorphism) for tree`
`68dadad`	`Add generateMapping for isomorphism on trees`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 68dadad to 41ee56b

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

41ee56b Add documentation and examples

abhayptp commented 5 years ago

comment:12

Added documentation and examples. I have included tree_isomorphism in is_isomorphic function. So, whenever G.is_isomorphic(H) is called and G and H are trees(and edge labels is false), isomorphism is checked using tree_isomorphism function.

I have tested the function by writing a script of generating some small random trees and comparing the output of tree_isomorphism and is_isomorphic. If anyone has a better idea, please suggest.

edge_labels=True case can also be handled using tree_isomorphism I think. I will send another diff once I implement it.

I am not sure, if I should include tree_diameter() in diameter() functions, since currently it works on only unweighted trees. Same for tree_center.

dcoudert commented 5 years ago

comment:13

-        if self.is_tree() and other.is_tree() and not edge_labels:
+        if not edge_labels and self.is_tree() and other.is_tree():

Method tree_diameter

some code style improvements needed like

-    Returns diameter of unweighted tree.
-    Input:
+    Return the diameter of the unweighted tree.
+
+    INPUT:

-    - ``endpoints`` -- boolean(default: ``False``); whether to return
-    enpoints of diameter or not
+    - ``endpoints`` -- boolean (default: ``False``); whether to return
+      enpoints of diameter or not

it would be nice to document the algorithm used here ( a few sentences)
this will break if G is empty: u = next(G.vertex_iterator())
and also if the input graph is not a tree, is not connected, etc.
instead of shortest_path_lengths, you must use G.breadth_first_search(u, report_distance=True). You will then observe that the last reported distance is also the largest one.
Why are you only considering unweighted graphs ?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 41ee56b to 57fcd37

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`c1e1b26`	`WIP 1`
`57fcd37`	`Add compatibility for weighted trees and fix code style`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 57fcd37 to 0119f28

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0119f28 WIP 3

abhayptp commented 5 years ago

comment:16

some code style improvements needed like

done

it would be nice to document the algorithm used here ( a few sentences)

done

this will break if G is empty: u = next(G.vertex_iterator()) and also if the input graph is not a tree, is not connected, etc.

updated.

instead of shortest_path_lengths, you must use G.breadth_first_search(u, report_distance=True). You will then observe that the last reported distance is also the largest one.

Now, I have also added the compatibility for weighted trees. To avoid redundant code for finding shortest paths, I am using Dijkstra algorithm(so, using shortest_path_lengths). I know we can find it in linear time. But I don't think using dijkstra algorithm will affect performance that much. Or shall I write function for finding shortest_path_lenghts in trees?

Why are you only considering unweighted graphs ?

Now, it is compatible for weighted graphs. But the given algorithm works only for non-negative weights for diameter. In case of center, positive weights since including 0 weight will imply that centers can lie on vertex other than diameter.

abhayptp commented 5 years ago

comment:17

Now, I see that diameter and center can also be found out for negative weighted trees using a different approach.

dcoudert commented 5 years ago

comment:18

Replying to @abhayptp:

some code style improvements needed like

done

You should check the developer manual http://doc.sagemath.org/html/en/developer/ and in particular http://doc.sagemath.org/html/en/developer/coding_basics.html

When we define a method, we have first a short (1 line) description of what the method returns. Then, separated by an empty line, we have a longer description, the input block, etc.

def tree_diameter(T, endpoints=False, path=False, by_weight=False):
    r"""
    Return the diameter of the tree `T`.

    This method use 2-sweep to compute the diameter of the tree `T`. That is,
    it first chooses some random vertex `x` and finds the most distant vertex
    `u` from `x`. The diameter of `T` is the largest distance from `u`.

    It works only on non-negative weighted trees.

    INPUT:

    - ``T`` -- a Graph

Note that I use r""" instead of """ to indicate the use of latex. Indeed, you can ut latex code like `G` or `\sqrt{x^2 + 1}`.
when an input is described on multiple lines, you must add 2 spaces

-    - ``endpoints`` -- boolean (default: ``False``); whether to return
-    enpoints of diameter or not
+    - ``endpoints`` -- boolean (default: ``False``); whether to return
+      enpoints of diameter or not

Concerning the usage of weights, you can for instance check the documentation of method shortest_path. We usually allow to pass a weight_function.
You can simplify you code as follows

+    if by_weight:
+        from operator import itemgetter
+        path_lengths = G.shortest_path_lengths(x, algorithm='Dijkstra_Boost', by_weight=True)
+        u = max(path_lengths.items(), key=itemgetter(1))[0]
+        path_lengths = G.shortest_path_lengths(u, algorithm='Dijkstra_Boost', by_weight=True)
+        (v, dist) = max(path_lengths.items(), key=itemgetter(1))
+    else:
+        for u in G.breadth_first_search(x):
+            pass
+        for v, dist in G.breadth_first_search(u, report_distance=True):
+            pass

Note that it's better to use .items() instead of .iteritems() for Python 3 compatibility

I let you do the same for the other methods.

embray commented 5 years ago

comment:19

Moving tickets from the Sage 8.8 milestone that have been actively worked on in the last six months to the next release milestone (optimistically).

embray commented 4 years ago

comment:20

Ticket retargeted after milestone closed

mkoeppe commented 4 years ago

comment:21

Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.

mkoeppe commented 3 years ago

comment:24

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

mkoeppe commented 3 years ago

comment:25

Setting a new milestone for this ticket based on a cursory review.

sagemath / sage

Implement algorithms for tree specifically which have better time complexity than for general graph #27564