Open abhayptp opened 5 years ago
You propose to make a class Tree with a specific backend like we have for bipartite graphs, or just to implement a collection of algorithms optimized for trees ?
Till now, I was only thinking to implement a collection of algorithms optimized for tree at one place to prevent code clutter in the current algorithms. I can not think right now how having a specific backend for trees will prove to be helpful because data structure of Tree is same as Graph unlike Bipartite graph.
For example, another possible use case I can think of is this:
Finding if two trees are isomorphic is possible in O(V) time, which is way better than general algorithm for graph isomorphism(no polynomial time solution for graph isomorphism yet found).
Reference:
http://wwwmayr.in.tum.de/konferenzen/Jass08/courses/1/smal/Smal_Paper.pdf
But implementing this algorithm in is_isomorphic() function will introduce lot of code clutter.
I think that the tree needs to be rooted for checking the isomorphism in O(V) time. But in graph module rooted trees are not defined. But I think in combinat module rooted trees are defined.
I do not recommend to use the rooted trees of combinat here. If you need a root, you can for instance find the center of the tree in linear time.
Yes, the paper describes the same trick for checking isomorphism for unrooted trees.
We do not need to define a separate class for tree I think. Just a separate file will suffice which will contain all the tree related algorithms. @dcoudert opinions?
Feel free to initialize.
Branch pushed to git repo; I updated commit sha1. New commits:
41ee56b | Add documentation and examples |
Added documentation and examples. I have included tree_isomorphism in is_isomorphic function. So, whenever G.is_isomorphic(H) is called and G and H are trees(and edge labels is false), isomorphism is checked using tree_isomorphism function.
I have tested the function by writing a script of generating some small random trees and comparing the output of tree_isomorphism and is_isomorphic. If anyone has a better idea, please suggest.
edge_labels=True case can also be handled using tree_isomorphism I think. I will send another diff once I implement it.
I am not sure, if I should include tree_diameter() in diameter() functions, since currently it works on only unweighted trees. Same for tree_center.
- if self.is_tree() and other.is_tree() and not edge_labels:
+ if not edge_labels and self.is_tree() and other.is_tree():
Method tree_diameter
- Returns diameter of unweighted tree.
- Input:
+ Return the diameter of the unweighted tree.
+
+ INPUT:
- - ``endpoints`` -- boolean(default: ``False``); whether to return
- enpoints of diameter or not
+ - ``endpoints`` -- boolean (default: ``False``); whether to return
+ enpoints of diameter or not
it would be nice to document the algorithm used here ( a few sentences)
this will break if G
is empty: u = next(G.vertex_iterator())
and also if the input graph is not a tree, is not connected, etc.
instead of shortest_path_lengths
, you must use G.breadth_first_search(u, report_distance=True)
. You will then observe that the last reported distance is also the largest one.
Why are you only considering unweighted graphs ?
Branch pushed to git repo; I updated commit sha1. New commits:
0119f28 | WIP 3 |
some code style improvements needed like
done
it would be nice to document the algorithm used here ( a few sentences)
done
this will break if G is empty: u = next(G.vertex_iterator()) and also if the input graph is not a tree, is not connected, etc.
updated.
instead of shortest_path_lengths, you must use G.breadth_first_search(u, report_distance=True). You will then observe that the last reported distance is also the largest one.
Now, I have also added the compatibility for weighted trees. To avoid redundant code for finding shortest paths, I am using Dijkstra algorithm(so, using shortest_path_lengths). I know we can find it in linear time. But I don't think using dijkstra algorithm will affect performance that much. Or shall I write function for finding shortest_path_lenghts in trees?
Why are you only considering unweighted graphs ?
Now, it is compatible for weighted graphs. But the given algorithm works only for non-negative weights for diameter. In case of center, positive weights since including 0 weight will imply that centers can lie on vertex other than diameter.
Now, I see that diameter and center can also be found out for negative weighted trees using a different approach.
Replying to @abhayptp:
some code style improvements needed like
done
You should check the developer manual http://doc.sagemath.org/html/en/developer/ and in particular http://doc.sagemath.org/html/en/developer/coding_basics.html
When we define a method, we have first a short (1 line) description of what the method returns. Then, separated by an empty line, we have a longer description, the input block, etc.
def tree_diameter(T, endpoints=False, path=False, by_weight=False):
r"""
Return the diameter of the tree `T`.
This method use 2-sweep to compute the diameter of the tree `T`. That is,
it first chooses some random vertex `x` and finds the most distant vertex
`u` from `x`. The diameter of `T` is the largest distance from `u`.
It works only on non-negative weighted trees.
INPUT:
- ``T`` -- a Graph
Note that I use r"""
instead of """
to indicate the use of latex. Indeed, you can ut latex code like `G`
or `\sqrt{x^2 + 1}`
.
when an input is described on multiple lines, you must add 2 spaces
- - ``endpoints`` -- boolean (default: ``False``); whether to return
- enpoints of diameter or not
+ - ``endpoints`` -- boolean (default: ``False``); whether to return
+ enpoints of diameter or not
Concerning the usage of weights, you can for instance check the documentation of method shortest_path
. We usually allow to pass a weight_function
.
You can simplify you code as follows
+ if by_weight:
+ from operator import itemgetter
+ path_lengths = G.shortest_path_lengths(x, algorithm='Dijkstra_Boost', by_weight=True)
+ u = max(path_lengths.items(), key=itemgetter(1))[0]
+ path_lengths = G.shortest_path_lengths(u, algorithm='Dijkstra_Boost', by_weight=True)
+ (v, dist) = max(path_lengths.items(), key=itemgetter(1))
+ else:
+ for u in G.breadth_first_search(x):
+ pass
+ for v, dist in G.breadth_first_search(u, report_distance=True):
+ pass
.items()
instead of .iteritems()
for Python 3 compatibilityI let you do the same for the other methods.
Moving tickets from the Sage 8.8 milestone that have been actively worked on in the last six months to the next release milestone (optimistically).
Ticket retargeted after milestone closed
Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.
Setting new milestone based on a cursory review of ticket status, priority, and last modification date.
Setting a new milestone for this ticket based on a cursory review.
Currently, finding diameter and center of tree,etc. is only possible using method implemented for general graphs(having way higher time complexity than the ones which are specifically for trees).
I am proposing to implement a class for tree and implement tree specific algorithms there. And whenever we encounter a graph which is tree and there is some better algorithm to compute the asked property for tree than general graph, we convert that graph into instance of Tree and call the relevant algorithm.
There are a large number of algorithms for tree which have significantly less time complexity than algorithms for general graph. So, this can benefit a lot.
If the idea is agreed upon, I will be happy to send a diff.
CC: @dcoudert
Component: graph theory
Branch/Commit: u/gh-abhayptp/implement_algorithms_for_tree_specifically_which_have_better_time_complexity_than_for_general_graph @
0119f28
Issue created by migration from https://trac.sagemath.org/ticket/27564