Open darwinrlo opened 4 years ago
Version v0.1
During a traversal of a graph, when you encounter a node again it doesn't necessarily mean there's a cycle. Perhaps it is a descendant of two different nodes.
If there's a cycle, it means that there are vertices whose in-degrees will never reach 0. Hence, they will not be added to the sources queue, and hence, these vertices are never added to the result. So we can check the result list to see if it's smaller than the number of vertices.
This problem could be presented in two different ways: (u, v) could imply u must come before v or it could imply that v must come before u.
O(V + E). It's linear in the number of vertices and edges. Usually, you can't do better than linear on a graph problem like this, especially one that involves touching each edge and vertex once.
Note that, at minimum, each vertex must be added to the result list. So it is impossible to do better than O(V).
Vertices from the same level in the dag can go into the ordering in any order w.r.t. each other.
It doesn't matter in what order vertices from the same level are added to the ordering, and vertices can be added to the ordering level-by-level. But there are more possible ways to construct orderings.
It is possible that the best we can do is say that this takes O(V!) time and O(V!) space.
In the generation of a single ordering, each vertex is inserted into the queue exactly once and each edge is processed exactly once (as part of processing its source vertex). There are V! possible orderings -- admittedly not all of which are going to be valid orderings. Nonetheless, the worst-case time complexity is O(V! x E).
The height of a tree is dominated by the maximum root-leaf path length. So we have to focus on making the maximum root-leaf path length as small as possible.
There might be several, but consider the case in which there is only one path through the graph with the maximum length. If we choose a vertex not on the path to be the tree root, then the path will be present in the tree in its entirety ー along with some additional vertices, which would most certainly make the root-leaf path longer than it can be. So we have to choose a vertex on the longest path to be the root.
Choosing a vertex on the longest path will break up the path into two root-leaf paths. To get two root-leaf paths that are as short as possible, we need to choose the middle vertex, or if the path is of even length, one of the two middle vertices.
In regard to graphs in which more than one path has the maximum length, I will make the following claim: If there are two paths through the graph with the maximum length, then they have the same middle vertex, or if the length is even, two middle vertices. If two paths have the same length and have a different middle vertex, or if the path is even, two middle vertices, then these paths do not have the maximum length. This will be expanded upon in a future draft.
A leaf vertex is most definitely not going to be the tree root of a minimum height tree. If you simply select its adjacent vertex as a tree root, you've already reduced the tree height by 1. So remove leaf vertices from the graph. Once you do that, you'll have new leaf vertices and the same reasoning applies. Remove leaf vertices repeatedly until there is one vertex or two vertices with one edge between them.
Keep a count of the vertices remaining in the graph. Every time a vertex is "removed" from the graph, decrement this count. Once the number of vertices drops below 2, instead of proceeding to process the leaf vertices in the queue, just go ahead and return the remaining vertex or vertices.
Repl
When there is a partial ordering, topological sort is likely the way to go.
Create a list of the vertices such that vertices do not appear before their ancestors.
Can we make progress on the problem by repeatedly removing the leaf vertices?