ellson / MOTHBALLED-graphviz

Moved to https://gitlab.com/graphviz/graphviz
Eclipse Public License 1.0

[Dot] "Error: trouble in init_rank" puzzle finally solved! #1173

Open ens-lg4 opened 8 years ago

ens-lg4 commented 8 years ago

Dear GraphViz developers,

We were migrating our software that uses "dot" for graph generation to another RedHat Linux machine. I was puzzled by the fact that the same code behaved differently on two computers, although we made sure all the installed packages/libraries were the same. The same version of "dot" (actually the same binary living on a shared mounted filesystem) with the same input file was working fine on one computer and throwing the infamous "trouble in init_rank" error on another.

Today we finally tracked it down to a single culprit: the availability of Type1 fonts in /usr/share/fonts/! The correlation is 100%: adding the files solves the problem, and removing them causes the "trouble in init_rank" crash.

Apparently, before laying out the graph, "dot" uses (possibly indirectly, via png/svg libraries?) the system's font files to estimate the sizes of text-containing elements. For some reason, the failure to find those files is not properly caught or reported, and the missing data somehow causes the "trouble in init_rank" error.

It was very confusing that smaller graphs did not crash. Probably this dependence on font files is masked by some other, stronger conditions.

The same behaviour has been seen in "dot" versions from 2.28.0 through 2.38.0 (under Ubuntu and RedHat). The test file is attached. To replicate, you can either move the files out of and back into /usr/share/fonts, or simply change the font paths in /etc/fonts/fonts.conf to temporarily mask the actual font files.
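
For anyone who would rather not touch the system font directories, here is a rough sketch of a less invasive way to mask the fonts. It assumes the "dot" build in question finds fonts through fontconfig and therefore honours the FONTCONFIG_FILE environment variable, which has not been verified in this report. The helper name `make_fontless_env` and the file name `input.dot` are made up for illustration.

```python
import os
import tempfile


def make_fontless_env(workdir=None):
    """Return a copy of the environment that points fontconfig at an empty font dir.

    Hypothetical helper: it only prepares the environment, it does not run dot.
    """
    workdir = workdir or tempfile.mkdtemp(prefix="nofonts-")
    empty_dir = os.path.join(workdir, "no-fonts")
    os.makedirs(empty_dir, exist_ok=True)

    # Minimal fontconfig file whose only <dir> contains no font files.
    conf_path = os.path.join(workdir, "fonts.conf")
    with open(conf_path, "w") as fh:
        fh.write(
            '<?xml version="1.0"?>\n'
            '<!DOCTYPE fontconfig SYSTEM "fonts.dtd">\n'
            "<fontconfig>\n"
            f"  <dir>{empty_dir}</dir>\n"
            "</fontconfig>\n"
        )

    env = dict(os.environ)
    env["FONTCONFIG_FILE"] = conf_path  # read by fontconfig instead of the default config
    return env


# Usage (assumes "dot" is on PATH; "input.dot" stands in for the attached test file):
#   import subprocess
#   subprocess.run(["dot", "-Tpng", "input.dot", "-o", "out.png"], env=make_fontless_env())
```

If the error appears with this environment and disappears without it, that would match the behaviour described above without moving any files.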

Thank you in advance for looking into this!

trouble_in_init_rank.zip

bugfood commented 8 years ago

I cannot test the example provided--my distro-provided graphviz (2.38.0 on Debian) works fine, and my self-compiled graphviz from git actually segfaults, which I did not look into.

Unfortunately, I do not think this is a generic solution to init_rank trouble. I've been trying to debug an init_rank issue I have, and I'm pretty much at the point where I have to give up and admit I can't fix it in the time I have available. I'll write up more of my findings later, but, in brief, here is my understanding of the trouble in init_rank. There may be errors in the following--I don't have historical knowledge of graphviz or its techniques.

  1. Dot cannot directly handle cyclic graphs. Cyclic graphs are made acyclic by selectively reversing the direction of edges, while keeping track of the original direction so that arrows can be drawn properly later. There is an early processing stage that handles this, after which the internal representation of the graph must remain acyclic.

  2. Graphviz inserts many virtual nodes and edges into the internal graph to do things like control layout, draw labels, and route edges. Graphviz creates edges between virtual nodes, and, as mentioned, the graph must remain acyclic.

  3. init_rank checks all nodes (including virtual nodes) by using a method that relies on the graph being acyclic. Cycles prevent init_rank from checking all nodes--the "trouble in init_rank" warning shows up when the number of nodes checked is less than the total nodes in the graph. Thus, when init_rank complains, the trouble actually happened earlier.
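
To make the three points above concrete, here is a small sketch. It is not Graphviz's actual code, and the function names `break_cycles` and `count_ranked` are invented for illustration. It reverses the back edges found by a depth-first search to get an acyclic graph, then walks the nodes in topological order and counts how many it reaches. A count lower than the number of nodes corresponds to the situation in which init_rank complains.

```python
from collections import deque


def break_cycles(nodes, edges):
    """Reverse DFS back edges; return (acyclic_edges, set_of_reversed_original_edges)."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        if u != v:  # simplification: self-loops are dropped in this sketch
            adj[u].append(v)

    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on the DFS stack, 2 = finished
    out, reversed_edges = [], set()

    def visit(u):
        state[u] = 1
        for v in adj[u]:
            if state[v] == 1:
                # Back edge: it closes a cycle, so store it flipped and remember
                # the original direction for drawing the arrow later.
                out.append((v, u))
                reversed_edges.add((u, v))
            else:
                out.append((u, v))
                if state[v] == 0:
                    visit(v)
        state[u] = 2

    for n in nodes:
        if state[n] == 0:
            visit(n)
    return out, reversed_edges


def count_ranked(nodes, edges):
    """Count nodes reachable by repeatedly removing nodes with no incoming edges."""
    indeg = {n: 0 for n in nodes}
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1

    queue = deque(n for n in nodes if indeg[n] == 0)
    seen = 0
    while queue:
        u = queue.popleft()
        seen += 1
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return seen  # seen < len(nodes) means a cycle blocked the traversal


nodes = ["a", "b", "c"]
cyclic = [("a", "b"), ("b", "c"), ("c", "a")]
fixed, flipped = break_cycles(nodes, cyclic)
print(count_ranked(nodes, cyclic), count_ranked(nodes, fixed), flipped)
```

On the cyclic input the traversal cannot start at all, because every node has an incoming edge. After the back edge is flipped, all three nodes are counted. If a later stage were to add an edge that reintroduces a cycle, the count would drop again, which is the sense in which the trouble happens before init_rank reports it.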

In the case I am examining, the trouble occurs when graphviz decides to draw a labeled edge originating in a cluster and terminating outside the cluster as a flat edge. Something about that, in conjunction with surrounding virtual nodes, creates a cyclical graph. I have been unable to determine why.

One thing that is clear is that subtle changes to the input dot file can dramatically change the layout of the resulting graph. This means that sometimes an unrelated change can "fix" init_rank trouble simply by avoiding the layout that had a problem.

-Corey