Closed visubesy closed 1 year ago
I found newer builds of graphviz. I tried the build 2.41.314, "Environment: build_system=msbuild; Configuration: Release".
There is no neato.exe anymore, but I can reproduce an identical crash with dot.exe.
Program arguments:
dot.exe -Tdot -n -Gsplines=ortho -Kneato
Input data:
Same as in my initial post.
Note: This program call doesn't work at all with the build 2.41.314, "Environment: build_system=cmake, generator=Visual Studio 14 2015; Configuration: Release". Instead I get the error message:
Format: "dot" not recognized. Use one of:
I'm able to reproduce the issue; it happens with both the 32-bit and 64-bit versions of Graphviz on Windows, but not on Linux. The issue is located in the functions newnode and newtrap in lib/ortho/trapezoid.c:
/* static int chain_idx, op_idx, mon_idx; */
static int q_idx;
static int tr_idx;
static int QSIZE;
static int TRSIZE;

/* Return a new node to be added into the query tree */
static int newnode(void)
{
    if (q_idx < QSIZE)
        return q_idx++;
    else {
        fprintf(stderr, "newnode: Query-table overflow\n");
        assert(0);
        return -1;
    }
}

/* Return a free trapezoid */
static int newtrap(trap_t* tr)
{
    if (tr_idx < TRSIZE) {
        fprintf(stderr, "calling newtrap: %d\n", tr_idx);
        tr[tr_idx].lseg = -1;
        tr[tr_idx].rseg = -1;
        tr[tr_idx].state = ST_VALID;
        return tr_idx++;
    }
    else {
        fprintf(stderr, "newtrap: Trapezoid-table overflow %d\n", tr_idx);
        assert(0);
        return -1;
    }
}
The value TRSIZE gets set to 1901, and the check tr_idx < TRSIZE fails when tr_idx gets too large. On Linux (Debian), however, tr_idx only reaches 1657, so all is fine. Could one of the other developers shed some light on this?
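The bounded-index pattern used by newnode and newtrap can be illustrated in isolation. The following is a minimal sketch, not Graphviz code: slot_pool, pool_alloc and pool_fill are hypothetical names introduced only for this example. It shows why a caller has to check for -1. When assert is compiled out (NDEBUG defined, as is common for release configurations), execution continues after the overflow message, which would be consistent with the release builds in this thread printing the message repeatedly while the debug build stops at the first assertion.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the trapezoid table bookkeeping. */
typedef struct {
    int next;     /* next free index, plays the role of tr_idx */
    int capacity; /* fixed bound, plays the role of TRSIZE */
} slot_pool;

/* Hand out the next index, or -1 once the fixed bound is reached. */
static int pool_alloc(slot_pool *p)
{
    assert(p != NULL);
    if (p->next < p->capacity)
        return p->next++;
    fprintf(stderr, "pool_alloc: table overflow %d\n", p->next);
    return -1;
}

/* Request `wanted` slots and stop at the first failure, so that -1 is
 * never used as an array index. Returns the number of slots granted. */
static int pool_fill(slot_pool *p, int wanted)
{
    int granted = 0;
    while (granted < wanted && pool_alloc(p) >= 0)
        granted++;
    return granted;
}
```

With a capacity of 1901, a request for 1657 slots (the Linux figure above) succeeds in full; with a capacity of 3, a request for 5 slots grants 3 and reports one overflow.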
Note that there really is only one layout program; neato and the rest are just links to dot. Alternatively, as you noticed, you can specify the layout algorithm using the -K flag.
This would probably be trivial to debug with access to a Windows debugger, especially if one could compare that with Linux output. A first step would be to remove as many nodes as possible. Of course, one could always extend the trapezoid array, but the bound used should be adequate.
I tried to reduce the example but I couldn't get it much smaller. If I remove any further box, there is no crash anymore on my Windows 7 system (with dot - graphviz version 2.39.20160612.1140 (20090106.0545), i.e. the msbuild build mentioned in my second comment).
Here is the reduced example: graphviz-neato-crash-reduced.txt
The error output with the reduced example is:
newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881
On some of the tries on Windows 10 (with neato - graphviz version 2.36.0 (20140111.2315)) with the reduced example I additionally got the following error messages after the error messages listed above:
failed at node 0[1]
failed at node 2[0]
failed at node 3[0]
out of memory
@emden Why is there a bound on the number of trapezoids? Wouldn't it be better to dynamically add and allow any number of trapezoids?
Because one should be able to bound the possible number of trapezoids given the number of rectangular objects. There must be something in the Windows code that makes this analysis not work.
One could write a more defensive code which would allow one to exceed the bound, but that would be hiding something wrong in the code.
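The trade-off discussed here can be sketched as a table that grows on demand but still records that the theoretical bound was exceeded, so the underlying problem is not silently hidden. This is only an illustration under stated assumptions: grow_table, table_init, table_new and table_free are hypothetical names, the payload is a plain int rather than a trapezoid struct, and nothing here reflects how Graphviz actually handles the table.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable table that remembers its expected bound. */
typedef struct {
    int *items;    /* payload; a real table would hold trapezoid structs */
    int  used;     /* number of slots handed out so far */
    int  capacity; /* number of slots currently allocated */
    int  expected; /* theoretical bound computed from the input size */
    int  exceeded; /* set to 1 once the expected bound has been passed */
} grow_table;

/* Allocate the table at its expected size. Returns 0 on success, -1 on failure. */
static int table_init(grow_table *t, int expected)
{
    assert(t != NULL);
    if (expected < 1)
        return -1;
    t->items = malloc((size_t)expected * sizeof *t->items);
    if (t->items == NULL)
        return -1;
    t->used = 0;
    t->capacity = expected;
    t->expected = expected;
    t->exceeded = 0;
    return 0;
}

/* Return a fresh slot index, doubling the storage if the bound is hit.
 * Returns -1 only if memory cannot be obtained. */
static int table_new(grow_table *t)
{
    assert(t != NULL);
    if (t->used == t->capacity) {
        int ncap = t->capacity * 2;
        int *grown = realloc(t->items, (size_t)ncap * sizeof *t->items);
        if (grown == NULL)
            return -1;
        t->items = grown;
        t->capacity = ncap;
        if (!t->exceeded) {
            fprintf(stderr, "table_new: expected bound %d exceeded\n", t->expected);
            t->exceeded = 1;
        }
    }
    t->items[t->used] = 0;
    return t->used++;
}

/* Release the storage. */
static void table_free(grow_table *t)
{
    assert(t != NULL);
    free(t->items);
    t->items = NULL;
    t->used = t->capacity = 0;
}
```

The exceeded flag keeps the diagnostic value of the original assertion: the layout can finish, but a run that needs more slots than the analysis predicts is still reported.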
Is there some way I could assist in fixing this bug as someone who doesn't know anything about the code of graphviz?
The crash currently appears in a production release of one of our tools and there is no workaround.
This is to confirm that on Mac OSX, the graph renders as shown in this attachment. The graph has only four edges. Three are rendered as straight lines between nodes on the same level. One is rendered with two bends and is between nodes on adjacent levels. These are not complicated cases.
You can try to increase QSIZE or TRSIZE, but as already mentioned it is unlikely to help: the routes are so simple that something fundamental seems to be wrong.
One approach would be to dig in to understand the code in graphviz/lib/ortho and determine why there is a problem, starting with, which edge is being routed when the failure occurs. I am not familiar with this code. Emden Gansner wrote it. He's a better, more thoughtful programmer than me, fortunately. The module is about 4500 lines long and there are a lot of useful comments. I'm not sure what algorithm it is based on. It would be helpful to identify that in the hope of finding a publication that would provide some guidance about how the router is supposed to work. Possibly you could trace the Mac (or Linux?) version side by side with the Windows version and determine what goes wrong without needing to develop a deeper understanding of how the router works. Maybe there is a problem of numerical stability somewhere.
Neither Emden nor I use Windows much, so we can't help much with that.
This brings up a larger issue, of how open source code that provides useful infrastructure but is supported by only one or two people can be maintained, especially when the principals have moved on to work on other projects. There doesn't seem to be a good answer. You could try looking on BountySource or Kickstarter. In the past I checked out whether it would be practical to look for, say, NIH funding to support Graphviz, at the level of say 2 full-time staff. An experienced, successful grant writer told me he felt we had a very good case, so if we would spend 12 weeks of full-time effort writing a proposal (including getting some strong external endorsements, and explaining what new contributions would be supported by the grant), probably piggybacking on some larger institution (that would have to be sold on the project) then our chances of getting funded might be as high as 50%. I just can't expend a lot of effort on something as risky and difficult as that when there is so much other visualization R&D out there that is fairly easy to get without much overhead. A commercial product might want to move to a commercially supported software platform like Tom Sawyer Software or yWorks for example.
The algorithm is based on the work done with a summer intern long ago. The basic algorithm is fairly pretty but it was never published unfortunately. I believe I still have notes on it. I have considered replacing it with the one from Monash, but that would be a lot of work and the latter algorithm also has weaknesses.
As Stephen notes, it is probably a subtle numerical problem, as the graph is so simple and regular. (One wonders if the number of collinearities is part of the problem.) If I had access to a Windows debugger, the problem could be nailed in a few hours. As a hack, I could try to extend the sizes. If I can, I'll also see if there is some debugging scaffolding I can add.
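To make the numerical-stability hypothesis concrete, here is a minimal sketch of a tolerance-based side-of-line test of the kind geometric code often relies on. It is not taken from lib/ortho: point, cross, side_of and the tolerance values are assumptions chosen for illustration. For collinear or nearly collinear input the cross product is close to zero, so a small difference in how a compiler or platform rounds intermediate results can move it across the tolerance and change the classification.

```c
#include <assert.h>

/* Hypothetical 2D point type for this illustration. */
typedef struct { double x, y; } point;

/* Cross product of (b - a) and (c - a):
 * positive if c is left of the directed line a->b, negative if right,
 * and zero if the three points are exactly collinear. */
static double cross(point a, point b, point c)
{
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

/* Classify c relative to a->b using an absolute tolerance eps.
 * Returns 1 (left), -1 (right) or 0 (treated as collinear). */
static int side_of(point a, point b, point c, double eps)
{
    double d = cross(a, b, c);
    double mag = d < 0.0 ? -d : d;

    assert(eps >= 0.0);
    if (mag <= eps)
        return 0;
    return d > 0.0 ? 1 : -1;
}
```

A point that lies 1e-9 above a horizontal segment of length 10 gives a cross product of about 1e-8. With a tolerance of 1e-7 it counts as collinear; with a tolerance of 0 it counts as lying to the left. Comparing such decisions between the Linux and Windows builds would be one way to test the hypothesis.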
Someone could offer Emden rdesktop access to a Windows machine with Visual Studio to debug this.
What about using a Virtual Machine with Windows and Visual Studio installed to debug the problem?
Visual Studio Community edition can be used for free. There is also an evaluation version available of Windows 10. There are even Virtual Machines with preinstalled Windows and Visual Studio available, e.g. as Vagrant boxes.
Unfortunately, I can't offer remote desktop access because our company guidelines don't allow this.
With the instructions for graphviz >= 2.41 in doc/winbuild.html, I managed to build graphviz in Visual Studio 2015 Professional as Debug / Win32.
With "dot.exe -Tdot -n -Gsplines=ortho -Kneato C:\BeSy\Temp\graphviz-neato-crash-reduced.txt", this is the stack trace of the crash (graphviz version):
gvplugin_neato_layout.dll!newtrap(trap_t tr) Line 83 C
gvplugin_neato_layout.dll!add_segment(int segnum, segment_t seg, trap_t tr, qnode_t qs) Line 591 + 0x9 bytes C
gvplugin_neato_layout.dll!construct_trapezoids(int nseg, segment_t seg, int permute, int ntraps, trap_t tr) Line 1073 + 0x30 bytes C
gvplugin_neato_layout.dll!partition(cell cells, int ncells, int nrects, boxf bb) Line 751 + 0x19 bytes C
gvplugin_neato_layout.dll!mkMaze(Agraph_s g, int doLbls) Line 491 + 0x29 bytes C
gvplugin_neato_layout.dll!orthoEdges(Agraph_s g, int doLbls) Line 1289 + 0xd bytes C
gvplugin_neato_layout.dll!_spline_edges(Agraph_s g, expand_t pmargin, int edgetype) Line 609 + 0xb bytes C
gvplugin_neato_layout.dll!splineEdges(Agraph_s g, int (Agraph_s , expand_t , int) edgefn, int edgetype) Line 737 + 0x11 bytes C
gvplugin_neato_layout.dll!spline_edges1(Agraph_s g, int edgetype) Line 750 + 0x12 bytes C
gvplugin_neato_layout.dll!spline_edges0(Agraph_s g, unsigned char set_aspect) Line 780 + 0xd bytes C
gvplugin_neato_layout.dll!init_nop(Agraph_s g, int adjust) Line 601 + 0xb bytes C
gvplugin_neato_layout.dll!neato_layout(Agraph_s g) Line 1421 + 0xb bytes C
gvc.dll!gvLayoutJobs(GVC_s gvc, Agraph_s g) Line 85 + 0xd bytes C
dot.exe!main(int argc, char * argv) Line 132 + 0x15 bytes C
dot.exe!invoke_main() Line 64 + 0x1b bytes C++
dot.exe!__scrt_common_main_seh() Line 253 + 0x5 bytes C++
dot.exe!__scrt_common_main() Line 296 C++
dot.exe!mainCRTStartup() Line 17 C++
@emden: If you can't use a Virtual Machine with Windows, I could use the Visual Studio debugger to check the values of certain variables if that helped. You would have to tell me where to look.
Could you comment out all but the first 4 nodes in your graph? Then, in lib/ortho/partition.c, uncomment the dumpTrap function and add a call to it, dumpTrap(trs, nt);, right after each of the two calls to nt = construct_trapezoids(nsegs, segs, permute, ntraps, trs);
Then run neato, and save the stderr output and post it. Thanks.
@emden:
I removed everything except the first 4 nodes from the original file (not the reduced one): graphviz-neato-4-nodes.txt
Here is the content of stderr from "dot.exe -Tdot -n -Gsplines=ortho -Kneato graphviz-neato-4-nodes.txt": graphviz-neato-4-nodes_stderr.txt
dot.exe didn't crash.
Is there something else I can do to help with fixing this bug (e.g. provide further debug information)?
Thank you for following up. As mentioned before, it seems there is no one in a position to work on this (who is familiar with the code and with the Windows VS development environment).
If you feel it’s important, then a way forward is to trace the Linux code that works, and the Windows code that fails, and understand why they diverge.
Or, maybe someone knows how to set up and fund a project like Google Summer of Code.
The commercial alternatives include Tom Sawyer Software and yWorks graph layout technology, which I believe have orthogonal layout and are reasonably priced.
Stephen North
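The side-by-side tracing suggested above could start by finding the first line at which two captured stderr traces differ. The following is a minimal sketch under stated assumptions: first_divergence is a hypothetical helper, not part of Graphviz, and it assumes both builds were instrumented to print comparable trace lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the 1-based number of the first line at which the two streams
 * differ, or 0 if they are identical. A stream that ends early counts as
 * a difference at that line. Lines longer than the buffer are compared
 * in chunks, so the reported number is approximate for such input. */
static long first_divergence(FILE *a, FILE *b)
{
    char la[4096], lb[4096];
    long line = 0;

    assert(a != NULL && b != NULL);
    for (;;) {
        char *ra = fgets(la, sizeof la, a);
        char *rb = fgets(lb, sizeof lb, b);
        line++;
        if (ra == NULL && rb == NULL)
            return 0;       /* both ended together: no difference */
        if (ra == NULL || rb == NULL)
            return line;    /* one trace is a strict prefix of the other */
        if (strcmp(la, lb) != 0)
            return line;    /* first differing line */
    }
}
```

A caller would open the two trace files with fopen and pass the resulting streams. A trace written on Windows may end its lines with CR LF, so when both files are compared on a Linux machine the line endings should be normalized first; otherwise every line is reported as different.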
I managed to get the same crash on Linux (Debian Stretch, 64 bit):
newtrap: Trapezoid-table overflow 3241
dot: trapezoid.c:88: newtrap: Assertion `0' failed.
File: http://fff.mifritscher.de/data-6.crash.dot
I'm using dot, and I'm using orthogonal edges as well.
Perhaps this finding helps with debugging, since the crash also happens for me under Linux.
Command line: dot -Tpdf /var/www/html/data-6.crash.dot >/var/www/html/graph-6.pdf
I get a similar error message with graphviz version 2.41.20190918.0833 when calling
dot.exe -Tdot -n -Gsplines=ortho -Kneato
with the following dot file: Neato_TrapezoidTableOverflow161_Simplified.txt
newtrap: Trapezoid-table overflow 161 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 161 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 161 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 161 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 161 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 161 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 161 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 161
Note I'm using subgraphs with rank="source" and rank="sink" to draw edges to the outer border of the graph. The error doesn't occur if the subgraph with rank="sink" appears before the subgraph with rank="source" in the input file.
Graphviz 2.47.3 is also affected by this crash. Output of stable_windows_10_msbuild_Debug_Win32_graphviz-2.47.3-win32.zip:
dot.exe -V
dot - graphviz version 2.47.3 (20210619.1520)
dot.exe -Tdot -n -Gsplines=ortho -Kneato < "C:\Temp\graphviz-neato-crash-reduced.txt"
newtrap: Trapezoid-table overflow 1881
Assertion failed: 0, file C:\GitLab-Runner\builds\graphviz\graphviz\lib\ortho\trapezoid.c, line 79
Output of stable_windows_10_msbuild_Release_Win32_graphviz-2.47.3-win32.zip:
dot.exe -V
dot - graphviz version 2.47.3 (20210619.1520)
dot.exe -Tdot -n -Gsplines=ortho -Kneato < "C:\Temp\graphviz-neato-crash-reduced.txt"
newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881 newnode: Query-table overflow newnode: Query-table overflow newtrap: Trapezoid-table overflow 1881
This bug apparently was fixed because I can't reproduce the crash with graphviz 7.0.6 (20230106.0513) anymore. Consequently, I close this bug report.
When executing neato.exe with certain input data, neato.exe crashes. I tried it on two systems, one being Windows 7 x64 and one being Windows 10 x64. On the Windows 7 system, the crashes seem to always occur. On the Windows 10 system, the crashes seem to occur in about 50% of the program calls (with identical input data).
I can reproduce the crashes with the following versions of neato.exe:
Unfortunately, there are no newer builds of graphviz / neato available for Windows.
Program arguments:
Program exit code (ERRORLEVEL):
Error messages which are output by neato.exe:
Input data:
graphviz-neato-crash.txt