xflr6 / graphviz

Simple Python interface for Graphviz
https://graphviz.readthedocs.io
MIT License
1.61k stars 208 forks source link

Add support for literal colons in node name (WAS: escaping of `::`) #53

Open thomas-riccardi opened 6 years ago

thomas-riccardi commented 6 years ago
from graphviz import Digraph

print("Bug 1: strange generated DOT source")
dot = Digraph(filename='bug-escaping-1.dot', format='png')

n1 = "A"
n2 = "::"
dot.node(n1)
dot.node(n2)
dot.edge(n1, n2)

print(dot.source)
dot.render()

print("Bug 2: crash")
dot = Digraph(filename='bug-escaping-2.dot', format='png')

n1 = "A"
n2 = "'B::C'"
dot.node(n1)
dot.node(n2)
dot.edge(n1, n2)

print(dot.source)
dot.render()

Result:

$ python3 bug-escaping.py 
Bug 1: strange generated DOT source
digraph {
        A
        "::"
        A -> "":""
}
Bug 2: crash
digraph {
        A
        "'B::C'"
        A -> "'B":"":C'
}
Error: bug-escaping-2.dot: syntax error in line 4 near '''
Traceback (most recent call last):
  File "bug-escaping.py", line 26, in <module>
    dot.render()
  File "/home/thomas/.local/lib/python3.5/site-packages/graphviz/files.py", line 176, in render
    rendered = backend.render(self._engine, self._format, filepath)
  File "/home/thomas/.local/lib/python3.5/site-packages/graphviz/backend.py", line 124, in render
    subprocess.check_call(args, startupinfo=STARTUPINFO, stderr=stderr)
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['dot', '-Tpng', '-O', 'bug-escaping-2.dot']' returned non-zero exit status 1

Info:

xflr6 commented 6 years ago

Thanks for the detailed report. The edge method currently accepts :-separated strings as node_ids (DOT language), currently splitting them naively into node, port, and compass: https://github.com/xflr6/graphviz/blob/5eaf0b7c8b28c6e8f260175ecc3d1eecad92dc98/graphviz/lang.py#L55-L74 Unforunately, fixing the 'colons in node identifiers' use case might require an API change, so will need to check how to best do this in a backwards-compatible way.

In the mean time, you can use node names without literal colons (putting them into the label instead):

In [1]: from graphviz import Digraph
   ...: 
   ...: d = Digraph()
   ...: d.node('A')
   ...: d.node('B', label='::')
   ...: d.edge('A', 'B')
   ...: 
   ...: d
Out[1]:

d

In [2]: d2 = Digraph()
   ...: d2.node('A')
   ...: d2.node('B', label="'B::C'")
   ...: d2.edge('A', 'B')
   ...: 
   ...: d2
Out[2]: 

d2

thomas-riccardi commented 6 years ago

Thanks @xflr6 for the workaround, I'll use that in the meantime.

mgoral commented 4 years ago

I just encountered this issue when creating a graph with some C++ modifiers in it (:: is used as a namespace separator in C++). I'm not very familiar with DOT language so the way how graphviz treated colons was very surprising to me. I'm fine with pre-processing nodes though and it's not very hard to e.g. assign a number to each unique node identifier.

I think that situation would be better if special meaning of colons (and possibly other characters) was documented better, especially that Python library hides the details of DOT language behind a nice and clean API.

xflr6 commented 4 years ago

Thanks. I agree: let's try to better document the current behaviour (maybe you want to help with this?).

mgoral commented 4 years ago

I'll try my best, but as I said I'm not very familiar with DOT language nor with implementation of your library, so I don't want to promise anything. :)

bayersglassey-zesty commented 2 years ago

It might be nice if port and compass were kwargs, as opposed to things parsed out of the name; or even if name was allowed to be any of: 'id', ('id,), ('id', 'port'), or ('id', 'port', 'compass').

(Bringing this suggestion into this issue from here: https://github.com/xflr6/graphviz/issues/168#issuecomment-1100677538)

bayersglassey-zesty commented 2 years ago

Having looked at the code a bit and thought about the different ways node() and edge() do their quoting, I actually think a NodeID class would work quite well, instead of tuples. It's a bit more explicit, and allows you to pass "fully-qualified" node IDs (including port + compass) to node(), which wasn't previously possible. (And is also totally useless, because node statements ignore ports and compasses. But it's nice to be consistent.) Here's a draft PR for that: https://github.com/xflr6/graphviz/pull/169