wting / python-graph

Automatically exported from code.google.com/p/python-graph
Other
5 stars 4 forks source link

Rewrite pydot #56

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Pull in the pydot code into python-graph and fix it for faster loading.

Note: This will remove the external dependency, and likely remove our need
for install-dot.

Original issue reported on code.google.com by christian.muise on 17 Nov 2009 at 11:28

GoogleCodeExporter commented 9 years ago
We'll still have pyparsing as a dependency.

Original comment by pmatiello on 17 Nov 2009 at 11:47

GoogleCodeExporter commented 9 years ago

Original comment by pmatiello on 15 Jan 2010 at 5:21

GoogleCodeExporter commented 9 years ago
Ok i will summarize what i understand(by a quick review of source) pydot does:
Somebody please validate(as i have never used it before): 
1. Verifies Graphviz is installed.
2. Defines its own graph class..
3. Reads/writes between this graph class and Dot file
4. Calls the Render executable from Graphviz and displays the graph..

Original comment by anand.ib...@gmail.com on 31 Jan 2010 at 12:54

GoogleCodeExporter commented 9 years ago
As far as I recall, ya. The use of pyparsing, and a fully defined grammar, make 
it 
quite suitable for all of the dot extras that you can use. The problem with the 
package 
is that 1. It's horribly inefficient for large graphs, and 2. It's no longer 
being 
maintained.

Original comment by christian.muise on 31 Jan 2010 at 5:12

GoogleCodeExporter commented 9 years ago
Christian++.

We should only worry about building the DOT stuff. We should not worry about 
graphviz
install or execution as 1. it can vary a lot (Windows, different Linux 
distributions,
etc) 2. someone might want to use an alternative program for drawing the DOT 
graphs.

Original comment by pmatiello on 31 Jan 2010 at 5:49

GoogleCodeExporter commented 9 years ago
Right..  I'll see what issues i noticed and what seems to need changing.(Am 
assuming
the only thing that needs to be done is Dot file to object conversion and 
vice-versa)
We need:
1.a class/object to represent the Dot file as a graph object. 
2.a couple of conversion functions to and fro between the object and the actual 
file.

1. we can use our graph class but in that case we'll have to add some 
attributes and
properties(to Common class) related to layout that Dot 
supports(Clusters,subgraphs
etc.) Or may be we can generalize those stuff as a hyper graph.. (guess that 
needs a
case to case basis discussion, as it is an otherwise bad idea to just add 
attributes
as supported by a format.) 
2. Here we will just have to tweak and optimize the functions currently 
provided by
pydot??

Original comment by anand.ib...@gmail.com on 1 Feb 2010 at 12:06

GoogleCodeExporter commented 9 years ago
Basically on the right track. Some points:

- The pydot code passes all Graphiz tests -- it would be ideal to pull in the 
breadth 
of DOT language handling that pydot currently has.

- The graph representation we write to should certainly be our graph classes. 
As you 
say, we'll need to deal with each 'special' graph type individually. I don't, 
however, think that hypergraphs are a solution to this. What we could likely do 
is 
use Graph objects as nodes in an abstract graph: each cluster is a Graph 
object, and 
also a node in the high-level graph (since nodes can be arbitrary objects). 
This is 
probably what the user would expect when reading / writing a dot file / Graph 
with 
clusters.

- We should pull as much of pydot over as we can, stripping out the unneeded 
(ie. 
Graphiz) stuff. The project is mostly dead (as far as development goes), but 
very 
much still used (as far as the new issues on Google Code goes).

---------

  I'm quite busy with school stuff these days, so I would be fine re-assigning this 
to Anand. Pedro -- you'll need to add him as a committer if this is the case. 
Alternatively, Anand can do up a patch and I'll review / fixup / commit on his 
behalf.

  Cheers

Original comment by christian.muise on 2 Feb 2010 at 12:06

GoogleCodeExporter commented 9 years ago
If you are interested, Anand, do it a send a patch to Christian (and CC me, 
please).

Right now, just a Python3 port is good enough. Grab pydot source from their 
site and
make it work on Python3. The less the code is changed, the better.

Don't worry about making it faster yet. We can work on that later.

Original comment by pmatiello on 2 Feb 2010 at 5:49

GoogleCodeExporter commented 9 years ago
1.Graphiz tests?? Are they a suite of tests that come along with graphviz 
package.
Googling is empty.. Will check up the official page on that...Also,does this 
mean we
shouldn't bother writing unittests??
2.Abstract graph?? you mean we create a new class which summarizes the dot file
parameters layout (which is relevant to the graph drawing layout issue
http://code.google.com/p/python-graph/issues/detail?id=63) or is there already 
an
abstract graph class?? (Actually that raises the question about this whole issue
being out of line with the project goals, but since it is accepted, i presume 
this is
an exception case and has been discussed.)

3. In that case we should write atleast a performance benchmark tests..
@Pedro: Yep, I can take up this one. (I am new to svn, but have worked with a 
CVS
repository before.)

Original comment by anand.ib...@gmail.com on 2 Feb 2010 at 9:49

GoogleCodeExporter commented 9 years ago
- These are the regression tests for graphviz:
http://www.graphviz.org/pub/scm/graphviz2/rtest/

- Unit tests are still helpful. Especially since we have a graph equivalence 
now (G
== toGraph(toDot(G)) type of tests)

- By abstract graph, I meant conceptually -- not implementation wise. For 
example
(note: not checked for correctness):

> # Create the first cluster
> cluster1 = graph()
> cluster1.add_nodes([1,2,3])
> cluster1.add_edge((1,2))
> cluster1.add_edge((2,3))
>
> # Create another cluster
> cluster2 = graph()
> cluster2.add_nodes([4,5])
> cluster2.add_edge((4,5))
>
> # Connect the clusters in an abstract graph
> G = graph()
> G.add_nodes([cluster1, cluster2])
> G.add_edge((cluster1, cluster2))

- Nothing with layout should be considered. Graph drawing is a separate (hard)
problem that (for now) is not going to be part of the pygraph functionality.

- Performance testing can be done via a small utility since it's not really a 
unit
test. It can probably go in the tests directory though (Pedro -- let us know if
that'll mess up the unittest runs).

@anand: So the approach for this will be making the changes and submitting a 
patch
(attachment to this issue). If you need help figuring this out with svn just 
let me know.

  Cheers

Original comment by christian.muise on 2 Feb 2010 at 2:38

GoogleCodeExporter commented 9 years ago
@Pedro: Yep i have begun using the 2to3 converter program.. currently trying to
figure out declaring namespaces. Will send as a patch file after am done..

@Christian:
--Graphiz ...That's whole lot of files out there.. there is a shell script for
running the tests... Will look a little deeper ..

--Abstract graph: Hmm.. i believe in pydot the cluster is relevant
concept(http://www.graphviz.org/doc/info/lang.html) only for drawing. So my 
question
was kinda moot considering i was alreading raising the issue.

@Christian: Thanks.. will bug you once am ready... last time i did a svn diff 
there
were space indentation issues... (believe that was to do with the editor i was 
using
)Anyway.. 

Original comment by anand.ib...@gmail.com on 3 Feb 2010 at 2:48

GoogleCodeExporter commented 9 years ago
@Pedro: I am sorry.  Looks like, i probably can't make it in time. I joined a 
startup
which have a prototype being released in 2 months time.. Haven't done much 
changes.
@all: If anyone else is willing they can take it up, mail me if you think i 
could help.

Original comment by anand.ib...@gmail.com on 3 Mar 2010 at 1:27

GoogleCodeExporter commented 9 years ago
No problem. Let's wait for Christian then. Good luck with the new job.

Original comment by pmatiello on 3 Mar 2010 at 2:00

GoogleCodeExporter commented 9 years ago

Original comment by pmatiello on 20 Mar 2010 at 10:22

GoogleCodeExporter commented 9 years ago
the speed is probably more pyparsing's fault, see
http://www.dalkescientific.com/writings/diary/archive/2007/11/03/antlr_java.html

igraph is written in C http://igraph.sourceforge.net/index.html with a python
interface so should be much quicker, you might find that useful

Original comment by infinity0x@gmail.com on 21 Mar 2010 at 12:36

GoogleCodeExporter commented 9 years ago
@infinity0x As far as I can recall, the issue was with graphs with a large 
number of 
nodes since the internals of pydot used an adjacency matrix. Using all of that 
memory 
up front was bring the system to it's knees...but it's been a while since I've 
taken a 
look at it.

  Thanks for the tip on igraph -- may be a great solution rather than re-working pydot 
for our needs...

Original comment by christian.muise on 21 Mar 2010 at 12:43