wting / python-graph

Automatically exported from code.google.com/p/python-graph
Other
5 stars 3 forks source link

pygraph.algorithms.generators functions should allow a start graph to be passed in as an argument #57

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The current functions generate and generate_hypergraph assume that the user
wants a graph to be created and then have random nodes & edges attached,
however the functions make no provision for adding nodes and edges to a
pre-existing graph. 

The motivation for this is that these functions assume that only three
implementations of graphs exist (graph, digraph & hypergraph), however we
will need to be able to create all kinds of random graphs - not just these
three initial types. 

I propose that the functions be changed:

def generate(num_nodes, num_edges, directed=False, weight_range=(1, 1),
random_graph=None )

becomes:

def generate(G, num_nodes, num_edges, weight_range=(1, 1), random_graph=None ):

Graph G (in this case a digraph or graph) must exist already. The function
will return no value. 

Likewise

def generate_hypergraph(num_nodes, num_edges, r = 0)

becomes:

def generate_hypergraph(G, num_nodes, num_edges, r = 0):

Original issue reported on code.google.com by salimfadhley@gmail.com on 18 Nov 2009 at 12:04

GoogleCodeExporter commented 9 years ago
What's the motivation for this? It would seem to me that a generator should do 
just
that -- create a graph. If there is a need for shuffling edges around, or 
randomizing
an existing graph, that sounds like a new library to me.

Original comment by christian.muise on 18 Nov 2009 at 3:14

GoogleCodeExporter commented 9 years ago
As I stated before, the motivation is that there are many kinds of graphs we 
might
conceive of. The python-graph package is intended as a framework for all kinds 
of
graph related activities rather than simply a collection of three graph classes.

Supposing I were to create a new kind of graph, I'd like to be able to use the
generate behavior to make random nodes & edges however I do not want to have to 
hack
this method each time I make a new graph implementation. 

Last night I was testing a new version of digraph against randomly generated 
graphs.
The functions as they currently stand do not permit a sensible way of generating
anything other than regular graph, digraph and hypergraphs. 

Original comment by salimfadhley@gmail.com on 18 Nov 2009 at 3:47

GoogleCodeExporter commented 9 years ago
There's two things at play here: new implementations of a graph type that we 
handle,
and a new graph type altogether. For new graph types (tree, dag, etc), then a
completely new generator function is the only reasonable approach. If it's a new
implementation, then the check for digraph should still work as it is.

The algorithms in core should only use the interface, so if it works for one 
digraph
it should work for all. I see your point with the fact that we can't choose 
which
implementation to use, but I don't think passing in the graph object is the best
option. I think a more universal approach is to have the ability to change the
default digraph / graph / hypergraph implementation.

Original comment by christian.muise on 18 Nov 2009 at 4:10

GoogleCodeExporter commented 9 years ago
What about make generator receive a class reference to instantiate?

Receiving a graph object would make it a random modificator, not a random 
generator,
I guess. It would also avoid problems with the generator trying to add nodes 
that
already belong to the object, etc.

Original comment by pmatiello on 18 Nov 2009 at 8:41

GoogleCodeExporter commented 9 years ago
I think Salim's hope (correct me if I'm wrong) was to avoid having to check for 
every
new class of graph in the algorithms. Passing in the class reference wouldn't 
avoid this.

  The only clearcut way is some sort of abstract class hierarchy. May be worth a
thought...

Original comment by christian.muise on 18 Nov 2009 at 9:26

GoogleCodeExporter commented 9 years ago
Yes, but the class reference would not be checked against anything. It would be
instatiated. Say:

def generate(clazz):
  gr = clazz()
  ... (continue as usual)

Original comment by pmatiello on 18 Nov 2009 at 10:02

GoogleCodeExporter commented 9 years ago
We'd still need to check it for the generation process. We build a graph 
differently
than a hypergraph differently than a digraph, etc. But we should build a 
digraph the
same way regardless of what's under the hood.

Original comment by christian.muise on 18 Nov 2009 at 10:10

GoogleCodeExporter commented 9 years ago
"I think Salim's hope (correct me if I'm wrong) was to avoid having to check 
for every
new class of graph in the algorithms. Passing in the class reference wouldn't 
avoid
this."

Yes, but that assumes that every kind of graph will have a trivial default
constructor. Right now I'm working on a new implementation of a graph which 
stores
it's data in an external database. I wouldnt want the syntax of the function to 
have
to cope with every single graph construction variant that we could ever 
anticipate -
it's just too much to expect.

All I wanted to do was de-couple the buisness of creating new instances of
graph-classes ( which is specific to the graph implementation ) with the other
business of adding random nodes & edges ( which clearly applies to all kinds of 
graph
and digraph regardless of their implementation ). 

Original comment by salimfadhley@gmail.com on 18 Nov 2009 at 10:25

GoogleCodeExporter commented 9 years ago
@Christian: The same issue applies to receiving the graph object as an 
argument. We
could still check for gr.DIRECTED or gr.HYPERGRAPH. Then the function could 
work for
any implementation of graphs, digraphs and hypergraphs.

@Salim: We can still do something like:
def generate(clazz, *args)
  gr = clazz(*args)

This should be enough for most possible implementations.

Original comment by pmatiello on 18 Nov 2009 at 10:40

GoogleCodeExporter commented 9 years ago
@pedro:
 @Salim: We can still do something like:
  def generate(clazz, *args)
  gr = clazz(*args)

This would still expect the graph clazz to implement a constructor. I believe 
salim
doesn't want to have to do that everytime he experiments with a new class
implementation..(am i right salim??)

@ salim:

"All I wanted to do was de-couple the buisness of creating new instances of
graph-classes ( which is specific to the graph implementation ) with the other
business of adding random nodes & edges ( which clearly applies to all kinds of 
graph
and digraph regardless of their implementation )"

May be we need a randomizer function on a given graph?? 
 i don't see how you can avoid writing a constructor to your graph implementation class

@christian:
"The only clearcut way is some sort of abstract class hierarchy. May be worth a
thought..." 

yep that seems to be the need of the hour...... considering all the 
experimenting
done with different implementations ..

May for starters we can just have the graph types (graph,digraph, hypergraph, 
etc..)
and graph implementation classes ??
Graph types can instantiate the implementations they prefer..

Original comment by anand.ib...@gmail.com on 18 Nov 2009 at 11:10

GoogleCodeExporter commented 9 years ago
So if the issue is a constructor, then we could use factory design.
generate_{graph,digraph,hypergraph} would (as part of its code) call the factory
function for the graph it needs. If the factory is configurable, then we could 
set it
up to instantiate any implementation of our choice, with the settings / 
parameters of
our choice. It keeps the generators clean.

"All I wanted to do was de-couple the buisness of creating new instances of
graph-classes ( which is specific to the graph implementation ) with the other
business of adding random nodes & edges ( which clearly applies to all kinds of 
graph
and digraph regardless of their implementation )."

  Unless I'm misreading, the problem here is that random graph generation is very
(graph type) specific. Unless we force the graphs to have a
return-potential-edge-you-don't-currently-have function, we need to deal with 
each
type separately. (eg. you don't want cycles introduced in the tree graph type if
you're just connecting random nodes).

Original comment by christian.muise on 19 Nov 2009 at 12:12

GoogleCodeExporter commented 9 years ago
def generate(clazz, *args)
  gr = clazz(*args)

Yes, we could do this - the generate function already has quite a few arguments 
so it
would end up being a messy looking function. That's something I want to avoid. 

That's why I originally suggested that we simply de-couple the business of
constructing objects from the other business of generating random edges. 

The use-case is in testing / benchmarking where we may want to generate random 
edges
on multiple implementations of digraph (for example) - in this case it's useful 
to
have a function that does the random edge generation. Invoking a constructor 
for each
of the types I want to test is trivial.

Original comment by salimfadhley@gmail.com on 19 Nov 2009 at 12:20

GoogleCodeExporter commented 9 years ago
The type-specificity (specificity?) is surely an issue, so I think you are 
right:
different functions for each category are probably the way to go.

About the factory thing, could you clarify with an example? I'm still thinking 
about
passing a class reference, and optional constructor parameters, as arguments.

Original comment by pmatiello on 19 Nov 2009 at 12:20

GoogleCodeExporter commented 9 years ago
For an example, consider the following case: I want to use the database-digraph
implementation for a random digraph. Assume that the constructor takes an 
address and
a port argument.

----

from pygraph import graphfactory as gf
from pygraph.algorithms.generators import generate_digraph as gd

gf.set_default_digraph('db-digraph')
gf.digraph_settings['address'] = "address.to.my.db"
gf.digraph_settings['port'] = 1234

myGraph = gd(num_nodes = N, num_edges = E)

# Now change it to normal digraph
gf.set_default_digraph('digraph')
myOtherGraph = gd(num_nodes = N, num_edges = E)

# ...and if we wanted just an empty graph
myEmptyGraph = gf.create_digraph()

----

  That would be the general idea. The arguments are set (globally) in the factory,
and any part of the code can call for a new graph to be generated. You could 
even do
fancier things like passing in a lambda for an argument value which would make 
it
evaluated whenever a new graph is generated.

Original comment by christian.muise on 19 Nov 2009 at 1:47

GoogleCodeExporter commented 9 years ago
I don't like it. It's both non-intuitive and thread-unsafe. I'd like to avoid 
this
configuration stuff too.

I still think that something like this is better:
generate_digraph(some_digraph_implementation, other arguments...)

Original comment by pmatiello on 19 Nov 2009 at 2:10

GoogleCodeExporter commented 9 years ago
Fair 'nough. Are we going to have any 2.4-2.5-2.6-3.0 compatibility issues with 
the
handling of the arguments in a bundled way?

Original comment by christian.muise on 19 Nov 2009 at 2:14

GoogleCodeExporter commented 9 years ago
We're 2.6+ now, so I don't think so. I guess. Salim?

Original comment by pmatiello on 19 Nov 2009 at 2:33

GoogleCodeExporter commented 9 years ago

Original comment by pmatiello on 15 Jan 2010 at 5:22

GoogleCodeExporter commented 9 years ago

Original comment by pmatiello on 20 Mar 2010 at 10:22