Figure out a sane creation/factory pattern

sdboyer commented 10 years ago

i still have yet to figure out a nice, consistent way to build the public interfaces for graph creation that doesn't involve exposing just an absurd number of functions. there are already 8 type permutations ({directed,undirected}x{basic,weighted,labeled,data}), and that'll be 16 once the {mutable,immutable} axis is implemented.

my general thinking thus far is that the main API should offer two basic entry points: creating via a named identifier (for those who know what they want), and creating via a set of capabilities (for those who don't, or just don't want to think hard).

OTOH, having individual New*() methods for each graph has some significant godoc benefits:

if they return an interface and the underlying graph struct is undexported, then all the different New*() functions are grouped under the interface. and the doc on the interface then better implicitly applies to its implementations.
it gives a single place in which to document, for that graph implementation in particular, any of the gotchas, performance realities, etc., that may be pertinent over and above what is specified by the interface.

this is a prerequisite to a lot of nicer things.

sdboyer commented 10 years ago

pinging @technosophos and @mattfarina for their input on this

mattfarina commented 10 years ago

Reading this, without reading the other/linked issues yet, I like the idea of New*() functions. Having a top level place to create the graphs and have them documented would be useful.

An alternate pattern would be to have one New function that takes a name. The problem here would be in creating sane documentation for each case that's in a useful place for end users.

You have 16 permutations right now that would lead to 16 New*() functions, right? How will this 16 number grow over time? I'm curious about the scalability of this interface.

A consideration here is when the type of graph is chosen and then interacting with the graph. I'm not intimately familiar with this codebase or how I would use is so this might be a poor questions but, how often will users of the library choose between different types of graphs? I'm curious how this would apply when using a named parameter vs a different function and how that would play out in the code flow.

sdboyer commented 10 years ago

Reading this, without reading the other/linked issues yet, I like the idea of New*() functions. Having a top level place to create the graphs and have them documented would be useful.

yeah, i think it handles the basic case well, and it's a "good enough" approach atm. but i don't think it scales really well.

An alternate pattern would be to have one New function that takes a name. The problem here would be in creating sane documentation for each case that's in a useful place for end users.

righto. and also that there is potentially some variation in the parameters required by different graph types. at the very least, immutables will probably need to have a graph or graph-like thing passed to them from which they create themselves. that's the only parameter variation i've thought of thus far, though, and there's no problem with having mutables also take a graph that they populate themselves from.

You have 16 permutations right now that would lead to 16 New*() functions, right? How will this 16 number grow over time? I'm curious about the scalability of this interface.

a lot more. the multigraph vs. simple graph distinction adds at least a 2-item, possibly a 3-item axis. and then there's the currently 1-item axis of implementation (adjacency list). here's a breakdown of the potential axes i'm currently thinking may be worth implementing. currently implemented are italicized.

Multiplicity: simple graph, multigraph, pseudograph
Edge directionality: undirected, directed
Edge type: basic, weighted, labeled, arbitrary data (permutations could also arise from implementing multiple of these together)
Mutability: mutable, immutable
Representation algorithm: adjacency list, adjacency matrix, edge list...others? cf. BGL

some properties on some axes preclude properties on other axes, but yeah.

sdboyer commented 10 years ago

A consideration here is when the type of graph is chosen and then interacting with the graph. I'm not intimately familiar with this codebase or how I would use is so this might be a poor questions but, how often will users of the library choose between different types of graphs? I'm curious how this would apply when using a named parameter vs a different function and how that would play out in the code flow.

yeah, this is really the most important thing to think about. i currently see two types of users:

"I need some graph for this basic case"
"I need a particular graph implementation with specific performance characteristics"

for the former case, the axes that generally matter are primarily edge directionality, secondarily edge type, and maybe multiplicity. in the latter case, where performance (and possibly scalability) are a chief concern, mutability and the representation algorithm matter.

it's entirely possible that people who start in the first camp may eventually move into the latter camp. this is a lot of why i want to keep the interfaces consistent - swap out an implementation (or even pick a distributed one) but keep your code generally the same.

i generally expect the use pattern to be "create a graph once, then apply lots of transforms to it." so this entry point won't be one that's used often. the real challenge for a lot of people, i think, is understanding how their use case maps to gogl's concepts. so my chief goal in designing the entry point is to make it clear that it is THE entry point, and then enumerate the choices people have within it. from an information-organization perspective, i think this should will enormously in letting folks hone in on what exactly they need from gogl, and be sure that they're starting from the right place.

sdboyer commented 10 years ago

@technosophos suggested using lann/builder. yesssss. gonna do that.

for a sec i thought it would let me magically dodge around type parametricity...yeah, that's not possible. but it'll let me have a set of New*() functions which return exported structs (for case 2, above), and still provide a meta-builder interface that returns interfaces and makes it all easy to use (for case 1).

mattfarina commented 10 years ago

That's an excellent recommendation. How did I not know about Iann/builder before. Yayz!

sdboyer commented 10 years ago

i'm now more or less happy with this structure - considering this done

sdboyer / gogl

Figure out a sane creation/factory pattern #1