snowleopard / alga

Algebraic graphs
MIT License
719 stars 68 forks source link

Optional methods for Graph type class #4

Closed snowleopard closed 7 years ago

snowleopard commented 7 years ago

The current definition of the Graph type class is very minimalistic:

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

I like the minimalism, because it makes the library very simple.

However, for the sake of efficiency, it may useful to add optional methods to the type class, such as:

removeVertex :: Eq (Vertex g) => Vertex g -> g -> g

We already have an implementation (see Algebra.Graph.Util), which we could try to make the default (although it seems tricky). It works in O(n) time, where n is the size of the algebraic expression, and some instances could provide faster implementations by overriding the removeVertex method.

What are the pros and cons?

Good: the user of the library doesn't have to think when calling removeVertex -- the fastest available implementation will be called, depending on the specific instance.

Bad: reasoning about performance may become harder. At the moment we know that removeVertex runs in O(n) time. If it becomes a class method, polymorphic code will have unpredictable performance characteristics.

Bad: the type class becomes heavier and more difficult to understand. When do we stop? There are many graph transformation primitives -- do we have to add them all to the type class?

Bad: what if a graph instance provides more than one implementation of removeVertex with different characteristics? Which one do we choose?

A possible alternative: use additional type classes to specify performance constraints. For example:

class Graph g => GraphWithFastRemoveVertex g where fastRemoveVertex = ...

-- UGr is a graph data structure defined in the fgl library.
-- It provides an efficient delNode realisation.
instance GraphWithFastRemoveVertex UGr where fastRemoveVertex = delNode

Now you can demand graph data structures with particular performance characteristics via the type signature.

/cc @ndmitchell

ndmitchell commented 7 years ago

The GraphWithFastRemoveVertex seems a bit grim - you end up with N classes. And note that removeVertex isn't O(n) - it's only that if you use something with constant time construction for all operations, and only works if you start with a GraphMonad.

I think there is potential value in putting a small number of carefully chosen operations in the class, which can be implemented 100% generically, but for certain classes of graph are vastly more efficient. The other option is to say they are only implemented in a few choice data types, typically imported qualified, and people are expected to convert.

snowleopard commented 7 years ago

The GraphWithFastRemoveVertex seems a bit grim - you end up with N classes. I think there is potential value in putting a small number of carefully chosen operations in the class

@ndmitchell Shall we list the candidates here? I'm starting to think that there are not that many of them. The removeVertex operation does seem to make sense for all inhabitants of the Graph type class, because it corresponds to the vertex method. But removeEdge on the other hand is already problematic, because some instances, like ToList would find it awkward to have it, even though they can implement it very efficiently as a no-op. Note, there is no explicit notion of edges in the type class itself.

Is there any other candidate to bring N above 1? :)

P.S.: For various newtype's like GraphFunctor the requirement to also handle removeVertex is actually a burden. Presumably, this will become a common theme: the larger the core, the more often one needs to handle cases that may be irrelevant for the task at hand.

ndmitchell commented 7 years ago

If the type class changes as I suggest in #5 then most of these objections change in nature - so I suggest we resolve that before coming back to this.

snowleopard commented 7 years ago

After a lot of refactoring, we ended up with the API where each Graph instance defines a bunch of custom functions, such as isEmpty, hasVertex, vertices, removeVertex, and many others, which all have different complexity characteristics.

See: http://hackage.haskell.org/package/algebraic-graphs/docs/Algebra-Graph.html

Making all of them part of the core type class seems to be an overkill.

I guess we can close this issue for now. If there is a proposal to create a more feature-rich type class, with complete graph manipulation API (say, as defined in Algebra.Graph), we can create a separate issue.