Remove GraphKL from the API

snowleopard commented 7 years ago

GraphKL records are defined in AdjacencyMap and IntAdjacencyMap modules for interoperability with King-Launchbury graphs. Specifically, they provide a way to reuse the algorithms from Data.Graph.

While reviewing #18, I realised that GraphKL's getVertex function is partial, and I'd like to avoid having partial functions in the API where possible.

I propose to remove GraphKL, because there are not too many algorithms left in Data.Graph that we haven't yet wrapped in a clean interface: dfs (which is added in #18), bcc, plus a few reachability tests.

My plan is therefore to provide clean interfaces to all of these functions instead of exposing GraphKL.

Please shout if you are using GraphKL!

snowleopard commented 7 years ago

There is one performance aspect to consider: constructing GraphKL from AdjacencyMap takes time, so if the user would like to run multiple Data.Graph algorithms, it would be better to construct GraphKL once and then reuse it. If we drop GraphKL, we'll need to somehow cache it within AdjacencyMap.

UnkindPartition commented 7 years ago

If we drop GraphKL, we'll need to somehow cache it within AdjacencyMap.

Good point. Probably the easiest way is to have a lazy pair (AdjacencyMap, GraphKL) (or an equivalent data type) where one field is lazily constructed in terms of the other.

snowleopard commented 7 years ago

@feuerbach Is this what you had in mind?

https://github.com/snowleopard/alga/blob/master/src/Algebra/Graph/AdjacencyMap/Internal.hs#L90-L102

I hope I'll finish the refactoring by the end of this week and will release the new version.

UnkindPartition commented 7 years ago

Yes, it is what I had mind. I think you may want to make the adjacencyMap field strict.

snowleopard commented 7 years ago

I think you may want to make the adjacencyMap field strict.

Good point! I'll do this.

This brings me to the question I've been thinking about for some time (which probably deserves a separate issue, but I'll start here).

Another interesting implementation to consider is:

data AdjacencyMap a = Empty
                    | Vertex a
                    | Overlay (AdjacencyMap a) (AdjacencyMap a)
                    | Connect (AdjacencyMap a) (AdjacencyMap a)
                    | Cached a

data Cached a = Cached
    { expression   :: AdjacencyMap a   -- subexpression
    , adjacencyMap :: !(Map a (Set a)) -- cached adjacency map representation
    , graphKL      :: !(GraphKL a) }   -- cached Data.Graph representation

(Or something isomorphic).

Here we keep the expression tree lazy, which is cheap (unless it contains a lot of cached subtrees).

In this way, I think, we can make AdjacencyMap an instance of higher-kinded graphs with standard Functor and Monad instances, by keeping expressions Cache-free during most of the tree transformations.

When an expression needs to be "evaluated", we turn it into a Cached one, which will require the Ord a constraint, but at this point we don't care.

UnkindPartition commented 7 years ago

This reminds me of various attempts to turn Set into a Functor/Monad — there were a few papers on this topic you may want to check out.

snowleopard commented 7 years ago

Yes, indeed. I found a good write-up on this by Oleg Kiselyov: http://okmij.org/ftp/Haskell/set-monad.html. I'll see if anything can be adapted to our case.

It just occurred to me, that my above data type can be generalised as follows:

-- Generic graphs with tagged subgraphs
-- For example, TaggedGraph String a can be used to attach names to selected subgraphs of a graph
data TaggedGraph t a = Empty
                     | Vertex a
                     | Overlay (TaggedGraph t a) (TaggedGraph t a)
                     | Connect (TaggedGraph t a) (TaggedGraph t a)
                     | Subgraph t (TaggedGraph t a)

instance Functor (TaggedGraph t) where ...
instance Monad (TaggedGraph t) where ...

-- We can now define AdjacencyMap as a TaggedGraph
data Cache a = Cache !(Map a (Set a)) !(GraphKL a)
type AdjacencyMap a = TaggedGraph (Cache a) a

But wait -- TaggedGraph is exactly one of the data types I proposed for edge-labelled graphs in #17!

I think something interesting is going on here.

snowleopard commented 7 years ago

@feuerbach Note that I flipped the order of arguments to dfsForestFrom in the last commit -- the new one seems to be more natural/convenient. Let me know if you disagree.

UnkindPartition commented 7 years ago

It's fine, I don't feel strongly about it.

snowleopard commented 7 years ago

Released: http://hackage.haskell.org/package/algebraic-graphs-0.0.5.

snowleopard / alga

Remove GraphKL from the API #19