Closed: snowleopard closed this issue 7 years ago
There is one performance aspect to consider: constructing GraphKL from AdjacencyMap takes time, so if the user would like to run multiple Data.Graph algorithms, it would be better to construct GraphKL once and then reuse it. If we drop GraphKL, we'll need to somehow cache it within AdjacencyMap.
> If we drop GraphKL, we'll need to somehow cache it within AdjacencyMap.
Good point. Probably the easiest way is to have a lazy pair (AdjacencyMap, GraphKL) (or an equivalent data type) where one field is lazily constructed in terms of the other.
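A minimal sketch of the lazy-pair idea, assuming the containers package. The GraphKL record below is a simplified stand-in (the field names toGraphKL and fromVertex are hypothetical), mkAM is a hypothetical smart constructor, and every edge target is assumed to also be a key of the map.

```haskell
import qualified Data.Graph as KL
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set

-- King-Launchbury view: the array-based graph plus a vertex lookup.
-- (Simplified stand-in; field names are assumptions.)
data GraphKL a = GraphKL
    { toGraphKL  :: KL.Graph
    , fromVertex :: KL.Vertex -> a }

-- The map is the source of truth; graphKL stays an unevaluated thunk
-- until some algorithm demands it, and is then shared by later calls.
data AdjacencyMap a = AM
    { adjacencyMap :: Map a (Set a)
    , graphKL      :: GraphKL a }

-- Hypothetical smart constructor: the only place where graphKL is defined.
mkAM :: Ord a => Map a (Set a) -> AdjacencyMap a
mkAM m = AM m kl
  where
    (g, nodeFromVertex, _) =
        KL.graphFromEdges [ ((), v, Set.toAscList us) | (v, us) <- Map.toAscList m ]
    kl = GraphKL g (\i -> case nodeFromVertex i of (_, v, _) -> v)

-- Two Data.Graph algorithms that reuse the same cached GraphKL.
topSort :: AdjacencyMap a -> [a]
topSort am = map (fromVertex kl) (KL.topSort (toGraphKL kl))
  where kl = graphKL am

sccCount :: AdjacencyMap a -> Int
sccCount am = length (KL.scc (toGraphKL (graphKL am)))
```

Calling topSort and then sccCount on the same value forces the conversion once; a value on which no Data.Graph algorithm is ever run never pays for it.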
@feuerbach Is this what you had in mind?
https://github.com/snowleopard/alga/blob/master/src/Algebra/Graph/AdjacencyMap/Internal.hs#L90-L102
I hope I'll finish the refactoring by the end of this week and will release the new version.
Yes, it is what I had in mind. I think you may want to make the adjacencyMap field strict.
> I think you may want to make the adjacencyMap field strict.
Good point! I'll do this.
This brings me to the question I've been thinking about for some time (which probably deserves a separate issue, but I'll start here).
Another interesting implementation to consider is:
data AdjacencyMap a = Empty
                    | Vertex a
                    | Overlay (AdjacencyMap a) (AdjacencyMap a)
                    | Connect (AdjacencyMap a) (AdjacencyMap a)
                    | Cached (Cache a)
data Cache a = Cache
    { expression   :: AdjacencyMap a      -- subexpression
    , adjacencyMap :: !(Map a (Set a))    -- cached adjacency map representation
    , graphKL      :: !(GraphKL a) }      -- cached Data.Graph representation
(Or something isomorphic).
Here we keep the expression tree lazy, which is cheap (unless it contains a lot of cached subtrees).
In this way, I think, we can make AdjacencyMap an instance of higher-kinded graphs with standard Functor and Monad instances, by keeping expressions Cache-free during most of the tree transformations.
When an expression needs to be "evaluated", we turn it into a Cached one, which will require the Ord a constraint, but at this point we don't care.
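To illustrate the split between cache-free transformation and evaluation, here is a small sketch assuming the containers package. The names Expr and evaluate are hypothetical, and the Cached constructor is left out to keep the example short: fmap needs no constraint, and only evaluate asks for Ord a.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set

-- Cache-free graph expression (hypothetical name); no Ord constraint anywhere.
data Expr a = Empty
            | Vertex a
            | Overlay (Expr a) (Expr a)
            | Connect (Expr a) (Expr a)

-- Relabelling vertices is a plain tree traversal.
instance Functor Expr where
    fmap _ Empty         = Empty
    fmap f (Vertex x)    = Vertex (f x)
    fmap f (Overlay x y) = Overlay (fmap f x) (fmap f y)
    fmap f (Connect x y) = Connect (fmap f x) (fmap f y)

-- "Evaluation" into an adjacency map is the only step that needs Ord a.
evaluate :: Ord a => Expr a -> Map a (Set a)
evaluate Empty         = Map.empty
evaluate (Vertex x)    = Map.singleton x Set.empty
evaluate (Overlay x y) = Map.unionWith Set.union (evaluate x) (evaluate y)
evaluate (Connect x y) =
    -- Both operands, plus an edge from every vertex of x to every vertex of y.
    Map.unionsWith Set.union
        [ex, ey, Map.fromSet (const (Map.keysSet ey)) (Map.keysSet ex)]
  where
    ex = evaluate x
    ey = evaluate y
```

For example, mapping (`mod` 2) over Connect (Vertex 1) (Overlay (Vertex 2) (Vertex 3)) merges vertices 1 and 3 only once evaluate is called; the intermediate expression simply contains the label 1 twice.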
This reminds me of various attempts to turn Set into a Functor/Monad — there were a few papers on this topic you may want to check out.
Yes, indeed. I found a good write-up on this by Oleg Kiselyov: http://okmij.org/ftp/Haskell/set-monad.html. I'll see if anything can be adapted to our case.
It just occurred to me that my data type above can be generalised as follows:
-- Generic graphs with tagged subgraphs
-- For example, TaggedGraph String a can be used to attach names to selected subgraphs of a graph
data TaggedGraph t a = Empty
                     | Vertex a
                     | Overlay (TaggedGraph t a) (TaggedGraph t a)
                     | Connect (TaggedGraph t a) (TaggedGraph t a)
                     | Subgraph t (TaggedGraph t a)
instance Functor (TaggedGraph t) where ...
instance Monad (TaggedGraph t) where ...
-- We can now define AdjacencyMap as a TaggedGraph
data Cache a = Cache !(Map a (Set a)) !(GraphKL a)
type AdjacencyMap a = TaggedGraph (Cache a) a
But wait -- TaggedGraph is exactly one of the data types I proposed for edge-labelled graphs in #17!
I think something interesting is going on here.
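One way the elided Functor and Monad instances for TaggedGraph could be filled in is sketched below. This is an assumption, not necessarily what was intended: bind substitutes a graph for each vertex and leaves tags untouched. The helpers vertexList and tagList are hypothetical and exist only to inspect the result.

```haskell
import Control.Monad (ap)

data TaggedGraph t a = Empty
                     | Vertex a
                     | Overlay (TaggedGraph t a) (TaggedGraph t a)
                     | Connect (TaggedGraph t a) (TaggedGraph t a)
                     | Subgraph t (TaggedGraph t a)

instance Functor (TaggedGraph t) where
    fmap f g = g >>= (Vertex . f)

instance Applicative (TaggedGraph t) where
    pure  = Vertex
    (<*>) = ap

-- Bind replaces every vertex by a graph and keeps the tagging structure.
instance Monad (TaggedGraph t) where
    Empty        >>= _ = Empty
    Vertex x     >>= f = f x
    Overlay x y  >>= f = Overlay (x >>= f) (y >>= f)
    Connect x y  >>= f = Connect (x >>= f) (y >>= f)
    Subgraph t x >>= f = Subgraph t (x >>= f)

-- Hypothetical helper: vertices in left-to-right order, duplicates kept.
vertexList :: TaggedGraph t a -> [a]
vertexList Empty          = []
vertexList (Vertex x)     = [x]
vertexList (Overlay x y)  = vertexList x ++ vertexList y
vertexList (Connect x y)  = vertexList x ++ vertexList y
vertexList (Subgraph _ x) = vertexList x

-- Hypothetical helper: tags in left-to-right, outermost-first order.
tagList :: TaggedGraph t a -> [t]
tagList Empty          = []
tagList (Vertex _)     = []
tagList (Overlay x y)  = tagList x ++ tagList y
tagList (Connect x y)  = tagList x ++ tagList y
tagList (Subgraph t x) = t : tagList x
```

Note that the tag type t stays fixed under fmap and >>=. With String tags that is harmless, but with t = Cache a the result of fmap would no longer have the type AdjacencyMap b, so a cached variant would presumably need to drop or rebuild tags when vertices are relabelled.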
@feuerbach Note that I flipped the order of arguments to dfsForestFrom in the last commit -- the new one seems to be more natural/convenient. Let me know if you disagree.
It's fine, I don't feel strongly about it.
GraphKL records are defined in AdjacencyMap and IntAdjacencyMap modules for interoperability with King-Launchbury graphs. Specifically, they provide a way to reuse the algorithms from Data.Graph.
While reviewing #18, I realised that GraphKL's getVertex function is partial, and I'd like to avoid having partial functions in the API where possible.
I propose to remove GraphKL, because there are not too many algorithms left in Data.Graph that we haven't yet wrapped in a clean interface: dfs (which is added in #18), bcc, plus a few reachability tests.
My plan is therefore to provide clean interfaces to all of these functions instead of exposing GraphKL.
Please shout if you are using GraphKL!
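As a rough illustration of what a "clean interface" could look like, the sketch below wraps Data.Graph's dff so that callers never see a King-Launchbury vertex index. It assumes the containers package; the signature of dfsForest here is an assumption for this example and may differ from the one added in #18. Edge targets are assumed to be keys of the map.

```haskell
import qualified Data.Graph as KL
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set
import           Data.Tree (Forest)

-- Depth-first search forest of a graph given as an adjacency map.
-- The Int -> a lookup is partial, but it stays internal and is only
-- applied to indices produced by the graph it was built from.
dfsForest :: Ord a => Map a (Set a) -> Forest a
dfsForest m = map (fmap vertexOf) (KL.dff g)
  where
    (g, nodeFromVertex, _) =
        KL.graphFromEdges [ ((), v, Set.toAscList us) | (v, us) <- Map.toAscList m ]
    vertexOf i = case nodeFromVertex i of (_, v, _) -> v
```

Because no Vertex value crosses the API boundary, a caller has no way to pass an out-of-range index, which is the failure mode that makes getVertex partial.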