Semigroup version of Algebra.Graph.Labelled.AdjacencyMap

dwincort commented 1 year ago

I was looking into using this library (and the Algebra.Graph.Labelled.AdjacencyMap module in particular), but the edge type I want to use is not a Monoid, only a Semigroup. Indeed, one of the things I like about Labelled.AdjacencyMap is that the adjacencies are stored sparsely, i.e., that non-existent edges are not stored, which makes a lot of sense for me because there is no value in my edge type that represents not-an-edge.

If I'm understanding the code correctly, it looks like mempty (or zero) of the edge monoid is only used to filter zero edges out of the adjacency map. I'm not proposing you drop the Monoid constraint -- for edge types that do have a zero element, filtering it out makes sense -- but it seems to me there could be a parallel module for labeled adjacency maps that only uses a Semigroup constraint and never filters.

P.S. Also, I understand that I could wrap my edge type in Maybe and, because the labeled AdjacencyMap never stores zero edges, unwrap with something like fromJust every time I view an edge. I find this unsatisfying.

snowleopard commented 1 year ago

How are you planning to use Algebra.Graph.Labelled.AdjacencyMap if your edge labels have no zero? You won't be able to use overlay, so you'll need to stick to fully-connected graphs, I guess, which seems odd.

dwincort commented 1 year ago

I think if you look at the code you'll find that you can use overlay just fine.

overlay :: (Eq e, Monoid e, Ord a) => AdjacencyMap e a -> AdjacencyMap e a -> AdjacencyMap e a
overlay (AM x) (AM y) = AM $ Map.unionWith nonZeroUnion x y

-- Union maps, removing zero elements from the result.
nonZeroUnion :: (Eq e, Monoid e, Ord a) => Map a e -> Map a e -> Map a e
nonZeroUnion x y = Map.filter (/= zero) $ Map.unionWith mappend x y

If you make e a Semigroup instead of a Monoid, then you don't even need to worry about such a nonZeroUnion and can simply use Map.unionWith (<>) instead.

This is because, unlike Algebra.Graph.Labelled, the AdjacencyMap version stores only non-zero adjacencies anyway. In fact even the presence of a zero seems rather odd for such a construction, since it allows for the possibility of the key invariant (that is, that the adjacency map contains no zero edges) to be easily broken.

Another way to say this is that in the AdjacencyMap version, zero edges are implicitly defined via their absence in the mapping. As such, there's no need to explicitly define them by having an edge type that includes a zero value.

Going back to how I plan to use this: I want to be able to use the adjacencyMap function and get back a map of maps that is guaranteed to have no zero edges in it. This is explicit in the AdjacencyMap invariant, but because the functions require a Monoid constraint, it is not described by the type constraints.

snowleopard commented 1 year ago

Oh, I see! I didn't realise that you were suggesting to remove Monoid e from all constraints in this module (I was only thinking of nonZeroUnion for some reason).

Now it all makes sense. Interesting. I was always thinking of Algebra.Graph.Labelled.AdjacencyMap as an implementation of the API from Algebra.Graph.Labelled but from this discussion, it appears to somehow be more general.

One problem with just removing Monoid e and the associated checks from the implementation is that if e does happen to be a monoid, then the user will in fact be able to create meaningless edges, which could affect the performance of existing code. Turning one module into two seems wrong (this library is already too big to my taste). But keeping only the semigroup version seems problematic too, doesn't it?

dwincort commented 1 year ago

One problem with just removing Monoid e and the associated checks from the implementation is that if e does happen to be a monoid, then the user will in fact be able to create meaningless edges, which could affect the performance of existing code. Turning one module into two seems wrong (this library is already too big to my taste). But keeping only the semigroup version seems problematic too, doesn't it?

Yes! I'm honestly not sure what to do either. Even though the semigroup version seems more general, it is subtly wrong when e is a monoid for exactly the reason you pointed out -- it breaks the internal invariant that there are no zero edges. I also agree that it seems inelegant to duplicate the whole module only to make such a slight change.

The best option I came up with is to do something like what's done in containers with, e.g., Map. The data type for AdjacencyMap would be common, but there would be two interfaces for it -- one with Semigroup constraints and one with Monoid constraints -- and a note explaining to use the Semigroup version only when there is no zero edge in your edge type. I'm not quite sure this is a good solution, so maybe more brainstorming is in order.

snowleopard commented 1 year ago

The best option I came up with is to do something like what's done in containers with, e.g., Map. The data type for AdjacencyMap would be common, but there would be two interfaces for it -- one with Semigroup constraints and one with Monoid constraints -- and a note explaining to use the Semigroup version only when there is no zero edge in your edge type.

Could you point me to an example?

dwincort commented 1 year ago

In containers, there's both Data.Map.Strict and Data.Map.Lazy. The source for .Lazy is uninteresting, as it's simply a re-export of Data.Map.Internal, but the source for Data.Map.Strict.Internal is more interesting (https://hackage.haskell.org/package/containers-0.6.7/docs/src/Data.Map.Strict.Internal.html). Note that a large amount of the functionality is brought in from the "lazy" implementation in Data.Map.Internal, and the forcing strict functions are redefined.

I imagined one could do something similar with this library. I don't know that this library needs a module like Algebra.Graph.Labelled.AdjacencyMap.Internal (I mean, if it doesn't need that now, than it shouldn't need it for this change), but that doesn't mean we can't define something like:

module Algebra.Graph.Labelled.AdjacencyMap.Semigroup (...) where

-- We can import and reuse the `AdjacencyMap` type itself as well as functions like 
-- `empty`, `vertexSet`, `hasEdge`, and many others that don't rely on the `Monoid e` instance.
import Algebra.Graph.Labelled.AdjacencyMap (...)

-- For functions where the signature changes, we redefine them
overlay :: (Semigroup e, Ord a) => AdjacencyMap e a -> AdjacencyMap e a -> AdjacencyMap e a
overlay (AM x) (AM y) = AM $ Map.unionWith (<>) x y

...

In this way, there would certainly be a new module, but it wouldn't be a full duplicate of the existing one -- it would only include new code. Furthermore, the type AdjacencyMap would actually be shared between them.

The biggest problem with this approach, which is also a problem in Data.Map.Strict, is the instances. In an ideal world, we'd want, for example, the instance Semigroup AdjacencyMap to rely on Semigroup e instead of Monoid e and use the Semigroup version of overlay rather than the Monoid one. Alas, that's impossible. (Well, it's technically possible with orphan instances, but I do not recommend it.) Notice the IsList instance in Data.Map and how it suffers from this problem: even if you import Data.Map.Strict, if you use the GHC.IsList version of fromList, you'll be performing the lazy version. This is perhaps even more obvious (and alarming) in the Functor case: fmaping over a "strict" Map is not strict! If you want strict mapping, you need to use Data.Map.Strict.map.

But, I mean, if containers is doing it, it can't be such a bad precedent, right?

snowleopard commented 1 year ago

I see, thank you for the detailed explanation!

What you suggest is certainly doable in practice but at the same time, it doesn't feel right.

It's also expensive in terms of maintenance, and I'm not sure I'm ready to sign up to support this sort of thing. As I mentioned, I actually plan to go in the opposite direction and remove a few modules instead of adding new ones, because the library is already too bulky and confusing.

From all the discussed options, I'm leaning towards just dropping Monoid from all the constraints, like you initially suggested, and stopping filtering out zeroes. We can provide an explicit new function to filter out edges. If that sounds good to you, could you perhaps draft a PR?

snowleopard / alga

Semigroup version of Algebra.Graph.Labelled.AdjacencyMap #307