Open NicolasRouquette opened 2 years ago
We'd love a PR! Thanks for the proposal and contribution. The scc code definitely should be generalized. Perhaps we can even take this opportunity to factor out a new generic Graph.lean library, that can eventually scale to something along the lines of fgl, graphs in Haskell or petgraph in Rust?
I've created https://github.com/yatima-inc/Graph.lean as a template where we can do this in. @arthurpaulino and @winston-h-zhang, can you guys follow up with how to factor our code out and how to incorporate @NicolasRouquette's changes?
Perhaps we can even take this opportunity to factor out a new generic Graph.lean library, that can eventually scale to something along the lines of fgl, graphs in Haskell or petgraph in Rust?
These are great examples. For my internal Scala code base, I've been using this library for several years: http://scala-graph.org/
The petgraph library seems well-reviewed; however, it lacks the monadic style of the Haskell libs. The scala graph library is heavily influenced by the design of the Scala collection library; it's altogether a different style of API.
Thanks for the contribution! We should definitely try and factor out any generic graph algorithms. Since John has already set up the Graph.lean
repo we can move everything into there first.
fgl
, graphs
, and petgraph
all look like great libraries to follow after. I think we should try to strike a good balance between having good core functionality but maybe not committing to building a truly all-purpose library. graphs
and petgraph
both seem like clean implementations, and I would vote to follow the design of these two.
Ironically, purely lazy functional code is often not great for Lean, since Lean is strict and can do functional-but-in-place "pseudo-mutations." (For example, many pure functional graph algorithms in Haskell don't quite work because they take advantage of its laziness.) So that's why I'm hesitant on fgl
.
I also don't know much about Scala -- how different is the API style?
One final thought is that petgraph
lacking monadic style is not really a problem. We say code is "monadic" in the sense that it isn't "functional" and mimics imperative style -- so the Lean code is probably going to be very close to what we would write in Rust anyway.
I'd love to hear any thoughts about this!
Thanks for the feedback. It seems that all of these options have some language-specific and API conventions that may not be a good fit for Lean.
The scala-graph library API design is deliberately focused on blending w/ the Scala collection API, which is really nice. I see some similarities here and there with some of the proposed Yatima Std library operations.
I understand that all of the above options involve biases due to the semantics of the languages involved (Haskell, Rust, Scala) and the API conventions they follow; we can get inspiration from these.
If anything deserves to be called "monadic", the Yatima Graph DFS and strongly-connected component algorithms deserve that label:
structure dfsState (α) [Ord α] where
visited : RBMap α Bool compare
abbrev dfsM (α) [Ord α] := ReaderT (Graph α) $ EStateM String (dfsState α)
and:
structure NodeInfo where
index : Nat
lowlink : Nat
onStack : Bool
deriving Repr
instance : ToString NodeInfo :=
{ toString := fun info => s!"i: {info.index}, low: {info.lowlink}, on: {info.onStack}" }
structure sccState (α) [Ord α] where
info : RBMap α NodeInfo compare
index : Nat
stack : List α
instance : Inhabited (sccState α):=
{ default := ⟨.empty, default, default⟩ }
abbrev sccM (α) [Ord α] [ToString α] := ReaderT (Graph α) $ EStateM String (sccState α)
Both algorithms are implemented as reader monads over the structures mentioned above separately from the definition of the Graph
itself.
A good start might be to look at graph
for replacing Yatima's specific Graph
type
structure Graph (α : Type) [Ord α] where
data : Std.RBMap α (List α) compare
with something similar to graph
's generic definition:
https://github.com/ekmett/graphs/blob/master/src/Data/Graph/Class.hs#L43:L50
type VertexMap g = PropertyMap g (Vertex g)
type EdgeMap g = PropertyMap g (Edge g)
class (Monad g, Eq (Vertex g), Eq (Edge g)) => Graph g where
type Vertex g :: *
type Edge g :: *
vertexMap :: a -> g (VertexMap g a)
edgeMap :: a -> g (EdgeMap g a)
I really like the graph
library because it is closest to Yatima's design in that it clearly separates the generic class for graphs from the generic class for graph algorithms:
https://github.com/ekmett/graphs/blob/master/src/Data/Graph/Algorithm.hs#L30:L37
-- | Graph search visitor
data GraphSearch g m = GraphSearch
{ enterVertex :: Vertex g -> g m -- ^ Called the first time a vertex is discovered
, enterEdge :: Edge g -> g m -- ^ Called the first time an edge is discovered, before enterVertex
, grayTarget :: Edge g -> g m -- ^ Called when we encounter a back edge to a vertex we're still processing
, exitVertex :: Vertex g -> g m -- ^ Called once we have processed all descendants of a vertex
, blackTarget :: Edge g -> g m -- ^ Called when we encounter a cross edge to a vertex we've already finished
}
The key difference is that Yatima's DFS and SCC algorithms are not currently written to a common interface like graph
's GraphSearch
. It is unclear to me whether this is something that we should have in Yatima's Graph lib or not.
FYI: I also found that there is an SCC algorithm implemented in Lean:
Interestingly, there is no graph data structure per se except for the information about vertices and their successors; e.g:
import Lean.Util.SCC
def vertices : List String := ["a", "b", "c", "d", "e", "f"]
def successorsOf(s: String) : List String :=
match s with
| "a" => ["b", "c"]
| "b" => ["a", "d"]
| "c" => ["d"]
| "d" => ["e"]
| "e" => ["c"]
| _ => []
#eval Lean.SCC.scc vertices successorsOf
-- [["d", "e", "c"], ["a", "b"], ["f"]]
The monadic formulation of the strongly-connected component algorithm caught my attention; it's really elegant and concise. I hope that the https://github.com/leanprover/functional_programming_in_lean book will eventually explain the nifty Lean4 programming tricks involved!
I would like to use this library to formulate in Lean4 some algorithms I've been working on in Scala3; see: https://github.com/NicolasRouquette/scala3-bundle-closure
However, the current Graph library is tied to Lean4's symbols as graph vertices:
https://github.com/yatima-inc/yatima-lang/blob/ada78d298e08e495b307cb639f4305731f3af663/Yatima/Compiler/Graph.lean#L32-L34
So as a fun exercise to flex my new Lean4 programming brain, I extracted your library and parameterized it: https://github.com/NicolasRouquette/YatimaGraphLib
To facilitate comparison w/ your original source, I've kept the same file/directory organization & names.
In writing a simple test, I noticed that the
buildG
function should add the edge target as a vertex in the map. See:https://github.com/NicolasRouquette/YatimaGraphLib/blob/d74377ead019a05a43b11b0fff535abe3b3466dd/src/lean4/Yatima/Compiler/Graph.lean#L28:L36
If this idea makes sense, are you interested in a PR?