Tree check: cycle-free , rooted tree..

younes-io commented 11 months ago

Updated the IsTreeWithRoot(G, r) function to more accurately define a rooted tree in a graph

Ensure single-node trees are correctly handled
Each node, except the root, is now guaranteed to have exactly one incoming edge
Verify that all nodes (except the root) are reachable from the root
Add a check to prevent any cycles in the graph

Please let me know your thoughts..

lemmy commented 11 months ago

Please consider adding tests to https://github.com/tlaplus/CommunityModules/blob/master/tests/GraphsTests.tla

younes-io commented 11 months ago

Please consider adding tests to https://github.com/tlaplus/CommunityModules/blob/master/tests/GraphsTests.tla

Oh, I wasn't aware of that.. I'll take a look into it and change it as well. Thank you for pointing that out^^

muenchnerkindl commented 11 months ago

I am not sure I fully understand this. For example, neither the original definition nor the new one (except for the single-node case) require that r \in G.nodes. Also, the attempt to exclude cycles looks wrong to me because <<n>> \in SimplePath(G) holds for every n – one would have to require a path of length at least 2, also there should not be a cycle at the root either. How about something like the following, which I believe is closer to the mathematical definition of a tree:

IsTreeWithRoot(G, r) ==
  /\ IsDirectedGraph(G)
  /\ r \in G. nodes
  /\ \* the root has no predecessor
     ~ \E n \in G.nodes : <<n,r>> \in G.edges
  /\ \* every non-root node has a predecessor
     \A n \in G.nodes \ {r} : \E m \in G.nodes : <<m,n>> \in G.edges
  /\ \* every node has at most one predecessor
     \A e,f \in G.edges : e[2] = f[2] => e = f
  /\ \* the graph is acyclic
     \A n \in G.nodes : <<n,n>> \notin ConnectionsInIrreflexive(G)

where ConnectionsInIrreflexive is defined just like ConnectionsIn except for dropping reflexivity in the base case, i.e.

      IF N = {}
      THEN [m,n \in G.node |-> <<m,n>> \in G.edge]
     ...

In fact, I believe that this should be the base definition: ConnectionsIn can then be defined as the reflexive closure of ConnectionsInIrreflexive (and if it were not for compatibility problems I'd rename the two such that ConnectionsIn refers to the definition without reflexivity dropped in and ConnectionsInReflexive to the existing one).

@lemmy: The above definition easily generalizes to forests as follows:

IsForestWithRoots(G, R) ==
  /\ IsDirectedGraph(G)
  /\ R \subseteq G. nodes
  /\ \* roots do not have predecessors
     \A r \in R : ~ \E n \in G.nodes : <<n,r>> \in G.edges
  /\ \* every non-root node has a predecessor
     \A n \in G.nodes \ R : \E m \in G.nodes : <<m,n>> \in G.edges
  /\ \* every node has at most one predecessor
     \A e,f \in G.edges : e[2] = f[2] => e = f
  /\ \* the graph is acyclic
     \A n \in G.nodes : <<n,n>> \notin ConnectionsInIrreflexive(G)

IsTreeWithRoot(G,r) == IsForestWithRoots(G, {r})

muenchnerkindl commented 11 months ago

Arrgh, GitHub pretty-printing ... the above should read <<n>> \in SimplePath(G) holds for every n.

lemmy commented 11 months ago

Arrgh, GitHub pretty-printing ... the above should read <<n>> \in SimplePath(G) holds for every n.

Just put it in single backticks the next time.

younes-io commented 11 months ago

@muenchnerkindl : Thank you Stephan Could you please point out a link to a "standard mathematical" definition so that we make sure we encompass all elements in the TLA+ spec ? It would also help me write appropriate tests for properties.. Just to make sure we "seal" the definition ^^

muenchnerkindl commented 11 months ago

@younes-io Perhaps I was a bit abrupt, sorry for that. There are many equivalent characterizations of trees in graph theory, (see e.g. Wikipedia, most of them for undirected graphs. I think the one closest to what we have here is "Any two vertices can be connected by a unique simple path". The original definition similarly had \A n \in G.node : AreConnectedIn(n, r, G), which requires that every node can be reached by a simple path from the root. It remains to require that this path is unique, and the original definition required that every node has at most one predecessor, expressed (perhaps somewhat obliquely) as \A e, f \in G.edge : (e[2] = f[2]) => (e = f), as well as that the root has no predecessor: \A e \in G.edge : e[2] # r. (In fact, the original definition had e[1] and f[1] instead, which are clearly typos.)

Thinking about this again, I do not see why you see a need for changing the existing definition except for correcting the typos. Your special case for single-node trees looks odd. Indeed, suppose that G.node = {r} for some element r. Then the first conjunct IsDirectedGraph(G) requires that G.edge \subseteq {r} \X {r}, so either G.edge = {} or G.edge = {<<r, r>>}. However, the second case is ruled out by the condition that no edge ends at r, so we must have G.edge = {}. The other conditions of the original condition hold as well. In particular, r is connected to r by the singleton simple path <<r>>.

For the remaining items, as indicated above, every node can have at most one incoming edge. The root has no incoming edge. Every node except for the root must have an incoming edge, otherwise it could not be connected to the root, and therefore it has exactly one incoming edge. Reachability from the root is checked. Now assume the graph had a (shortest) cycle. That cycle cannot go through the root, because then the root would have an incoming edge. But then there must be a node that has two incoming edges: one linking it to the root (which cannot be on the cycle), and one through the cycle. Again, this is impossible.

Summing up, unless you can demonstrate that the (fixed) definition fails, I simply suggest fixing the two typos and writing

IsTreeWithRoot(G, r) ==
  /\ IsDirectedGraph(G)
  /\ \A e \in G.edge : /\ e[2] # r
                       /\ \A f \in G.edge : (e[2] = f[2]) => (e = f)
  /\ \A n \in G.node : AreConnectedIn(n, r, G)

In particular, the definition I suggested is equivalent to that one. (My assertion that the condition r \in G.nodes was not required by the original definition was wrong: IsDirectedGraph restricts edges to be pairs of nodes, and AreConnectedIn requires the existence of a simple path, which is a sequence of nodes, so implicitly r must be a node.)

younes-io commented 2 weeks ago

https://github.com/tlaplus/CommunityModules/pull/106

tlaplus / CommunityModules

Tree check: cycle-free , rooted tree.. #94