Revision of Graph Theory

avekens commented 4 years ago

While moving the friendship theorem into the main part of set.mm (see #1389), I noticed some intricatenesses which bother me a lot:

Edges:

To be compatible with the definition of multigraphs, simple graphs were defined as pairs of a set (vertices) and an edge function (or indexed edges) instead of edges (as pairs of vertices), which would have been sufficient for simple graphs. To consider the "edges" always as functions makes many theorems clumsy and their proofs longer. Therefore, I introduced a function Edges which provides the edges as a set of pairs of vertices for a simple graph (already available in the main body of set.mm):

( Edges `` <. V , E >. ) = ran E or ( Edges `` G ) = ran ( 2nd `` G ) [Would Edge be a better symbol instead of Edges? We also have, for example, the symbol Ring to denote the class of all (unital) rings, and not Rings . Another alternative would be to use Edg.]

This is the "usual" way to talk about edges (of simple graphs), see [Bollobas] p.1: "... such that E is a subset of the set V^2 of unordered pairs of V." Using this definition, many theorems can be expressed in a better (that is shorter and more concise and comprehensible) way, e.g.

usgvincvadeu: ( ( G e. USGrph /\ C e. ( Edges `` G ) /\ X e. C ) -> E! y e. V C = { X , y } )

instead of

usgraedgreu: ( ( V USGrph E /\ X e. dom E /\ Y e. ( E `` X ) ) -> E! y e. V ( E `` X ) = { Y , y } )

Therefore, I want to propose to use Edges in more theorems in the context of simple graphs, mainly replacing dom E and ran E in the currently available theorems, making those obsolete. I started this endeavor already in my mathbox, see subsection "Edges of undirected simple graphs without loops".

A graph represented by a class variable G:

We started talking about graphs regarding them always as pairs <. V , E >. Instead of writing <. V , E >. e. USGrph, we used to use the representation as binary relation V USGrph E, which is shorter (saves 7 characters), but does not reflect the original meaning "A graph is a pair of vertices and edges" (instead, "V USGrph E" would literally stand for "The vertices and edges are related by a graph", which is not wrong, but not the way how graphs are seen).

Furthermore, there are many cases in which a separation into vertices and edges is not necessary/useful, so a graph should be represented by a single class variable, often G. By this, "G is a simple graph" is formalized as G e. USGrph, which seems to be more natural, for example in ~usgedgppr:

usgedgppr.e $e |- `E = ( Edges `` G )` $.
usgedgppr $p |- `( ( G e. USGrph /\ C e. E ) -> ( # `` C ) = 2 ) $=

If the representation as pair is required, it can always be provided by

~ usgracl: ( G e. USGrph -> ( 1st `` G ) USGrph ( 2nd `` G ) )

Additional definitions:

Following the approach to use a single class variable for a graph G, other concepts of graph theory could be explicitly defined`, for example (see also my mathbox):

( Vtx `` G ) = ( 1st `` G )
( GrOrder `` G ) = ( # `` ( 1st `` G ) ) [ alternative symbols: GrOrd GraOrder (as suggested by Gerard, because he is worrying about associating "GrOrd" with "group order" - In my opinion, however, GOrd could be used for the group order, like "gsum" is the symbol for a "group sum") GrphOrd For the HTML representation, we could use "Ord_Graph" or Ord_Grph" or "O_Grph"]
( GrSize `` G ) = ( # `` ( 2nd `` G ) ) [ alternative symbols: GraSize (as suggested by Gerard, because he is worrying about associating "GrSize" with "group size") GrphSize For the HTML representation, we could use "Size_Graph" or "Size_Grph" or "S_Grph"]

Having these definitions, many theorems could be stated in a more usual and comprehensible way, for example:

~usgo1s0: ( ( G e. USGrph /\ ( GrOrder `` G ) = 1 ) -> ( GrSize `` G ) = 0 ) instead of ~usgrafisindb1: ( ( V USGrph E /\ ( # `` V ) = 1 ) -> ( # `` E ) = 0 )

If the sets of vertices and/or edges are needed for a theorem, class variables could be provided for them as hypotheses:

xxx.v $e |- V = ( Vtx ` G ) $.
xxx.e $e |- E = ( Edges ` G ) $.

These definitions do not add any essential value (and often lead to longer representations and/or proofs of theorems), but they could increase the comprehensibility. In my opinion, however, we can do without explict definitions for vertices, the order and the size, if we use corresponding hypotheses. See, for example,

usgedgleord.v $e |- V = ( Vtx ` G ) $.
usgedgleord.e $e |- E = ( Edges ` G ) $.
usgedgleord $p |- ( ( G e. USGrph /\ N e. V ) -> ( # ` { e e. E | N e. e } ) <_ ( GrOrder ` G ) ) $=

compared with

usgedgleordALT.v $e |- V = ( 1st ` G ) $.
usgedgleordALT.e $e |- E = ( Edges ` G ) $.
usgedgleordALT.o $e |- O = ( # ` V ) $.
usgedgleordALT $p |- ( ( G e. USGrph /\ N e. V ) -> ( # ` { e e. E | N e. e } ) <_ O ) $=

From my oint of view, ~usgedgleordALT is as clear as ~usgedgleord, does not require an additional definition, and have a much shorter proof (18 vs. 26 essential steps). Furthermore, "Size" and "Order" would not be used very often (in contrast to "Vtx").

Finite simple graphs:

Finally, finite simple graphs play the central role in graph theory (in [Bollobas] p. 1 it is assumed that only finite graphs are regarded: "Unless it is explicitly stated otherwise, we consider only finite graphs, that is, V and E are always finite"). Therefore, a definition for them should be provided (see also my mathbox):

G e. FinUSGrph <-> ( G e. USGrph /\ ( 1st ` G ) e. Fin ) )

This definition is more natural than a definition using the order of a graph (if available, see item 3. before):

G e. FinUSGrph <-> ( G e. USGrph /\ ( GrOrder ` G ) e. NN0 )

I propose to move this definition to the main body of set.mm and use it consequently instead of ( V USGrph E /\ V e. Fin ).

Summary:

Revise theorems for simple graphs to use Edges (or Edge or Edg ?).
Revise some theorems (or provide variants of them) for simple graphs to use a single class variable G instead of the pair notation <. V , E . >.
Special definitions for the set of vertices, the order and the size of graphs is not necessary/useful.
Revise theorems for finite simple graphs to use FinUSGrph.

@nmegill @benjub @tirix @david-a-wheeler @digama0 what do you think?

tirix commented 4 years ago

That way of working with Graphs looks a lot like working with extensible structures. Should we not define vertices as a synonym of base, and edges as an additional, new component/slot? This may for example allow to define the (travelling salesman) metric over graphs, and view them as a metric space.

digama0 commented 4 years ago

I agree with tirix. The original formalization UMGrph was intended to be a "low overhead" definition so that I could state and prove konigsberg. For "proper" graph theory, in particular when we want to talk about different kinds of graphs, and their relation to other structures (rare but not totally unheard of, for example tirix's metric space example), we want to move to extensible structures.

I recall Stefan O'Rear had a fairly complete sketch of how this should go, although I'm not sure if it made it into set.mm or a mathbox.

digama0 commented 4 years ago

Also, I don't think GrOrder should be defined; it should be written ( # ` ( Verts ` G ) ) or however we end up expressing that. This is both clearer (less ambiguous), and matches the literature, which will usually write |V| or |E| for this.

avekens commented 4 years ago

Yes, I also thought about defining a graph as extensible structure, with the set of vertices as base set, and a new slot for the edge function. But I didn't see any advantage of the additional overhead. I also do not see how a graph should become a metric space (using Slot 12 "dist"?). @sorear : @digama0 said that you have already made an proposal how to represent graphs as extensible structures - where can I find your material? It is not containded in set.mm (neither in te main body nor in a mathbox).

tirix commented 4 years ago

Yes, one could create a function mapping graphs to metric spaces by populating the ‘ dist ‘ slot with the geodesic distance, like for ‘ toMetSp ‘.

And once we have a metric space, we have a topological space.

Just a thought: For that purpose, it might be interesting to label the edges with their length, i.e. have an operation taking two vertices and returning the length of the edge between the two, if any. I would even not specify the range of that operation, so that we can use it to provide lengths, but also morphisms, differential equations (for quantum graphs), etc.

A graph may also be extended to a Pre-ordered set, by setting the order relation ‘ le ‘ to “ x < y iff there exists a path from x to y”.

digama0 commented 4 years ago

@avekens Here's a follow on discussion (by you!) that links to the original sorear post: https://groups.google.com/d/msg/metamath/KdVXdL3IH3k/6YfJDeC9AwAJ

avekens commented 4 years ago

Thanks, Mario, to give a hint to my own post ;-) - since we decided (in this thread) that we will not follow the extensible structure approach, I think I erased this from my memory...

With my latest pull request (#1421), I provide an example how we could define hypergraphs as extensible structures (see ~df-uhgr and ~df-ushgr). I named the additional slot ".ef" (for edge function, any better suggestion would be welcome) - this corresponds to "inci" in Stefan's proposal.

Furthermore, Stefan's "incidence structure" (InciSy) corresponds to my (not necessarily simple) "hypergraph" (Stefan calls it alternatively "multi-hypergraph"), whereas Stefan's "hypergraph" corresponds to my "simple hypergraph". Since an incident structure is a more general concept ("a triple (P, L, I) where P is a set whose elements are called points, L is a distinct set whose elements are called lines and I C_ P X L is the incidence relation.", see Wikipedia) which is used in geometry mainly, and (multi-)hypergraphs are only special incident structures, I would propose not to use this name for the definition.

In Stefan`s proposal, directed and undirected graphs are defined separately, using different slots (inci and edgeDir). Since we currently work with undirected graphs only, we do not need this second slot for directed edges at the moment. But if directed graphs are introduced, I wonder if this approach seems to be well suited: if either directed or undirected graphs were regarded independently, one slot would be sufficient. If undirected graphs shall also be regarded as directed graphs and vice versa, using two slots could be useful (of course the corresponding functions must be defined appropriately, e.g. each undirected edge corresponds to two directed edges).

Stefan's Multigraph and Simplegraph corresponds to UMGrph and USGrph in set.mm, which could be transformed to extensible structures as I did for UHGrph -> UHGraph.

I did not check if Stefan's definitions for connectivity and graph2top are appropriate - it would have been good if there were some (basic) theorems showing their soundness. I think, however, that connectivity could be expressed by our definitions for walks and/or paths. To be honest, I do not understand his definition of a mapping of graphs to topological spaces. As Thierry suggested, a metric should be defined first (length of the shortest path between two vertices), which can be done for slot 12 "dist". In principle, I would prefer Thierry's suggestion, but as long as there is no current need for it, I would postpone this.

According to your post on 2-Oct-2017

If and when we eventually move to extensible structures, I would like it to be motivated by an actual use case that goes beyond the applicability of the current definition. Just shuffling the parts of UMGrph into slots doesn't actually make it any more useful - if the slots are all new and have no relation to anything else, and there are no combination structures like a ring which is an undirected graph as well, then it's just a complicated way to write ordered pairs.

I still see no use case which justifies the big amount of effort to spend on changing the definitions (and about 700 theorems) to extensible structures at the moment. If it is required, we can provide a transformation, as you proposed in your poston 17-Apr-2015:

( <. V , E >. e. UMGra <-> { <. ( Base ndx ) , V >. , <. ( inci ndx ) , E >. } e. Multigraph )

Therefore, I propose to concentrate on revising the theorems using the new definitions Edges (I think I will rename it to Edg) and FinUSGrph, as proposed in my bullet points 1 and 4, maybe also provide some theorems using a single class variable instead of a pair to represent a graph (see bullet point 2.).

I agree with you that GrOrder (and GrSize) are not really useful, so I will not move them to the main body of set.mm (and stop using them further), as stated in bullet point 3. Regarding the symbol Vtx, I would also propose not to use it - either we can use ( 1st `` G ) or ( Base `` G ), if graphs are represented as extensible structures, instead.

By the way, the definitions HypGrph, SmpGrph and PsGrph you mentioned in your reply on Stefan's proposal are not availabe in set.mm anymore - they were contained in Steve Rodriguez' mathbox in 2015, but were removed before 2016...

digama0 commented 4 years ago

( <. V , E >. e. UMGra <-> { <. ( Base ndx ) , V >. , <. ( inci ndx ) , E >. } e. Multigraph )

To clarify, I wasn't proposing to use that as a mechanism for translation, I was just writing down the relationship between Stefan's graphs and every other graph theory definition that existed at the time, for comparison.

If you don't want to use extensible structures at the moment, I would recommend isolating the definitions using ordered pairs such that a later retrofit can touch a limited number of theorems.

avekens commented 4 years ago

If you don't want to use extensible structures at the moment, I would recommend isolating the definitions using ordered pairs such that a later retrofit can touch a limited number of theorems.

This sounds reasonable. But how can we achive this? Will it be sufficient if we have two functions EdgF ("edge function" or "familiy of edges") and Vtx ("vertices"), and always use V = ( Vtx `` G ) and E = ( EdgF `` G ) (or E = ( Edg `` G ) in the case of simple graphs) as hypotheses? Then we can define

( Vtx `` G ) = ( 1st `` G )
( EdgF `` G ) = ( 2nd `` G)

now, continuing working with graphs represented by ordered pairs, and if we switch to the representation of graphs as extensible structures, we can just replace these definitions by

( Vtx `` G ) = ( Base `` G )
( EdgF `` G ) = ( .ef `` G)

Is this what you meant? So it would be good to use a function Vtx, despite my arguments against it in bullet point 3.

digama0 commented 4 years ago

Yes, that's what I meant, and I agree that it gives Vtx a reasonable excuse to exist.

avekens commented 3 years ago

Copy of comment for PR #1846: Since the beginning of dealing with graph theory in set.mm, the question if graphs should be represented as ordered pairs (of a set of vertices and a set of edges) or as extensible structures (the set of vertices as base set, and a special slot for the edges) is open (although the current graph theory is based on the representation by ordered pairs).

By PR #1846, I provide (still in my Mathbox) definitions ~df-vtx and ~df-iedg for functions Vtx and iEdg to get the vertices and (indexed) edges of graphs represented by an arbitrary class G, and this works for G being an ordered pair or an extensible structure (and some other, maybe degenerated cases). If required, other representations could be supported by extending the definitions in the future.

The goal is now to build up the graph theory on these two functions (usually setting V = ( Vtx `` G ) and E = ( iEdg `` G ) and using V and E only), disregarding the concrete representation of a graph. Therefore, a decision how to represent a graph will be not necessary anymore.

See also comments posted on 19 Jan 2020.

david-a-wheeler commented 3 years ago

@avekens - I saw you mention my name earlier, but I haven't delved into all the pros & cons, so I don't have an informed opinion. I would like something that's general, flexible, clear, easy to use, and ideally not too far from the literature. But these are all criteria we'd all agree on, and they don't really help distinguish between the options. I do generally prefer definitions that allow multiple underlying representations, since I think that tends to be more flexible. But again, I haven't delved into this as deeply. So I'll leave it to people who've delved into this more than I have.

avekens commented 3 years ago

@david-a-wheeler With PR #1848, I enhanced the section header comment now showing that the definitions of Vtx and iEdg correspond to analog definitions in the literature ([Bollobas], [Diestel]).

avekens commented 3 years ago

About 3 month ago, I started the revision of the definitions and theorems of graph theory, mainly based on the results of the discussion in this issue. Now I have finished the revision about half of the existing material (including the main concepts), so I think it's time to get some feedback.

The main changes are:

definition of functions Vtx and iEdg which provides the vertices resp. indexed edges of an arbitrary class regarded as graph. This allows representations of graphs as ordered pairs <. V , E >. as well as extensible structures { <. ( Base ndx ) , V >. , <. ( .ef ndx ) , E >. , ...}, see sections 21.33.8.1 and 21.33.8.2.
revision of the definitions for hyper-/pseudo-/multi-/simple graphs based on the functions Vtx and iEdg , see sections 21.33.8.3, 21.33.8.4 and 21.33.8.6
the graphs formerly called "multigraphs" are now called "pseudographs", whereas "multigraphs" are special pseudographs (still with multiedges, but without loops) now, see section 21.33.8.4
more systematic usage of the definition of (not indexed) edges, see section 21.33.8.5
provison of a definition for subgraphs, see section 21.33.8.8
revision of the definitions for neighbors, complete graphs and universal vertices based on the functions Vtx and iEdg , see sections 21.33.8.10
revision of the definition for the vertex degree and regular graphs based on the functions Vtx and iEdg and to be appropriate for arbitrary hypergraphs, see sections 21.33.8.11 and 21.33.8.12
revision of the definition for walks based on the functions Vtx and iEdg and to be appropriate for arbitrary hypergraphs, see sections 21.33.8.13

These changes cover sections 16.1.1-4, 16.1.5.1 partially, 16.1.6, 16.1.7.1. (formerly about 330, after revision about 560 definitions and theorems) and some of the results of section 21.33.9 in my mathbox. My plan is to revise the other sections of part 16 "graph theory" accordingly (still about 500 definitions and theorems!). After the revision is finished, section 21.33.8 of my mathbox should replace part 16 "graph theory" completely. Section 21.33.9 in my mathbox will also become obolete afterwards.

The names I used for class symbols and labels are in part temporary, because they overlap with the current names in part 16 "graph theory". They will be adjusted after part 16 "graph theory" is replaced by the revised material. However, suggestions for improved names are always welcome.

I want to ask everybody who is interested in graph theory, and who has some spare time, to have a look at the revised definitions and theorems in section 21.33.8 of my mathbox, and to provide advices and suggestions for further improvements before I continue. Gérard already gave some hints which I took already into account.

avekens commented 3 years ago

I finished the revision of the definition of Eulerian paths and the related theorems, including the Konigsberg Bridge Problem, see the corresponding sections in my mathbox. The definition of an Eulerian path is aligned with the general, revised definitions of graphs and walks within graphs now.

@digama0: Since you originally contributed these theorems, I want to ask you to have a look at my revision, if it is OK for you.

I'll start to revise the section about friendship graphs, including the friendship theorem now. After this, the revision of the chapter "Graph Theory" will be finished, and the old version/chapter could be replaced by the new one.

avekens commented 3 years ago

With PR #2040, the revision of the definitions and theorems for Graph Theory is finished (in principle). I have created already a corresponding conversation in the Metamath Google Group: https://groups.google.com/g/metamath/c/kxrJe3lYxws/m/9rNJhtATAQAJ. Please add comments of common interest to this conversation, more technical oriented issues can be discussed in this issue.

The next main step will be to replace "PART 16 GRAPH THEORY" entirely by section "21.34.8 Graph theory (revised)" of my mathbox. Some adjustments of names, labels and comments have to be done during this action. I plan to perform the replacement starting in August.

avekens commented 2 years ago

Issue can be closed, because the remaining work is tracked by issue #2265

metamath / set.mm

Revision of Graph Theory #1418