JuliaGraphs / GraphsBase.jl

Basic interface and structures for the JuliaGraphs ecosystem
http://juliagraphs.org/GraphsBase.jl/
MIT License
11 stars 1 forks source link

Interface as simple as possible #13

Open gdalle opened 9 months ago

gdalle commented 9 months ago

Public service announcement if people look at this PR: it's probably not gonna get merged. I mostly made it as a trigger for discussion, like GraphInterface.jl by @CameronBieganek :)

Based on #9, here is a smaller PR that will hopefully be easier to review and merge as a starting point. Here are the highlights:

If you want to review, the easiest starting point is the documentation preview at https://juliagraphs.org/GraphsBase.jl/previews/PR13

Related issues:

codecov[bot] commented 9 months ago

Codecov Report

Merging #13 (c001ac9) into main (f43d51f) will increase coverage by 29.91%. The diff coverage is 29.31%.

@@            Coverage Diff            @@
##           main      #13       +/-   ##
=========================================
+ Coverage      0   29.91%   +29.91%     
=========================================
  Files         0       13       +13     
  Lines         0      117      +117     
=========================================
+ Hits          0       35       +35     
- Misses        0       82       +82     
Files Coverage Δ
src/GraphsBase.jl 100.00% <ø> (ø)
src/SimpleGraphs/SimpleGraphs.jl 100.00% <100.00%> (ø)
src/SimpleWeightedGraphs/SimpleWeightedGraphs.jl 100.00% <100.00%> (ø)
src/interface/utils.jl 63.63% <63.63%> (ø)
src/SimpleGraphs/simplegraph.jl 0.00% <0.00%> (ø)
src/SimpleGraphs/simpledigraph.jl 0.00% <0.00%> (ø)
src/SimpleWeightedGraphs/simpleweightedgraph.jl 0.00% <0.00%> (ø)
src/SimpleGraphs/simpleedge.jl 0.00% <0.00%> (ø)
src/SimpleWeightedGraphs/simpleweightededge.jl 0.00% <0.00%> (ø)
src/SimpleWeightedGraphs/simpleweighteddigraph.jl 0.00% <0.00%> (ø)
... and 3 more
CameronBieganek commented 9 months ago

The adjlist::Vector{Vector{<:Integer}} data structure in SimpleGraph seems incompatible with add_vertex!(g, v). In other words, I would expect to be able to do the following:

g = SimpleGraph{Int}()
add_vertex!(g, 1_000_000)

...but the vector of vectors data structure is not well-suited for that sort of usage.

In other words, in my opinion, the generic interface should guarantee that add_vertex!(g, v) will always succeed regardless of the value of v, as long as the type of v is compatible with the vertex type for g.

gdalle commented 9 months ago

In my opinion, the generic interface should guarantee that add_vertex!(g, v) will always succeed regardless of the value of v, as long as the type of v is compatible with the vertex type for g.

That's not for the interface to impose, that's a matter of implementation. The interface just provides the method add_vertex!(g, v), but then the implementation is free to throw an error if v != nv(g) + 1.

SimpleGraph was designed with a contiguous vertex set 1:n in mind, and that's why it is so efficient. Of course it also comes with the limitation above. If you want somewhere you can add arbitrary vertices, you can roll out your own implementation like VertexSafeGraphs.jl. The goal is to allow both in a single interface.

CameronBieganek commented 9 months ago

That's not for the interface to impose, that's a matter of implementation. The interface just provides the method add_vertex!(g, v), but then the implementation is free to throw an error if v != nv(g) + 1.

Julia interfaces provide more than just method signatures, they also provide concepts. Here is the concept that I have in mind for adding and removing vertices:

In my opinion, in the new interface this concept should apply to all subtypes of AbstractGraph. SimpleGraph does not get an exemption.

For clarity, here's an example that shows what I expect to happen:

# vertices(g) == [1, 2, 3, 4, 5]
rem_vertex!(g, 3)   # now `vertices(g) == [1, 2, 4, 5]
add_vertex!(g, 42)  # now `vertices(g) == [1, 2, 4, 5, 42]
mtfishman commented 9 months ago

@CameronBieganek I don't see why adding any vertex should be allowed for any graph type, since that won't even be possible in many cases (say adding a string vertex to a graph that only supports integer vertices). I think it's reasonable to throw an error if you try to add a vertex that isn't compatible with the data structure (or isn't fast to add), as @gdalle suggested above.

CameronBieganek commented 9 months ago

I don't see why adding any vertex should be allowed for any graph type

That was just an imprecision in my language above. I mean any vertex of a compatible type. The precise definition is contained in my docstring for GraphInterface.add_vertex!:

"""
    add_vertex!(g::AbstractGraph, v)

If the vertex `v` is not in the graph `g`, then add it to `g`, otherwise do nothing. This is
guaranteed to succeed as long as the following is true:

```julia
isequal(convert(vertex_type(g), v), v)
```

Thus, `v` must be convertible to the vertex type of graph `g` *and* the result of the conversion
must be `isequal` to `v`. Thus, if `g` has vertex type `Int`, then `add_vertex!(g, 'a')` will
fail, because `isequal(97, 'a')` returns `false`.

Returns the graph `g` (after adding `v`).
"""

So, you can add an Int8 vertex to a graph with vertices of type Int64, but you cannot add a Char vertex to that graph. Note that Base.Set and Base.Dict do the same thing:

julia> push!(Set(1), 'a')
ERROR: ArgumentError: 'a' is not a valid key for type Int64
Stacktrace:
 [1] setindex!(h::Dict{Int64, Nothing}, v0::Nothing, key0::Char)
   @ Base .\dict.jl:363
 [2] push!(s::Set{Int64}, x::Char)
   @ Base .\set.jl:103
 [3] top-level scope
   @ REPL[1]:1
mtfishman commented 9 months ago

I still don't see why we wouldn't allow graph types (like SimpleGraph) where you can only add a vertex add_vertex!(g, v) such that v == nv(g) + 1, what problem does that cause? I could see an argument that it is hard to make code generic between graphs with such constrained vertex label requirements and more general ones, but that seems to be the core of the challenge of generalizing Graphs.jl. Seems like we should try to solve that before disallowing it entirely, since that constraint could make graph functions faster, which is why (from what I understand) it was chosen for the core graph type of Graphs.jl in the first place.

CameronBieganek commented 9 months ago

The behavior of add_vertex!(g, v) on SimpleGraph is unfortunate, but the real nail in the coffin for SimpleGraph is the behavior of rem_vertex!(g, v). Suppose I work for Facebook and on Monday I create a graph that represents a social network among 10 individuals, labeled from 1 to 10. I want to find the distance between 4 and 9, so I run shortest_path(g, 4, 9). Then on Tuesday, person 2 deletes their account, so I run rem_vertex!(g, 2). I still want to know the distance between 4 and 9, so I run shortest_path(g, 4, 9) again. But that will give me the wrong answer, because the vertex codes from 2 to 10 have all changed! And in fact, I won't know the manner in which they've changed unless I read the source code of rem_vertex!(g::SimpleGraph, v). That is unacceptable behavior for a generic graph interface.

...I know that Graphs.rem_vertex! has a warning that says "This operation has to be performed carefully [...] since internally the removal is performed by swapping the vertices v and |V|, and removing the last vertex |V| from the graph." However, implementation details like that should not leak out into a generic interface.

mtfishman commented 9 months ago

Agreed that the behavior of rem_vertex! is more problematic. I guess I have some optimism that supporting both SimpleGraph-like graphs (let's call them linear vertex graphs) and also generic vertex graphs can be salvaged with good generic code and maybe some traits or data structure specialization (like create_vertex_container brought up in ^1) to dispatch on functions that assume linear vertices (for the sake of performance) vs. generic vertices. Not all graph functions use rem_vertex!, so it wouldn't be problematic to toss in a SimpleGraph.

In fact, I searched for rem_vertex! and rem_vertices! in the Graphs.jl repository and get no hits in any graph algorithms:

so I don't see why supporting them would be an issue. It seems like if a user doesn't like the behavior of rem_vertex! then they should use a different graph type.

mtfishman commented 9 months ago

I appreciate the sentiment that functions should have contracts, and it could make sense that rem_vertex! strictly follows the rule that rem_vertex!(g, v) returns a graph such that v ∉ vertices(g) and all other vertices are preserved. What if, for SimpleGraph-like graphs, there is a restriction that:

Then, there could be some internal function for removing the vertices of a SimpleGraph that is used by graph functions that are ok with not preserving the vertices, and the vertices are just tracked internally as needed (for example rem_vertex_and_shift! or something like that). If users want to add or remove other vertices they can use a different graph type, or use those different functions knowing they will get relabeled vertices.

mtfishman commented 9 months ago

Here is a proposal for generic linear indexing of graphs.

For graphs with linear vertices (like SimpleGraph):

add_vertex!(g, v) # Only works if `v == nv(g) + 1`
rem_vertex!(g, v) # Only works if `v == nv(g)`

and for more general graphs, there would be fewer constraints on v (say just constrained by the vertex type, but that's up to the graph type to decide).

To add or remove arbitrary vertices of a SimpleGraph (which would involve shifting the vertex labels):

insert_vertex!(g, v) # Analogous to `insert!` for inserting an element into a `Vector` in Base Julia
add_vertex!(g, v * th) # Syntax from https://github.com/jishnub/OrdinalIndexing.jl for linear indexing

delete_vertex!(g, v) # Analogous to `deleteat!` for deleting an element of a `Vector` in Base Julia
rem_vertex!(g, v * th) # Syntax from https://github.com/jishnub/OrdinalIndexing.jl for linear indexing

These could generalize to a graph with general vertices that has an underlying data structure that has linear vertices (say a graph with arbitrary vertices wrapping a SimpleGraph), in which case the vertex v above corresponds to the linear parent vertex and directly accesses the parent graph. An advantage of the OrdinalIndexing.jl-style syntax is that it generalizes to other operations, such as neighbors(g, v * th) (of course it could be some other indexing object like neighbors(g, ParentVertex(v)) or neighbors(g, LinearVertex(v)), that's not so important).

filchristou commented 7 months ago

Haven't read everything, but I think the following was still not raised as a point: Given that MetaGraphs will be integrated in GraphBase isn't it a bit a redundant to also have WeightedGraph implemented differently ? I mean a weighted graph is nothing else but a MetaGraph with a primitive data type as edge data. Having it as a MetaGraph might also be more flexible since composite types can be treated as weights ( if an interface is followed ).