Closed sbromberger closed 6 years ago
This came out of a discussion at JuliaCon about what if any difference there is between fadj/badj
and in_neighbors/out_neighbors
. Seth proposes that fadj/badj is an iterable of the edges that leave/enter the vertex, while in_neighbors/out_neighbors is an iterable over the set of neighbors which ignores the multiplicity in a multigraph.
Since we have never supported Multigraphs, this semantic debate was never an issue. Almost none of our algorithms would correctly handle multigraphs if we just dropped them in without modifying the algorithms.
Does anyone have strong opinions about these names? /bikeshed.
I agree that we should have different names for these concepts. A quick google search shows that "Forward Adjacencies" has been used in RFC4655. The phrase appears once in this paragraph:
There is benefit in knowing which TE LSPs exist, and their routing, to support such applications as placing a high-priority TE LSP in a crowded network such that it preempts as few other TE LSPs as possible (also known as the "minimal perturbation" problem). Note that preempting based on the minimum number of links might not result in the smallest number of TE LSPs being disrupted. Another application concerns the construction and maintenance of a Virtual Network Topology [MLN]. It is also helpful to understand which other TE LSPs exist in the network in order to decide how to manage the forward adjacencies that exist or need to be set up. The cost- benefit of stateful PCE computation would be helpful to determine if the benefit in path computation is sufficient to offset the additional drain on the network and computational resources.
And in this comment in BGL's implementation of Parallel Brandes' Algorithm:
// Mark forward adjacencies in DAG of shortest paths
I think that neighbors is the more fundamental concept and the set of outgoing edges in a multigraph is the secondary concept with a representation that depends on the implementation details of the multigraph.
So actually, I just took a look at the code, and I realize that I misspoke yesterday. fadj
and badj
are not in interface.jl
. They're not even exported from simplegraphs
, just used as accessors. Therefore, the correct contract is already with in_neighbors
and out_neighbors
.
So I'll check off that first idea as "already done" - there is no need for every graph type to support fadj
and badj
. Just out_neighbors
and in_neighbors
, but with the caveat that they should not be altered (they should be treated as immutable).
I would still appreciate some discussion on the remaining issues.
First, a propos #642, what exactly is distinction between the API and core? I assume by API you mean "if you make a custom graph Type, it must offer those functions". Does that imply that in moving into the core
, we moving these functions inside SimpleGraphs, and people making their own Types don't have to offer those functions?
(If I follow that right, then) I think those basic modifications are so core to network analysis they belong in the API. I suppose one can measure descriptives without them, but things like clique percolation, robustness measures, and things like Louvain algorithm all require graph modifications. Never mind just building up graphs piecemeal.
Yeah the idea is that the interface is the minimal amount of functionality your type must provide in order to get the rest of the library functions to work on your type. There can be additional functions that you have to get in order to achieve optimal performance, but we don't really have a concrete plan for that yet.
To elaborate a bit further, this is how I think about it as a three-tiered system:
API - stuff that must be implemented in order for basic functionality tests to pass - these functions are the building blocks for everything else. These include accessors for the graph structures.
Core - functions that should be implemented (and/or optimized) in order to enable more complex features. Things like indegree
and outdegree
are included here. These functions typically use API directly as their building blocks.
Everything else - functions that use core or API to do some other stuff, and that don't necessarily serve as building blocks themselves. Examples would be the centrality measures, and a counter-example to the latter criterion would be dijkstra_shortest_paths
.
Yes the Core functionality should be stuff that is defined in terms of the API but might need to be reimplemented by a type to improve performance. and the Everything else should all be completely oblivious to the fact that there are multiple types of graphs underneath.
So things like merge
would be Core because it is not necessary that every graph type support efficient merging of vertices, but community detection is in the Everything else category building on Core operations. If your graph type doesn't have an efficient merge, then it won't have an efficient community detection either, but that is ok, we need to either find a way to implement merge efficiently or use a different data structure.
So things like
merge
would be Core
I agree with this (and realize it's a change from my original position of having this in operators
).
@sbromberger that's a very helpful summary, thanks. The README.md has a single category called "Core API" so I was getting a little confused.
In that case, my inclination is to still have the constructors/modifiers as API, but I understand if they get shifted to core.
The key for me is this: if we make it part of the API, then we're saying that every single graph type must implement this function. This includes immutable graphs and ones that might be backended by an external datastore. Immutable graphs can't / won't have any modifiers by definition, which is why I think these belong in core.
You're 100% right about the README being ambiguous. We need to fix that.
Yup makes sense. Hadn't thought about immutable graphs (especially backed by a database or something) as a thing someone would want to implement. On Tue, Jul 11, 2017 at 7:58 PM Seth Bromberger notifications@github.com wrote:
The key for me is this: if we make it part of the API, then we're saying that every single graph type must implement this function. This includes immutable graphs and ones that might be backended by an external datastore. Immutable graphs can't / won't have any modifiers by definition, which is why I think these belong in core.
You're 100% right about the README being ambiguous. We need to fix that.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/JuliaGraphs/LightGraphs.jl/issues/665#issuecomment-314622429, or mute the thread https://github.com/notifications/unsubscribe-auth/AJPC7TBeLUfjXaUeodEPjkVERZVlA7v1ks5sNCg2gaJpZM4ODumG .
Coming back to this. I don't think I like in_degree
replacing indegree
since we have indegree_centrality
and in_degree_centrality
seems awkward.
I think I'd prefer renaming in_neighbors
and out_neighbors
to inneighbors
and outneighbors
for consistency, or maybe just making neighbors(...; dir=:in)
instead. (Though this might have performance implications.)
@jpfairbanks - thoughts?
I would prefer neighbors(g, v, :in)
as long as it is fast enough. inneighbors
has 1 too many n
s and a keyword arg is a pain. For a real C style api neighbors(g, v, DIR_IN)
neighbors(g, v, DIR_OUT)
Here's my latest idea (hat tip @ararslan):
neighbors(g, v, ::Val{:in}) = ...
neighbors(g, v, ::Val{:out}) = ...
neighbors(g, v, ::Val{:both}) = ...
neighbors(g, v, d::Symbol=:out) = neighbors(g, v, Val(d))
similarly,
degree(g, v, ::Val{:in}) = ...
degree(g, v, ::Val{:out}) = ...
degree(g, v, ::Val{:both}) = ...
degree(g, v, d::Symbol=:out) = degree(g, v, Val(d))
@jpfairbanks - thoughts?
I like it!
other possibility is using empty struct
s
abstract type DegreeDirection end
struct InDegree <: DegreeDirection end
...
degree(g, v, ::InDegree) = ...
...
This would come to the same design as maximum_flow which takes a FlowAlgorithm
argument
I think the compile-time dispatch is still guaranteed. Which do you think would fit best for these?
This would come to the same design as maximum_flow which takes a
FlowAlgorithm
argument
Yes. The history behind using struct
s for dispatch in LightGraphs is twofold: first, Graphs.jl used them, and a lot of our code came from that package. second, Val
wasn't an option.
On the flip side: where we're using empty structs, we should evaluate whether Val
could be used instead. GraphIO.jl could benefit.
degree(g, v, ::InDegree) = ...
This bothers me, because it's redundant, and we'd need to have InNeighbors
, and In*
for other things.
This is a problem for all functions that currently take a neighborfunction those functions currently would require you to make an anonymous function. Or "lift" the :in/:out vals up to their argument lists.
This is a problem for all functions that currently take a neighborfunction those functions currently would require you to make an anonymous function.
This is a key point. We do this throughout the code for the degree functions. Alright, I think we're back to just renaming them in_degree
and out_degree
.
Yeah, and thinking for metagraphs you might have weighted_degree
property_degree` which count the degree according to some other definition than the number of edges such as sum of edge weights or number of edges that satisfy some predicate.
I pulled 0.13.0 because I'd really like to get this in first.
Proposal:
fadj
/badj
withoutneighbors
/inneighbors
(ref consensus in #823)out_neighbors
andin_neighbors
add_edge!
,rem_edge!
,add_vertex!
,rem_vertex!
, andadd_vertices!
from API (put them in core)This would simplify the API to the following functions:
nv
ne
vertices
edges
is_directed
has_vertex
has_edge
in_neighbors
out_neighbors
zero
All changes would move through deprecation for at least one minor version.
cc @jpfairbanks @juliohm