Closed jpfairbanks closed 8 years ago
LightGraphs exports CSC matrices using adjacency_matrix()
- will that help?
That means that we can write a bridge easily. If you have a LightGraph and want to make a CombinatorialAdjacency you can just use
using LightGraphs
using GraphMatrices
n = 10
g = Graph(n)
add_edge!(g, 1, 2)
A = CombinatorialAdjacency(adjacency_matrix(g))
L = CombinatorialLaplacian(A)
But this method copies all of the data in g
into a CSC matrix. This is worth it if you are going to make a static graph and then do a lot of operations on the GraphMatrices of g
. If you are going to repeatedly modify the graph and then update the GraphMatrices, then you are going to pay for a lot of copying.
If we back the GraphMatrix with some structure that provides access to the edges, degrees, a matvec function and the following array functions:
then we can have a data type for GraphMatrices which will update when the graph changes without copying. I think we just need to write some functions on LightGraph.Graph
in order to make it work here. Then abstract out the storage of the adjacency matrix in the CombinatorialAdjacency type.
For lightgraphs, it's easy. eltype(A)
is whatever you want (you can get adjacency matrices in any type; it defaults to int). length(A)
is nv(g)
, ndims(A)
is 2
, size(A)
is (nv(g), nv(g))
, size(A,n)
== nv(g)
for all n
. Not sure what strides are :)
For 2d arrays, they are the spacing in memory between the columns. They aren't necessary for sparse Matrices to implement. Operations that use the memory layout of a matrix directly won't work but that is OK.
Can you write a matvec routine for lightgraphs? It should take an arbitrary type AbstractVector{T} and use the right element type on each edge so that the output vector is the same type as the input vector. There should also be a form *!(g, y, x) where y is the preallocated output.
The naive way is for each vertex for each neighbor y[i] += x[i]. But you should pick the way that is fastest on the lightgraphs data structure. Having edge weights will be important in the future.
On Thu, Mar 19, 2015, 10:27 AM Seth Bromberger notifications@github.com wrote:
For lightgraphs, it's easy. eltype(A) is whatever you want (you can get adjacency matrices in any type; it defaults to int). length(A) is nv(g), ndims(A) is 2, size(A) is (nv(g), nv(g)), size(A,n) == nv(g) for all n. Not sure what strides are :)
— Reply to this email directly or view it on GitHub https://github.com/jpfairbanks/GraphMatrices.jl/issues/1#issuecomment-83606088 .
Help me understand the matvec
concept a bit better, please. A vector of what, exactly?
x is a vector of any type representing one value per vertex. The operation y = A*x where A is the adjacency matrix and x is a vector whose elements are compatible with the edge weights is the action of the graph on vectors. You can think of a vector as a function from vertices to values in some type T.
For example if T is Int and g is a graph
A = CombinatorialAdjacency(g)
x=ones(Int, nv(g))
y=A*x
for i=1:nv(g)
@test y[i] == degree(g, i)
end
Spectral Graph Theory is the field of thinking about graphs in this way. The graph is a linear operator on the vector space T^nv(g). Then things like var = x'_L_x is the variance of the function x over the graph. In math T is usually one of Integer, Real or Complex. The graph can have edge weights in Integer, Real, or Complex too.
Having access to the matvec function is sufficient to do most linear algebra by applying iterative methods.
The easiest (and probably fastest) way to do this, if I understand it correctly, is just to regenerate the adjacency matrices on demand. They're very fast, even with graphs on the order of (1e6,1e7):
julia> @time a = adjacency_matrix(g);
elapsed time: 2.46080318 seconds (515 MB allocated, 39.28% gc time in 11 pauses with 1 full sweep)
julia> size(a)
(1000000,1000000)
Why is 40% of the time spent in gc? The benefit of using the graph directly to do the matvec instead of converting to SparseCSCMatrix is only going to appear when making small changes to the graph and then using matrix again. That assumes that edge insertion to a graph is faster than into a CSC or going through the CSC -> COO, append new edges, COO -> CSC cycle.
It might be the case with fast sparse matrix routines that reconstructing the adjacency matrix as a CSC is the fastest way. If so, we should just make the user reconstruct the CombinatorialAdjacency object every time the graph changes, because the usage I foresee is change graph -> do 1-100 matvecs -> change graph again. So once you modify the graph, many matvecs will be performed before modifying the graph again.
for j in 1:n_v
dsts = out_neighbors(g, j) #<-memory allocation
colpt[j+1] = colpt[j] + length(dsts)
append!(rowval,sort!(dsts))
end
These all allocate a new array for their output. Could it be done with subarrays?
neighbors(g::AbstractGraph, v::Int) = [e.dst for e in g.finclist[v]]
in_neighbors(g::AbstractGraph, v::Int) = [e.src for e in g.binclist[v]]
out_neighbors(g::AbstractGraph, v::Int) = [e.dst for e in g.finclist[v]]
If you don't want to complicate the API then at least the memory could be reused in the construction of the CSC by using out_edges and manually reusing the memory.
I wonder if we can simply translate finclist/binclist into an adjacency matrix very quickly. I'll try this.
I'd like to revisit this if you're still maintaining the package. I think having this functionality available to LightGraphs would be very cool.
Yes I am maintaining this. I haven't been using it this summer but will be soon. What do you need? On Jul 28, 2015 6:01 PM, "Seth Bromberger" notifications@github.com wrote:
I'd like to revisit this if you're still maintaining the package. I think having this functionality available to LightGraphs would be very cool.
— Reply to this email directly or view it on GitHub https://github.com/jpfairbanks/GraphMatrices.jl/issues/1#issuecomment-125766183 .
I'd like to create / close out https://github.com/JuliaGraphs/LightGraphs.jl/issues/23 :)
I think we just need to make sure that g_v is defined for graph_vector in an efficient way and then change some type constraints in this package and it all will work. Of course adding tests.
I forgot to quote the '*' operator
What should g*v
produce?
A vector y such that y[i]=sum(v[out_neighbors(g, i)])
for all i .
One way to represent it is
for (i,j) in edges(g)
y[i] += v[j]
end
What's the correct behavior for undirected graphs? Should we count both directions, or just one?
This function counts both directions (and therefore the sum(bar(g,ones(Int,nv(g))))
equals 2*ne(g)
):
function bar{T<:Real}(g::SimpleGraph, v::Vector{T})
y = zeros(T, nv(g))
for i in vertices(g)
y[i] = sum(v[out_neighbors(g, i)])
end
return y
end
but the sum of the vector returned by your function above yields ne(g)
.
Your function is obviously faster, so if it's correct, awesome.
(ping?)
yeah it should count both directions like you have. did you test them to see which is faster? I haven't tried either.
Yours is faster. We can modify it: I think
for (i,j) in edges(g)
y[i] += v[j]
y[j] += v[i]
end
for Graphs
would work, and we can specialize on DiGraphs
without the second accumulator.
Confirmed that this:
function foo{T<:Real}(g::Graph, v::Vector{T})
y = zeros(T, nv(g))
for (i,j) in edges(g)
y[i] += v[j]
y[j] += v[i]
end
return y
end
works and produces the same results as bar(g,v)
above, while
function foo{T<:Real}(g::DiGraph, v::Vector{T})
y = zeros(T, nv(g))
for (i,j) in edges(g)
y[i] += v[j]
end
return y
end
does the same for directed graphs.
So, what do I call this function, and where do we put it - in GraphMatrices, or in LightGraphs?
You call it *
and there are two options
using LightGraphs
using GraphMatrices
g = graph(nv)
# initialize graph
A = CombinatorialAdjacency(g)
L = CombinatorialLaplacian(A)
evals = eigs(L)
You can use L like a matrix in any iterative method.
I think it should go in LightGraphs. GraphMatrices should be completely independent of the storage format underlying the graph.
The code above should work (you will need to actually do the initialization). If it doesn't then it is probably because the other functions necessary (size, eltype,...) aren't defined for the graph type. I am back on campus next week so I will be back to working with graphs.
OK, I've created a matvec
branch in LightGraphs with the necessary code. Unfortunately, I have no way to test that it actually produces correct results with GraphMatrices. Could you test (and write some tests)? Thanks.
Also, I error out if the length of the vector does not equal the number of vertices. Should we error if length < nv
and merely warn if it's >
?
Also also: should the *
be commutative? (That is, should I define *(v::Vector, g::SimpleGraph)
as well?)
julia> g = PathGraph(10)
{10, 9} undirected graph
julia> A = CombinatorialAdjacency(g)
ERROR: MethodError: `convert` has no method matching convert(::Type{GraphMatrices.CombinatorialAdjacency{T}}, ::LightGraphs.Graph)
This may have arisen from a call to the constructor GraphMatrices.CombinatorialAdjacency{T}(...),
since type constructors fall back to convert methods.
julia> A = CombinatorialAdjacency(adjacency_matrix(g))
GraphMatrices.CombinatorialAdjacency{Int64}(10x10 sparse matrix with 18 Int64 entries:
[2 , 1] = 1
[1 , 2] = 1
[3 , 2] = 1
[2 , 3] = 1
[4 , 3] = 1
[3 , 4] = 1
[5 , 4] = 1
[4 , 5] = 1
[6 , 5] = 1
[5 , 6] = 1
[7 , 6] = 1
[6 , 7] = 1
[8 , 7] = 1
[7 , 8] = 1
[9 , 8] = 1
[8 , 9] = 1
[10, 9] = 1
[9 , 10] = 1,[1,2,2,2,2,2,2,2,2,1])
I am looking into it. I'll make a PR I think we have a decision to make.
Can we close this until somebody else writes a graphs library and wants to integrate?
Fine by me. It's working :)
This works on LightGraphs, open a new issue if you want integration with a different graph library.
Currently this only uses CSC matrices and has only been tested on that. But you could have the adjacency matrix stored in a LightGraphs.jl or Graphs.jl Graph data structure.