Closed sbromberger closed 9 years ago
It looks like you are building the CSC "by hand" on lines 29 to 34 of adjacency_matrix. If SimpleGraph is being stored as a SparseMatrixCSC, why not just return that matrix to the user?
This function in https://github.com/JuliaLang/julia/blob/master/base/sparse/csparse.jl#L15 builds CSC matrices from coordinate triples. The functions are LGPL so if you don't want to LGPL LightGraphs, don't copy/look at it. Just use it.
sparse{Tv,Ti<:Integer}(I::AbstractVector{Ti},
J::AbstractVector{Ti},
V::AbstractVector{Tv},
nrow::Integer, ncol::Integer,
combine::Union{Function,Base.Func})
Since LightGraphs stores the set of edges, you can build the triples from that Set with one pass
I = []
J = []
for (u,v) in g.edges
push!(I,u)
push!(J,v)
end
A = sparse(I,J,1)
Because we are storing things in CSC we want A[:,i]
to be the out_neighbors(g,i)
and that also matches the Linalg definition so that A*e_i = sum(e_j for j in out_neighbors(g,i))
. Based on the formula for Katz centrality in the paper by @weijianzhang (I-A)x = ones(n)
he is using the definition that A[i,j] = 1 iff j -> i
If you are using CSR then everything is backwards and you define Katz Centrality as the x
which solves x(I-A) = 1
I was rolling the adjacency matrix by hand. I'm not storing SimpleGraph as a SparseMatrixCSC. I am storing SimpleSparseGraph as one, though (this is the current test).
The issue comes in with SimpleGraph not providing the same results as SimpleSparseGraph, and that's what's concerning me.
ETA: ok, I get that things should be backwards with CSC. That means that "forward adj matrix" needs to be renamed to "backwards adj matrix" and then things should just work. :)
There are two issues here: how things are stored internally, and how they're presented to the user.
As far as storage, we should be using whatever's more performant. That is, if we're using CSC sparse matrices, we should represent forward adjacencies as fm[i,j] = 1 iff j->i, and backward adjacencies as bm[i,j] =1 iff i->j.
For presentation, however, this presents a problem. We don't want to expose the internal structures since this may result in cases where we have an adjacency matrix that is column-major for one implementation and row-major for the other. Since graph theorists (unscientific polling n = 3) typically expect adjacency matrices to be A[i,j] = 1 iff i->j, I think adjacency_matrix()
should return something such that full(adjacency_matrix(g))
yields the expected matrix, and this should be completely independent of the underlying graph storage type.
I will rename references to :out
and :in
in adjacency_matrix()
to :byrow
or :bycol
. These should be used by programming types, not graph theory types. :byrow
will be default.
Any objections to this plan?
(I've implemented this in the sparsemx branch already. Closing out now - please feel free to reopen if you think it needs more discussion.)
This is a good recap of our discussion. I would just like to add that even full(adjacency_matrix(g))
is not independent of the underlying storage since access to columns of full(adjacency_matrix(g))
is faster than accessing the rows. So while correctness might be independent of storage, performance never is.
I've made some interesting progress with the new sparsematrix implementation. I've also implemented setindex!()
and getindex()
on graphs for a nice easy way to add and remove edges.
Nice, just make sure that the docs say that setindex! is inefficient compared to bulk insertions
We don't have a method yet for bulk insertions :)
cc @jpfairbanks
I noticed something unusual today:
This looks backwards. The adjacency matrix
A[i,j]
should be1
if there's an edge fromi
toj
.See https://github.com/JuliaGraphs/LightGraphs.jl/blob/master/src/spectral.jl#L9-L35 for the relevant code.
The problem is, when I change it to what I think is correct code,
katz_centrality()
breaks.