Open szhorvat opened 10 months ago
The calculation of the normalized Laplacian matrix is not documented for directed graphs. Consider the graph 1 -> 2. The out-degree of vertex 1, 2 is zero, one, respectively. The formula in the documentation leads to division by zero.
A normalized version of the Laplacian Matrix is similar: element (i,j) is 1 if i==j, -1/sqrt(d[i] d[j]) if i!=j and there is an edge between vertices i and j and 0 otherwise.
Note that
laplacian_matrix(graph(c(1,2,1,3)), normalized=TRUE)
gives
[1,] 1 -0.5 -0.5
[2,] . . .
[3,] . . .
Weights and multi-edges are not discussed.
I have developed a few tests for thelaplacian_matrix()
function. What is the best place to store the tests?
I'm too overwhelmed to take a deep look, but I seem to remember coming to a similar conclusion and correcting these issues in the C core. Eventually (hopefully soon) laplacian_matrix()
in R should be refactored to pick up these improvements. The purpose of creating these tests is to make sure that nothing changes in the default output during this refactoring.
Here's the C documentation where I typed up how normalization works at one point:
This should explain what normalization was used for the old version (what laplacian_matrix()
does at this moment in R):
It would be nice to have a function B <- vertex_edge_incidence_matrix()
, both for directed and undirected graphs.
Then the laplace_matrix could also be defined as L = BWBT. Where BT is the transpose of B, and W is the the diagonal weight matrix diag(strength())
.
I believe this is the reason for the failures in countland. We identified this as the only package for which we know that tests are failing and where the failures aren't handled one way or another.
requireNamespace("Matrix")
#> Loading required namespace: Matrix
mat <- rbind(
c(116, 210, 200),
c(210, 386, 380),
c(200, 380, 401)
)
A <- as(mat, "dgCMatrix")
Ai <- igraph::as.undirected(igraph::graph.adjacency(A, weighted = T))
Ai
#> IGRAPH f882c58 U-W- 3 6 --
#> + attr: weight (e/n)
#> + edges from f882c58:
#> [1] 1--1 1--2 2--2 1--3 2--3 3--3
L <- igraph::laplacian_matrix(Ai, normalized = T)
L
#> 3 x 3 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 1.0000000 -0.4269739 -0.4101324
#> [2,] -0.4269739 1.0000000 -0.6495964
#> [3,] -0.4101324 -0.6495964 1.0000000
Created on 2024-01-23 with reprex v2.1.0
requireNamespace("Matrix")
#> Loading required namespace: Matrix
mat <- rbind(
c(116, 210, 200),
c(210, 386, 380),
c(200, 380, 401)
)
A <- as(mat, "dgCMatrix")
Ai <- igraph::as.undirected(igraph::graph.adjacency(A, weighted = T))
#> Warning: `graph.adjacency()` was deprecated in igraph 2.0.0.
#> ℹ Please use `graph_from_adjacency_matrix()` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
Ai
#> IGRAPH 6fdb7b3 U-W- 3 6 --
#> + attr: weight (e/n)
#> + edges from 6fdb7b3:
#> [1] 1--1 1--2 2--2 1--3 2--3 3--3
L <- igraph::laplacian_matrix(Ai, normalized = T)
L
#> 3 x 3 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 0.7794677 -0.2930903 -0.2784214
#> [2,] -0.2930903 0.6045082 -0.3883508
#> [3,] -0.2784214 -0.3883508 0.5912334
Created on 2024-01-23 with reprex v2.1.0
CC @shchurch.
@clpippel: Would you like to post your tests here? Happy to embed them.
Results are the same for the unnormalized case, though. I'll mark this a breaking change because I don't see how to make the current code compatible with the old implementation.
requireNamespace("Matrix")
#> Loading required namespace: Matrix
mat <- rbind(
c(116, 210, 200),
c(210, 386, 380),
c(200, 380, 401)
)
A <- as(mat, "dgCMatrix")
Ai <- igraph::as.undirected(igraph::graph.adjacency(A, weighted = T))
#> Warning: `graph.adjacency()` was deprecated in igraph 2.0.0.
#> ℹ Please use `graph_from_adjacency_matrix()` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
Ai
#> IGRAPH dd41cdb U-W- 3 6 --
#> + attr: weight (e/n)
#> + edges from dd41cdb:
#> [1] 1--1 1--2 2--2 1--3 2--3 3--3
L <- igraph::laplacian_matrix(Ai, normalized = FALSE)
L
#> 3 x 3 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 820 -420 -400
#> [2,] -420 1180 -760
#> [3,] -400 -760 1160
Created on 2024-01-23 with reprex v2.1.0
Graphs
# clp
# laplacian_matrix() needs tests, #1102
# https://github.com/igraph/rigraph/issues/1102
library(igraph)
library(Matrix)
igraph_version() # W11
# [1] "1.6.0.9025"
# graphs to be tested.
# Directed, simple, unweighted, without loops
g1 <- make_ring(3, circular=FALSE) # (----)
g2 <- graph(c(1,2,1,2,2,3), directed=FALSE ) # (-M--)
g2w <- graph(c(1,2,2,3), directed=FALSE ) #
E(g2w)$weight <- c(2,1); E(g2w)$label <- E(g2w)$weight # (-MW-)
g3 <- make_ring(3,circular=FALSE, directed=TRUE) # (D---)
g4 <- graph(c(1,2,1,2,2,3), directed=TRUE ) # (DM--)
g4w<- g4;
E(g4w)$weight<- c(2,3, 5); E(g4w)$label<- E(g4w)$weight # (DMW-)
# An undirected loop adds 2 degree, or 1 (depending on convention).
g5 <- graph(c(1,1,1,1,1,2,1,2,2,3), directed=FALSE) # (-M-L)
# A directed loop adds 0 degree.
g6 <- graph(c(1,1,1,1,1,2,1,2,2,3), directed=TRUE) # (DM-L)
Tests
g <- g5
if (is_weighted(g)) # Adjacency matrix.
A <- as_adjacency_matrix(g, attr="weight") else
A <- as_adjacency_matrix(g)
D <- Diagonal(x=strength(g, mode="out") ) # Weighted degree matrix.
L <- laplacian_matrix(g) # Laplacian.
format( object.size(L), units = "MB")
if ( is_weighted(g)) W <- Diagonal(x=E(g)$weight) # Diagonal matrix W containing the edge weights.
if (!is_weighted(g)) W <- Diagonal(gsize(g)) # Identity matrix, indexed by edge number.
# When undirected, L is symmetric.
if (!is_directed(g)) isSymmetric(L)
# Test number of components.
if (!is_connected(g) ) {
sum(round(eigen(L)$values, 10) == 0L) == components(g, mode="weak")$no }
# Test definition of Laplacian matrix (graph without loops).
if (!any_loop(g)) {
print(all(L == D - A)) # L = D - A.
print(all(rowSums(L) == 0)) # Row sums equal 0.
}
# Validate https://igraph.org/r/doc/laplacian_matrix.html
diagL <- diag(L, nrow = length(L)) # diagonal values of L.
all(diagL == strength(g, mode="out")) # Diagonal L is weighted degree g.
all(L[row(L) != col(L)] == -A[row(A) != col(A)]) # off-diagonal L equals minus off-diagonal A.
if (!any_loop(g)) all(rowSums(L) == 0L) # Test row sums == 0 when graph without loops.
Graphs g5, g6 fail.
I believe this is the reason for the failures in countland. We identified this as the only package for which we know that tests are failing and where the failures aren't handled one way or another.
https://github.com/igraph/rigraph/issues/1102#issuecomment-1906455815
@krlmlr This change is actually a bugfix in the handling of self-loops. When self-loops are present, the diagonal of the Laplacian $L = D - A$ will no longer contain the degrees. Therefore the normalized Laplacian will also not contains all 1s on the diagonal. Notice that 1.6 returns all 1s.
As I recall, the bugfix applies to both normalized and non-normalized versions (both were wrong IIRC).
Here's what I suggest we should do with igraph_laplacian()
:
normalization
with values "unnormalized"
, "symmetric"
, "left"
, "right"
mode
(usual out, in, all values) for directed graphsnormalized=T
should use "symmetric"
when the graph is undirected or mode == "all"
. It should use "left"
with directed graphs with mode set to out/in. There is no very good reasoning for this behaviour, so eventually normalized
should be deprecated and removed, forcing users to think about what sort of normalization they want and why. The risk is that people read "normalized Laplacian" in a book and naïvely assume that this is normalized=T
in igraph—but that won't always be the case.Something to pay attention to is that R/igraph is still not exposing the loop-control feature of as_adjacency_matrix()
, i.e. to control whether a self-loop adds 1 or 2 to the diagonal in undirected graphs. Right now it always adds 1. The implementation of laplacian_matrix()
always assumes 2.
I tried in R/igraph:
library(igraph)
g <- make_graph(c(1,1), directed = FALSE)
degree(g)
laplacian_matrix(g)
Function degree(), laplacian_matrix() assumes 2, 0 respectively. Same if g is directed.
The implementation of
laplacian_matrix()
in 2.0 looks at least suspicious to me. This function does not have tests at the moment. Adding tests that check that nothing changed since 1.6.0 should be a priority.Dense and sparse cases should both be tested