laplacian_matrix() needs tests

szhorvat commented 10 months ago

The implementation of laplacian_matrix() in 2.0 looks at least suspicious to me. This function does not have tests at the moment. Adding tests that check that nothing changed since 1.6.0 should be a priority.

Dense and sparse cases should both be tested

clpippel commented 9 months ago

The calculation of the normalized Laplacian matrix is not documented for directed graphs. Consider the graph 1 -> 2. The out-degree of vertex 1, 2 is zero, one, respectively. The formula in the documentation leads to division by zero.

A normalized version of the Laplacian Matrix is similar: element (i,j) is 1 if i==j, -1/sqrt(d[i] d[j]) if i!=j and there is an edge between vertices i and j and 0 otherwise.

Note that laplacian_matrix(graph(c(1,2,1,3)), normalized=TRUE) gives

[1,] 1 -0.5 -0.5
[2,] .  .    .  
[3,] .  .    .

Weights and multi-edges are not discussed.

clpippel commented 9 months ago

I have developed a few tests for thelaplacian_matrix() function. What is the best place to store the tests?

szhorvat commented 9 months ago

I'm too overwhelmed to take a deep look, but I seem to remember coming to a similar conclusion and correcting these issues in the C core. Eventually (hopefully soon) laplacian_matrix() in R should be refactored to pick up these improvements. The purpose of creating these tests is to make sure that nothing changes in the default output during this refactoring.

Here's the C documentation where I typed up how normalization works at one point:

This should explain what normalization was used for the old version (what laplacian_matrix() does at this moment in R):

https://igraph.org/c/html/latest/igraph-Structural.html#igraph_laplacian

clpippel commented 9 months ago

It would be nice to have a function B <- vertex_edge_incidence_matrix(), both for directed and undirected graphs.

Then the laplace_matrix could also be defined as L = BWB^T. Where B^T is the transpose of B, and W is the the diagonal weight matrix diag(strength()).

krlmlr commented 9 months ago

I believe this is the reason for the failures in countland. We identified this as the only package for which we know that tests are failing and where the failures aren't handled one way or another.

igraph 1.6.0

requireNamespace("Matrix")
#> Loading required namespace: Matrix
mat <- rbind(
  c(116, 210, 200),
  c(210, 386, 380),
  c(200, 380, 401)
)
A <- as(mat, "dgCMatrix")

Ai <- igraph::as.undirected(igraph::graph.adjacency(A, weighted = T))
Ai
#> IGRAPH f882c58 U-W- 3 6 -- 
#> + attr: weight (e/n)
#> + edges from f882c58:
#> [1] 1--1 1--2 2--2 1--3 2--3 3--3
L <- igraph::laplacian_matrix(Ai, normalized = T)
L
#> 3 x 3 sparse Matrix of class "dgCMatrix"
#>                                      
#> [1,]  1.0000000 -0.4269739 -0.4101324
#> [2,] -0.4269739  1.0000000 -0.6495964
#> [3,] -0.4101324 -0.6495964  1.0000000

^{Created on 2024-01-23 with reprex v2.1.0}

igraph 1.99.99

requireNamespace("Matrix")
#> Loading required namespace: Matrix
mat <- rbind(
  c(116, 210, 200),
  c(210, 386, 380),
  c(200, 380, 401)
)
A <- as(mat, "dgCMatrix")

Ai <- igraph::as.undirected(igraph::graph.adjacency(A, weighted = T))
#> Warning: `graph.adjacency()` was deprecated in igraph 2.0.0.
#> ℹ Please use `graph_from_adjacency_matrix()` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
Ai
#> IGRAPH 6fdb7b3 U-W- 3 6 -- 
#> + attr: weight (e/n)
#> + edges from 6fdb7b3:
#> [1] 1--1 1--2 2--2 1--3 2--3 3--3
L <- igraph::laplacian_matrix(Ai, normalized = T)
L
#> 3 x 3 sparse Matrix of class "dgCMatrix"
#>                                      
#> [1,]  0.7794677 -0.2930903 -0.2784214
#> [2,] -0.2930903  0.6045082 -0.3883508
#> [3,] -0.2784214 -0.3883508  0.5912334

^{Created on 2024-01-23 with reprex v2.1.0}

CC @shchurch.

krlmlr commented 9 months ago

@clpippel: Would you like to post your tests here? Happy to embed them.

krlmlr commented 9 months ago

Results are the same for the unnormalized case, though. I'll mark this a breaking change because I don't see how to make the current code compatible with the old implementation.

requireNamespace("Matrix")
#> Loading required namespace: Matrix
mat <- rbind(
  c(116, 210, 200),
  c(210, 386, 380),
  c(200, 380, 401)
)
A <- as(mat, "dgCMatrix")

Ai <- igraph::as.undirected(igraph::graph.adjacency(A, weighted = T))
#> Warning: `graph.adjacency()` was deprecated in igraph 2.0.0.
#> ℹ Please use `graph_from_adjacency_matrix()` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
Ai
#> IGRAPH dd41cdb U-W- 3 6 -- 
#> + attr: weight (e/n)
#> + edges from dd41cdb:
#> [1] 1--1 1--2 2--2 1--3 2--3 3--3
L <- igraph::laplacian_matrix(Ai, normalized = FALSE)
L
#> 3 x 3 sparse Matrix of class "dgCMatrix"
#>                    
#> [1,]  820 -420 -400
#> [2,] -420 1180 -760
#> [3,] -400 -760 1160

^{Created on 2024-01-23 with reprex v2.1.0}

clpippel commented 9 months ago

Graphs

# clp
# laplacian_matrix() needs tests, #1102
# https://github.com/igraph/rigraph/issues/1102

library(igraph)
library(Matrix)
igraph_version()                                        # W11
# [1] "1.6.0.9025"

# graphs to be tested.
# Directed, simple, unweighted, without loops  
g1  <- make_ring(3, circular=FALSE)                     # (----)

g2  <- graph(c(1,2,1,2,2,3), directed=FALSE )           # (-M--)
g2w <- graph(c(1,2,2,3), directed=FALSE )               # 
E(g2w)$weight <- c(2,1); E(g2w)$label <- E(g2w)$weight  # (-MW-)

g3 <- make_ring(3,circular=FALSE, directed=TRUE)        # (D---)

g4 <- graph(c(1,2,1,2,2,3), directed=TRUE )             # (DM--)
g4w<- g4;
E(g4w)$weight<- c(2,3, 5); E(g4w)$label<- E(g4w)$weight # (DMW-)

# An undirected loop adds 2 degree, or 1 (depending on convention).
g5 <- graph(c(1,1,1,1,1,2,1,2,2,3), directed=FALSE)     # (-M-L)

# A directed loop adds 0 degree.
g6 <- graph(c(1,1,1,1,1,2,1,2,2,3), directed=TRUE)      # (DM-L)

Tests


g <- g5
if (is_weighted(g))                                     # Adjacency matrix.
  A <- as_adjacency_matrix(g, attr="weight") else
  A <- as_adjacency_matrix(g)
D   <- Diagonal(x=strength(g, mode="out") )             # Weighted degree matrix.
L   <- laplacian_matrix(g)                              # Laplacian.
format( object.size(L), units = "MB")

if ( is_weighted(g)) W <- Diagonal(x=E(g)$weight)       # Diagonal matrix W containing the edge weights.
if (!is_weighted(g)) W <- Diagonal(gsize(g))            # Identity matrix, indexed by edge number.

# When undirected, L is symmetric.
if (!is_directed(g)) isSymmetric(L)

# Test number of components.
if (!is_connected(g) ) {
  sum(round(eigen(L)$values, 10) == 0L) == components(g, mode="weak")$no }

# Test definition of Laplacian matrix (graph without loops).
if (!any_loop(g)) {
  print(all(L == D - A))                                # L = D - A.
  print(all(rowSums(L) == 0))                           # Row sums equal 0.
}

# Validate https://igraph.org/r/doc/laplacian_matrix.html
diagL <- diag(L, nrow = length(L))                      # diagonal values of L.
all(diagL == strength(g, mode="out"))                   # Diagonal L is weighted degree g.
all(L[row(L) != col(L)] ==  -A[row(A) != col(A)])       # off-diagonal L equals minus off-diagonal A.

if (!any_loop(g)) all(rowSums(L) == 0L)                 # Test row sums == 0 when graph without loops.

Graphs g5, g6 fail.

szhorvat commented 9 months ago

I believe this is the reason for the failures in countland. We identified this as the only package for which we know that tests are failing and where the failures aren't handled one way or another.

https://github.com/igraph/rigraph/issues/1102#issuecomment-1906455815

@krlmlr This change is actually a bugfix in the handling of self-loops. When self-loops are present, the diagonal of the Laplacian $L = D - A$ will no longer contain the degrees. Therefore the normalized Laplacian will also not contains all 1s on the diagonal. Notice that 1.6 returns all 1s.

As I recall, the bugfix applies to both normalized and non-normalized versions (both were wrong IIRC).

szhorvat commented 9 months ago

Here's what I suggest we should do with igraph_laplacian():

Introduce a new parameter normalization with values "unnormalized", "symmetric", "left", "right"
Introduce a new parameter mode (usual out, in, all values) for directed graphs
For compatibility with older igraph, normalized=T should use "symmetric" when the graph is undirected or mode == "all". It should use "left" with directed graphs with mode set to out/in. There is no very good reasoning for this behaviour, so eventually normalized should be deprecated and removed, forcing users to think about what sort of normalization they want and why. The risk is that people read "normalized Laplacian" in a book and naïvely assume that this is normalized=T in igraph—but that won't always be the case.

szhorvat commented 9 months ago

Something to pay attention to is that R/igraph is still not exposing the loop-control feature of as_adjacency_matrix(), i.e. to control whether a self-loop adds 1 or 2 to the diagonal in undirected graphs. Right now it always adds 1. The implementation of laplacian_matrix() always assumes 2.

clpippel commented 9 months ago

I tried in R/igraph:

library(igraph)
g <- make_graph(c(1,1), directed = FALSE)
degree(g)
laplacian_matrix(g)

Function degree(), laplacian_matrix() assumes 2, 0 respectively. Same if g is directed.

igraph / rigraph

laplacian_matrix() needs tests #1102

igraph 1.6.0

igraph 1.99.99