igraph / xdata-igraph

xdata igraph, has been merged into igraph/igraph
GNU General Public License v2.0
18 stars 3 forks source link

Adjacency Spectral Embedding fails to converge #49

Closed dpmcsuss closed 10 years ago

dpmcsuss commented 10 years ago

Perhaps we just need to increase the max iteration or try random restarts.

set.seed(12346)
g<- erdos.renyi.game(100,.2)
ase<-adjacency.spectral.embedding(g,2)
# Error in .Call("R_igraph_adjacency_spectral_embedding", graph, no, weights,  : 
#   At arpack.c:944 : ARPACK error, Maximum number of iterations reached
# In addition: Warning message:
# In .Call("R_igraph_adjacency_spectral_embedding", graph, no, weights,  :
#   At arpack.c:776 :ARPACK solver failed to converge (3001 iterations, 1/2 eigenvectors converged)
clusters(g)
# $membership
#   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#  [42] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#  [83] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 
# $csize
# [1] 100
#
# $no
# [1] 1
gaborcsardi commented 10 years ago

So why don't you just increase the number of iterations?

youngser commented 10 years ago

Try this:

set.seed(12346) g<- erdos.renyi.game(100,.2) ase<-adjacency.spectral.embedding(g,2,options=list(maxiter=5000)) str(ase) List of 4 $ X : num [1:100, 1:2] -0.402 -0.338 -0.388 -0.406 -0.469 ... $ Y : NULL $ D : num [1:2] 21.28 -7.86 $ options:List of 20 ..$ bmat : chr "I" ..$ n : int 100 ..$ which : chr "LM" ..$ nev : int 2 ..$ tol : num 0 ..$ ncv : int 3 ..$ ldv : int 0 ..$ ishift : int 1 ..$ maxiter: int 5000 ..$ nb : int 1 ..$ mode : int 1 ..$ start : int 0 ..$ sigma : num 0 ..$ sigmai : num 0 ..$ info : int 0 ..$ iter : int 3605 ..$ nconv : int 2 ..$ numop : int 3607 ..$ numopb : int 0 ..$ numreo : int 2

On May 12, 2014, at 11:26 AM, Gabor Csardi notifications@github.com wrote:

So why don't you just increase the number of iterations?

— Reply to this email directly or view it on GitHub.

dpmcsuss commented 10 years ago

Perhaps we can increase the default maxiter to 5000 then? I was pretty routinely getting errors for some tests I was running.

dpmcsuss commented 10 years ago

I'm still seeing problems with this even with 5000 iterations. I understand that we can just increase maxiter but not sure why we are having this problem to begin with. Perhaps I'm wrong but I rarely saw anything like this before with failure to converge. @itmfl @youngser have you seen this?

gaborcsardi commented 10 years ago

anything like this before with failure to converge

Before what?

dpmcsuss commented 10 years ago

I guess before using iGraph, ie when I just used eigen or irlba in R. Also, I didn't see this before the changes you implemented in regards to #38.

youngser commented 10 years ago

Hmm, I've never seen it before other than your example...

dpmcsuss commented 10 years ago

When did you last update your igraph? I am using igraph_0.7.999-329

I just found another example this morning.

youngser commented 10 years ago

I'm still using 0.7.999-320. My above example still works. Let me try 329 and see if it works.

gaborcsardi commented 10 years ago

Before we always did the SVD, even if the graph was undirected, and maybe that stabilizes things. It also seems that lm is less stable than la, but this might not be true in general. We can also try tuning the parameters, e.g. increasing ncv usually helps.

But maybe the irlba method is just better than ARPACK, on average.

dpmcsuss commented 10 years ago

I am fine with going back to SVD. It is quite easy to compute the eigendecomposition from the SVD by checking whether the sign on the left and right singular vectors are equal.

gaborcsardi commented 10 years ago

Let's just increase ncv first. If this does not solve the problems, then let me know and we can go back to SVD.

youngser commented 10 years ago

Here is what I'm getting:

> require(igraph)
> igraph.version()
[1] "0.8.0-pre+329.25d2917"
> set.seed(12346)
> g<- erdos.renyi.game(100,.2)
> ase <- adjacency.spectral.embedding(g,2)
> str(ase)
List of 4
 $ X      : num [1:100, 1:2] -0.402 -0.338 -0.388 -0.406 -0.469 ...
 $ Y      : NULL
 $ D      : num [1:2] 21.28 -7.86
 $ options:List of 20
  ..$ bmat   : chr "I"
  ..$ n      : int 100
  ..$ which  : chr "LM"
  ..$ nev    : int 2
  ..$ tol    : num 0
  ..$ ncv    : int 3
  ..$ ldv    : int 0
  ..$ ishift : int 1
  ..$ maxiter: int 5000
  ..$ nb     : int 1
  ..$ mode   : int 1
  ..$ start  : int 0
  ..$ sigma  : num 0
  ..$ sigmai : num 0
  ..$ info   : int 0
  ..$ iter   : int 3605
  ..$ nconv  : int 2
  ..$ numop  : int 3607
  ..$ numopb : int 0
  ..$ numreo : int 2
dpmcsuss commented 10 years ago

Try

library(igraph)
set.seed(12345)
# SBM parameters
pref.matrix <- diag(0.2,2)+0.2
block.sizes <-c(100,100)
n <- sum(block.sizes)
# Generate SBM Graph
g<-sbm.game(n,pref.matrix,block.sizes)
# Compute the ASE for d=2
ase.X <- adjacency.spectral.embedding(g,no=2)$X

Which gives

Error in .Call("R_igraph_adjacency_spectral_embedding", graph, no, weights,  : 
  At arpack.c:944 : ARPACK error, Maximum number of iterations reached
In addition: Warning message:
In .Call("R_igraph_adjacency_spectral_embedding", graph, no, weights,  :
  At arpack.c:776 :ARPACK solver failed to converge (5001 iterations, 1/2 eigenvectors converged)
youngser commented 10 years ago

Ha! It failed on me at the first try (with the exactly the same error), but it worked on the second try, with the same seed!! This seems to be strange...

library(igraph)
set.seed(12345)
# SBM parameters
pref.matrix <- diag(0.2,2)+0.2
block.sizes <-c(100,100)
n <- sum(block.sizes)
# Generate SBM Graph
g<-sbm.game(n,pref.matrix,block.sizes)
# Compute the ASE for d=2
ase.X <- adjacency.spectral.embedding(g,no=2)$X
str(ase.X)
# num [1:200, 1:2] 0.615 0.436 0.52 0.606 0.592 ...
gaborcsardi commented 10 years ago

Increasing ncv is fine here, it only needs a couple of iterations:

ase <- adjacency.spectral.embedding(g, no=2, options=list(ncv=5))
ase$options$iter
# [1] 17

The problematic cases are the ones that have clustered eigenvalues. That's why la works better, because there is less clustering there.

The current 5000 default iterations is too extreme btw., even 3000 is. If it does not converge in a couple of hundred iterations, then you are better off using a dense solver (if the graph is smallish) or increase ncv. I would set it to 1000, increase ncv to nev+3, and use a dense solver for all graphs with less than (say) a 1000 vertices.

@youngser: ARPACK does not use R's RNG seeds, it is Fortran code and it has its own RNG. I am not even sure that it is deterministic from a given starting point.

dpmcsuss commented 10 years ago

All those suggestions sound fine.