yixuan / RSpectra

R Interface to the Spectra Library for Large Scale Eigenvalue and SVD Problems
http://cran.r-project.org/package=RSpectra

Error: TridiagEigen: failed to compute all the eigenvalues #1

Closed pcarbo closed 6 years ago

pcarbo commented 8 years ago

I get the above message when running

out <- svds(X,k = 2)

on a very large matrix X. It is hard to tell from that message what the problem is, because I do not get any more details. What are the possible issues? Should I try increasing the number of Lanczos basis vectors, the convergence tolerance and/or the maximum number of iterations?

Thanks, Peter
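For readers with the same question: in `svds`, these three settings are passed through the `opts` list as `ncv`, `tol` and `maxitr`. A minimal sketch follows. The random matrix is only a stand-in for the real `X` (which is not available), and the particular values are illustrative assumptions, not recommendations.

```r
library(RSpectra)

set.seed(1)
# Hypothetical stand-in for the real X, which is not available
X <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

out <- svds(
  X, k = 2,
  opts = list(
    ncv    = 20,    # number of Lanczos basis vectors (must be larger than k)
    tol    = 1e-8,  # convergence tolerance
    maxitr = 2000   # maximum number of iterations
  )
)
print(out$d)
```

On a small dense matrix like this one, the result can be compared against `svd(X)$d[1:2]` from base R as a sanity check.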

yixuan commented 8 years ago

Hi Peter, I noticed that you closed this issue. Does it mean you've already solved it, or you just hit the wrong button by accident?

pcarbo commented 8 years ago

Hi Yixuan,

Thanks for the email.

No, it isn't solved---but I figured I should experiment with svds more before I report an issue. I've had a lot of success with eigs (for a different problem), so I don't understand why I'm having problems with svds. The error svds gave was frustrating because it wasn't very specific. If you have any suggestions or ideas as to what could be the problem, please let me know. In any case, I will continue to play around with it.

Peter


yixuan commented 8 years ago

Usually this error occurs when the matrix contains many repeated eigenvalues. Is that the case for your matrix?

pcarbo commented 8 years ago

Yes, I suppose it is possible. I will look into it. Thank you for the suggestion.

pcarbo commented 8 years ago

Yixuan, I thought I'd bring up this issue again---maybe the solution is to provide a more informative error message (e.g., "Your matrix may not be positive definite, or have repeated eigenvalues")?

yixuan commented 8 years ago

You are right. The error message you saw came from the C++ code. I need to consider how to generate better R error messages.

BTW do you have an example that triggers this error?
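Until the package itself reports more detail, one possible user-side workaround is to catch the error in R and re-raise it with a hint attached. The sketch below is only an illustration: `with_hint` is a hypothetical helper, not part of RSpectra, and the hint text simply repeats the suggestion made earlier in this thread.

```r
library(RSpectra)

# Hypothetical helper (not part of RSpectra): evaluates `expr` and, if it
# fails, re-raises the error with an extra hint appended to the message.
with_hint <- function(expr) {
  tryCatch(
    expr,
    error = function(e) {
      stop(
        conditionMessage(e),
        "\nHint: this error is often reported for matrices with many ",
        "repeated eigenvalues.",
        call. = FALSE
      )
    }
  )
}

set.seed(1)
# Stand-in matrix; on well-behaved input the result is passed through as is
X <- matrix(rnorm(100 * 20), nrow = 100)
res <- with_hint(svds(X, k = 2))
print(res$d)
```

Because R evaluates arguments lazily, the call to `svds` runs inside `tryCatch`, so any error it raises reaches the handler.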

pcarbo commented 8 years ago

Unfortunately, I can't share the matrix with you due to restricted sharing. I'm trying to reproduce the error with another matrix, but I'm having trouble. Sorry!

chpmoreno commented 7 years ago

I have the same problem with a huge matrix I am using. The matrix is a sparse matrix (dgCMatrix). I think the problem occurs because the matrix is so large. I have tried to decompose the matrix with all the options R offers, but so far it has been impossible (I used RSparce, irlba and sparsesvd). I am trying to translate the procedure proposed for Python in http://www.benfrederickson.com/matrix-factorization/. I have run the code in Python and it works. This is my R code (you can find the data at http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-360K.html):

# packages
library(readr)
library(dplyr)
library(gdata)
library(Matrix)
library(sparsesvd)
library(irlba)
library(RSpectra)

# read file
data <- read_tsv(file = "lastfm-dataset-360K/usersha1-artmbid-artname-plays.tsv", col_names = FALSE)
data <- data[, c(1, 3, 4)]
colnames(data) <- c("user", "artist", "plays")

artist_levels <- dplyr::as.tbl(dplyr::data_frame("artist" = names(gdata::mapLevels(data$artist)),
                                                 "code_artist" = gdata::mapLevels(data$artist)))

user_levels <- dplyr::as.tbl(dplyr::data_frame("user" = names(gdata::mapLevels(data$user)),
                                               "code_user" = gdata::mapLevels(data$user)))

data <- data %>%
  dplyr::left_join(artist_levels) %>%
  dplyr::left_join(user_levels)

data$code_artist <- as.integer(data$code_artist)
data$code_user <- as.integer(data$code_user)

plays <- Matrix::sparseMatrix(x = data$plays, i = data$code_artist, j = data$code_user)
plays_info <- summary(plays)

bm25_weight <- function(X, X_info, K1 = 100, B = 0.8) {
  N <- dim(X)[1]
  idf <- log(N / (1 + table(X_info$j)))

  row_sums <- Matrix::rowSums(X)
  average_length <- mean(row_sums)
  length_norm <- (1.0 - B) + B * row_sums / average_length

  X@x <- as.numeric(X@x * (K1 + 1.0) / (K1 * length_norm[X_info$i] + X@x) * idf[X_info$j])

  return(X)
}

svd_plays <- sparsesvd::sparsesvd(bm25_weight(plays, plays_info), 50)

svd_plays <- irlba::irlba(bm25_weight(plays, plays_info), 50)

svd_plays <- RSpectra::svds(bm25_weight(plays, plays_info), 50)

svd_plays <- incore_stoch_svd(bm25_weight(plays, plays_info), 50)

privefl commented 7 years ago

From my experience using RSpectra::svds, I have encountered this issue several times. Each time it was

Can you confirm that you are not in one of these two cases?

Edit: some rows are flawed so that they are not well read by read_tsv, which warns you about it. This may induce missing values later on. Skipping them using

ind <- unique(problems(data)$row)
data <- data[-ind, c(1, 3, 4)]

at the beginning seems to solve the problem in the SVD computation.
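A quick way to rule out the missing-value case before calling `svds` is to test the stored entries of the matrix directly. This is a minimal sketch: `all_finite` is a hypothetical helper, not a function from RSpectra or Matrix, and the two small matrices are made up for illustration.

```r
library(Matrix)

# Hypothetical helper (not part of RSpectra or Matrix): TRUE when X
# contains no NA, NaN, or infinite entries.
all_finite <- function(X) {
  # For numeric sparse matrices only the stored entries need checking;
  # the implicit zeros are always finite.
  vals <- if (is(X, "dsparseMatrix")) X@x else as.vector(X)
  all(is.finite(vals))
}

good <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(1, 2, 3))
bad  <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(1, NA, 3))

all_finite(good)  # TRUE
all_finite(bad)   # FALSE
```

For the code above, the check would be run on the output of `bm25_weight(plays, plays_info)`, since that is the matrix actually passed to `svds`.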

petrelharp commented 7 years ago

Here is a small-ish example of this or a related problem without missing values or low-variation columns or repeated eigenvalues:

X <- structure(c(0.26, -0.16, -0.16, -0.16, -0.16, 0.25, 0.14, -0.16, 
                0.26, 0.26, 0.26, 0.25, -0.16, -0.16, -0.16, -0.16, -0.16, -0.16, 
                0.26, -0.16, -0.16, 0.1, 0.1, 0.1, 0.1, -0.16, -0.11, 0.1, -0.16, 
                -0.16, -0.16, -0.16, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 0.1, 
                -0.16, 0.1, 0.1, 0.1, 0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, 
                -0.16, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 0.1, -0.16, 0.1, 
                0.1, 0.1, 0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 
                0.1, 0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 0.1, -0.16, 0.1, 0.1, 0.1, 
                0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 0.1, 0.1, 
                0.1, 0.1, 0.1, 0.1, -0.16, 0.1, 0.25, -0.16, -0.16, -0.16, -0.16, 
                0.32, 0.12, -0.16, 0.25, 0.25, 0.25, 0.23, -0.16, -0.16, -0.16, 
                -0.16, -0.16, -0.16, 0.25, -0.16, 0.14, -0.11, -0.11, -0.11, 
                -0.11, 0.12, 0.37, -0.11, 0.14, 0.14, 0.14, 0.12, -0.11, -0.11, 
                -0.11, -0.11, -0.11, -0.11, 0.14, -0.11, -0.16, 0.1, 0.1, 0.1, 
                0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 0.1, 0.1, 
                0.1, 0.1, 0.1, 0.1, -0.16, 0.1, 0.26, -0.16, -0.16, -0.16, -0.16, 
                0.25, 0.14, -0.16, 0.26, 0.26, 0.26, 0.25, -0.16, -0.16, -0.16, 
                -0.16, -0.16, -0.16, 0.26, -0.16, 0.26, -0.16, -0.16, -0.16, 
                -0.16, 0.25, 0.14, -0.16, 0.26, 0.26, 0.26, 0.25, -0.16, -0.16, 
                -0.16, -0.16, -0.16, -0.16, 0.26, -0.16, 0.26, -0.16, -0.16, 
                -0.16, -0.16, 0.25, 0.14, -0.16, 0.26, 0.26, 0.26, 0.25, -0.16, 
                -0.16, -0.16, -0.16, -0.16, -0.16, 0.26, -0.16, 0.25, -0.16, 
                -0.16, -0.16, -0.16, 0.23, 0.12, -0.16, 0.25, 0.25, 0.25, 0.32, 
                -0.16, -0.16, -0.16, -0.16, -0.16, -0.16, 0.25, -0.16, -0.16, 
                0.1, 0.1, 0.1, 0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 
                0.1, 0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 0.1, -0.16, 0.1, 0.1, 0.1, 
                0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 0.1, 0.1, 
                0.1, 0.1, 0.1, 0.1, -0.16, 0.1, -0.16, 0.1, 0.1, 0.1, 0.1, -0.16, 
                -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 0.1, 0.1, 0.1, 0.1, 0.1, 
                0.1, -0.16, 0.1, -0.16, 0.1, 0.1, 0.1, 0.1, -0.16, -0.11, 0.1, 
                -0.16, -0.16, -0.16, -0.16, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 
                0.1, -0.16, 0.1, 0.1, 0.1, 0.1, -0.16, -0.11, 0.1, -0.16, -0.16, 
                -0.16, -0.16, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 0.1, -0.16, 
                0.1, 0.1, 0.1, 0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 
                0.1, 0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 0.1, 0.26, -0.16, -0.16, 
                -0.16, -0.16, 0.25, 0.14, -0.16, 0.26, 0.26, 0.26, 0.25, -0.16, 
                -0.16, -0.16, -0.16, -0.16, -0.16, 0.26, -0.16, -0.16, 0.1, 0.1, 
                0.1, 0.1, -0.16, -0.11, 0.1, -0.16, -0.16, -0.16, -0.16, 0.1, 
                0.1, 0.1, 0.1, 0.1, 0.1, -0.16, 0.1), .Dim = c(20L, 20L))

eigs_sym(X, k=1)

Perhaps this should be re-opened?
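For a matrix this small, the result of `eigs_sym` can be cross-checked against base R's dense `eigen`, which helps separate a problem in the input from a problem in the solver. The sketch below uses a random symmetric positive semi-definite matrix as a stand-in, so it illustrates the check and does not reproduce the failure.

```r
library(RSpectra)

set.seed(42)
# Stand-in for X: a 20 x 20 symmetric positive semi-definite matrix
M <- matrix(rnorm(20 * 20), nrow = 20)
A <- crossprod(M)

# Largest eigenvalue from the iterative solver
lanczos_top <- eigs_sym(A, k = 1)$values

# Largest eigenvalue from the dense solver (values are sorted decreasingly)
dense_top <- eigen(A, symmetric = TRUE, only.values = TRUE)$values[1]

# The two should agree up to numerical tolerance
print(all.equal(lanczos_top, dense_top))
```

The stand-in is positive semi-definite so that the eigenvalue of largest magnitude, which `eigs_sym` targets by default, is also the largest eigenvalue returned by `eigen`.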

yixuan commented 7 years ago

@petrelharp I can successfully compute the first eigenvalue using the code you provided. Can you double check the package version?

> eigs_sym(X, 1)
$values
[1] 3.094653

$vectors
            [,1]
 [1,]  0.2870287
 [2,] -0.1802804
 [3,] -0.1802804
 [4,] -0.1802804
 [5,] -0.1802804
 [6,]  0.2858052
 [7,]  0.1862562
 [8,] -0.1802804
 [9,]  0.2870287
[10,]  0.2870287
[11,]  0.2870287
[12,]  0.2858052
[13,] -0.1802804
[14,] -0.1802804
[15,] -0.1802804
[16,] -0.1802804
[17,] -0.1802804
[18,] -0.1802804
[19,]  0.2870287
[20,] -0.1802804

$nconv
[1] 1

$niter
[1] 1

$nops
[1] 20

petrelharp commented 7 years ago

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RSpectra_0.12-0

loaded via a namespace (and not attached):
[1] compiler_3.4.0  Matrix_1.2-10   Rcpp_0.12.11    grid_3.4.0      lattice_0.20-35

petrelharp commented 7 years ago

... but it works fine after

> devtools::install_github("yixuan/RSpectra")
> sessionInfo()
[1] RSpectra_0.12-2
> eigs_sym(X, k=1)
$values
[1] 3.094653

$vectors
            [,1]
 [1,]  0.2870287
 [2,] -0.1802804
...