Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd'

fataltes commented 2 months ago

Hi,

Running rgcca_cv with the following options:

rgcca_cv(blocks, response = 4, method = "rgcca",
                   par_type = "ncomp",
                   par_value = grid_search_matrix,
                   init = "rand",
                   tau = 0.5,
                   prediction_model = "glmnet",
                   metric = "F1",
                   k=3, n_run = 5,
                   verbose = TRUE, 
                   quiet = FALSE)

I get the following error:

Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd'
Traceback:

1. system.time(ncomp_cv_out <- ncomp_cv(blocks, grid_search_matrix, 
 .     method))
2. ncomp_cv(blocks, grid_search_matrix, method)
3. rgcca_cv(blocks, response = 4, method = cur_method, par_type = "ncomp", 
 .     par_value = grid_search_matrix, init = "rand", tau = 0.5, 
 .     prediction_model = "glmnet", metric = "F1", k = 3, n_run = 5, 
 .     verbose = TRUE, quiet = FALSE)   # at line 4-14 of file <text>
4. par_pblapply(idx, function(n) {
 .     i <- (n - 1)%/%length(v_inds) + 1
 .     j <- (n - 1)%%length(v_inds) + 1
 .     rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, 
 .         par_type = param$par_type, par_value = param$par_value[i, 
 .             ], prediction_model = model$prediction_model, ...)
 . }, n_cores = n_cores, verbose = verbose)
5. pbapply::pblapply(X, FUN, ..., cl = cl)
6. lapply(X, FUN, ...)
7. FUN(X[[i]], ...)
8. rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, par_type = param$par_type, 
 .     par_value = param$par_value[i, ], prediction_model = model$prediction_model, 
 .     ...)
9. do.call(rgcca, rgcca_args) ...

Data set is of size 927 * 986 (so number_features > number_samples)
No perfect or high colinearity in the data because I have removed highly colinear features by removing one of any pair of features having correlation >= 0.95
Since the error seems to come from SVD, I have tried replacing any arguments that thought would involve calling SVD, therefore replacing init="svd" with init="rand" and replacing prediction_model = "lda" with prediction_model = "glmnet" And I still get the error.
I was not getting any such error using previous versions of RGCCA until I updated to the most recent version (December 2023 release) or even running rgcca function itself in current release (with one example of nComp and tau), but get the error on the rgcca_cv.

What do you think can be causing this and if you have any suggestions to try?

Thank you, Fatemeh

GFabien commented 2 months ago

Hey @fataltes!

Thank you for raising this issue. This problem is related to issues with the R svd function, but some people managed to find a fix in other packages relying on the SVD. I tried replicating their fix in https://github.com/rgcca-factory/RGCCA/tree/fix_lapack_error_with_svd. Can you try out the branch to see if it solves your problem?

Best, Fabien

fataltes commented 2 months ago

Thanks for your quick response @GFabien,

I switched to that branch and now the error changed to a NAN case error. Here is the traceback:

Error in if (any(x$a != 0)) {: missing value where TRUE/FALSE needed
Traceback:

1. system.time(ncomp_cv_out <- ncomp_cv(blocks, grid_search_matrix, 
 .     method))
2. ncomp_cv(blocks, grid_search_matrix, method)
3. rgcca_cv(blocks, response = 4, method = cur_method, par_type = "ncomp", 
 .     par_value = grid_search_matrix, init = "rand", tau = 0.5, 
 .     prediction_model = "glmnet", metric = "F1", k = 3, n_run = 5, 
 .     verbose = TRUE, quiet = FALSE)   # at line 4-14 of file <text>
4. par_pblapply(idx, function(n) {
 .     i <- (n - 1)%/%length(v_inds) + 1
 .     j <- (n - 1)%%length(v_inds) + 1
 .     rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, 
 .         par_type = param$par_type, par_value = param$par_value[i, 
 .             ], prediction_model = model$prediction_model, ...)
 . }, n_cores = n_cores, verbose = verbose)
5. pbapply::pblapply(X, FUN, ..., cl = cl)
6. lapply(X, FUN, ...)
7. FUN(X[[i]], ...)
8. rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, par_type = param$par_type, 
 .     par_value = param$par_value[i, ], prediction_model = model$prediction_model, 
 .     ...)
9. do.call(rgcca, rgcca_args)
10. (function (blocks, connection = NULL, tau = 1, ncomp = 1, scheme = "factorial", 
  .     scale = TRUE, init = "svd", bias = TRUE, tol = 1e-08, verbose = FALSE, 
  .     scale_block = "inertia", method = "rgcca", sparsity = 1, 
  .     response = NULL, superblock = FALSE, NA_method = "na.ignore", 
  .     quiet = TRUE, n_iter_max = 1000, comp_orth = TRUE, A = NULL, 
  .     C = NULL) 
  . {
  .     if (!missing(A)) {
  .         warning("Argument A is deprecated, use blocks instead.")
  .         blocks <- A
  .     }
  .     if (!missing(C)) {
  .         warning("Argument C is deprecated, use connection instead.")
  .         connection <- C
  .     }
  .     rgcca_args <- as.list(environment())
  .     tmp <- get_rgcca_args(blocks, rgcca_args)
  .     opt <- tmp$opt
  .     rgcca_args <- tmp$rgcca_args
  .     rgcca_args$quiet <- quiet
  .     rgcca_args$verbose <- verbose
  .     blocks <- remove_null_sd(rgcca_args$blocks)$list_m
  .     if (opt$disjunction) {
  .         blocks[[rgcca_args$response]] <- as_disjunctive(blocks[[rgcca_args$response]])
  .     }
  .     tmp <- handle_NA(blocks, NA_method = rgcca_args$NA_method)
  .     na.rm <- tmp$na.rm
  .     blocks <- scaling(tmp$blocks, scale = rgcca_args$scale, bias = rgcca_args$bias, 
  .         scale_block = rgcca_args$scale_block, na.rm = na.rm)
  .     if (rgcca_args$superblock) {
  .         blocks[["superblock"]] <- Reduce(cbind, blocks)
  .         colnames(blocks[["superblock"]]) <- paste0("s-", colnames(blocks[["superblock"]]))
  .     }
  .     gcca_args <- rgcca_args[c("connection", "ncomp", "scheme", 
  .         "init", "bias", "tol", "verbose", "superblock", "response", 
  .         "n_iter_max", "comp_orth")]
  .     gcca_args[["na.rm"]] <- na.rm
  .     gcca_args[["blocks"]] <- blocks
  .     gcca_args[["disjunction"]] <- opt$disjunction
  .     gcca_args[[opt$param]] <- rgcca_args[[opt$param]]
  .     func_out <- do.call(rgcca_outer_loop, gcca_args)
  .     func_out <- format_output(func_out, rgcca_args, opt, blocks)
  .     class(func_out) <- "rgcca"
  .     invisible(func_out)
  . })(tau = c(0.5, 0.5, 0.5, 0), tol = 1e-08, init = "random", bias = TRUE, 
  .     quiet = TRUE, scale = TRUE, ncomp = c(1, 1, 319, 319), blocks = list( 
.....

The error is apparently coming from here: https://github.com/rgcca-factory/RGCCA/blob/eefad3ba95cfbf285ceabcdd2fdd7c0329d77f7b/R/block_project.R#L28

Before the error I also get this warning:

Missing colnames are automatically labeled.

Missing rownames are automatically labeled.

Warning message in sqrt(t(x$alpha) %*% x$M %*% x$K %*% x$alpha):
“NaNs produced”

Which I think is coming from here: https://github.com/rgcca-factory/RGCCA/blob/eefad3ba95cfbf285ceabcdd2fdd7c0329d77f7b/R/block_project.R#L40

I'd appreciate if you have any suggestions or updates on the code for me to try.

Thanks, Fatemeh

GFabien commented 2 months ago

Hi Fatemeh,

What I guess is that t(x$alpha) %*% x$M %*% x$K %*% x$alpha produces a negative value so the square root creates a NaN. From a theoretical point of view, the matrix x$M %*% x$K is symmetric and positive semi-definite, so the issue should not happen. However, it might be possible that due to numerical imprecision, the computed matrix has close to zero negative eigenvalues, which should correspond to zero eigenvalues of the theoretical x$M %*% x$K matrix.

Anyway, it means that a = t(x$x) %*% alpha is close to zero so the block does not contribute much to the objective function. I would say that you can reduce the number of components extracted for this particular block to remove the error.

A fix of the code might be to check on the value t(x$alpha) %*% x$M %*% x$K %*% x$alpha to see if it is smaller or equal to zero and set alpha to zero if so. To further investigate, I would need a reproducible example, would you be able to create one?

I hope this helps!

Best, Fabien

fataltes commented 2 months ago

Hi @GFabien ,

Thanks for the response!

Unfortunately, I cannot share the data and am trying to both apply your suggestion and generate a case that reproduces the issue on some random data.

But if you can share some contact, maybe I can follow up with you on a 1:1 meeting and then you can update the repo and issue here with the solution.

Thank you, Fatemeh

fataltes commented 2 months ago

Hi @GFabien , That would be great if you can give me some point of contact to reach out for the 1:1 meeting to go over the data and any suggested modifications/reruns by you.

GFabien commented 2 months ago

Hi @fataltes, Sorry for the late reply, I sent you an email to set up a 1:1 meeting. Best, Fabien

fataltes commented 1 month ago

Hi @GFabien ,

Thanks for updating the code. It now gives me another error which indicates x$alpha has NAN itself. I don't know if it is possible or what should one do in such a case.

Thanks, Fatemeh

GFabien commented 1 month ago

Hi @fataltes,

I updated the branch again, you shouldn't have NAN problems anymore. Please tell me if there are new errors or if it is fine now.

Best, Fabien

fataltes commented 1 month ago

Hi @GFabien ,

Thank you, I tested the code. It actually runs but never finishes. On a fairly small dataset of size 1000*1050 with three imbalanced blocks, RGCCA_cv has been running for over 10 hours. I don't think this is expected. right? Could there be an infinite loop or such a thing happening?

Thank you

GFabien commented 1 month ago

Hmmm... There shouldn't be. Did you run with verbose = TRUE? If so did the progress bar stay at 0%? Is it working if you run RGCCA without cross validation but with the same parameters?

GFabien commented 1 week ago

Hi @fataltes, How are you? Do you still face the problems you reported? Best, Fabien

fataltes commented 6 days ago

Hi @GFabien , Unfortunately yes. I still face the issue and there is another problem which I'm facing: It takes a loong time to run RGCCA on a small dataset if I activate parallelization on R.

rgcca-factory / RGCCA

Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd' #82