A tool for the unsupervised clustering of cells from single cell RNA-Seq experiments
SC3 error on linux server: Not compatible with requested type #86

LuyiTian commented 5 years ago

I get an error when running SC3 on a Linux server. The error message is:

Error in cor(data, method = "pearson") : supply both 'x' and 'y' or a matrix-like 'x'
Error in ED2(data) : Not compatible with requested type: [type=closure; target=double].
Error in cor(data, method = "spearman") : supply both 'x' and 'y' or a matrix-like 'x'
Error in checkForRemoteErrors(val) : 3 nodes produced errors; first error: Error in ED2(data) : Not compatible with requested type: [type=closure; target=double].

This looks similar to some previous issues, but I tried all the suggested solutions and nothing worked. The matrix in the SCE is not sparse.

> traceback()
12: stop(count, " nodes produced errors; first error: ", firstmsg, 
        domain = NA)
11: checkForRemoteErrors(val)
10: dynamicClusterApply(cl, fun, length(x), argfun)
9: clusterApplyLB(cl, argsList, evalWrapper)
8: e$fun(obj, substitute(ex), parent.frame(), e$data)
7: list(args = distances(.doRNG.stream = list(c(407L, 1406993319L, 
   1691510716L, -746742387L, 1512329962L, 1513506659L, 1182413128L
   ), c(407L, 290479171L, 888000387L, -1619061888L, 1302775714L, 
   -180470159L, 2100026249L), c(407L, -1905729890L, -453851489L, 
   -2141928855L, 617900772L, 602329473L, 1060894696L))), argnames = c("i", 
   ".doRNG.stream"), evalenv = <environment>, specified = character(0), 
       combineInfo = list(fun = function (a, ...) 
       c(a, list(...)), in.order = TRUE, has.init = TRUE, init = list(), 
           final = NULL, multi.combine = TRUE, max.combine = 100), 
       errorHandling = "stop", packages = "doRNG", export = NULL, 
       noexport = NULL, options = list(), verbose = FALSE) %dopar% 
                   calculate_distance(dataset, i)
6: do.call("%dopar%", list(obj, ex), envir = parent.frame())
5: foreach::foreach(i = distances) %dorng% {
           calculate_distance(dataset, i)
4: sc3_calc_dists(object)
3: sc3_calc_dists(object)
2: sc3(sce, ks = 5)
1: sc3(sce, ks = 5)

The session info is:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.4 (Final)

Matrix products: default
BLAS: /wehisan/general/system/bioinf-software/bioinfsoftware/R/R-3.5.1/lib64/R/lib/libRblas.so
LAPACK: /wehisan/general/system/bioinf-software/bioinfsoftware/R/R-3.5.1/lib64/R/lib/libRlapack.so

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SC3_1.10.0                  pheatmap_1.0.10             tidyr_0.8.1                 dplyr_0.7.8                 cluster_2.0.7-1            
 [6] mclust_5.4.2                WGCNA_1.66                  fastcluster_1.1.25          dynamicTreeCut_1.63-1       flashClust_1.01-2          
[11] RCA_1.0                     RaceID_0.1.2                Seurat_2.3.4                Matrix_1.2-14               cowplot_0.9.2              
[16] biomaRt_2.38.0              CellBench_0.1.0             tibble_1.4.2                magrittr_1.5                scater_1.8.0               
[21] ggplot2_3.1.0               scran_1.8.2                 SingleCellExperiment_1.4.1  SummarizedExperiment_1.12.0 DelayedArray_0.8.0         
[26] matrixStats_0.54.0          Biobase_2.42.0              GenomicRanges_1.34.0        GenomeInfoDb_1.18.1         IRanges_2.16.0             
[31] S4Vectors_0.20.1            BiocGenerics_0.28.0         BiocParallel_1.16.5      

I am running it on my benchmark datasets, which are ordinary CEL-seq2 and 10x datasets: https://github.com/LuyiTian/CellBench_data/tree/master/data. They are not very big.
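For readers trying to interpret the messages above: `[type=closure; target=double]` means that a function (a closure) arrived where numeric data was expected, and `cor()` prints its "matrix-like 'x'" message whenever `x` is not a matrix and no `y` is given. The sketch below reproduces the `cor()` message in base R by passing a function on purpose. It only illustrates what the messages mean; it does not show why the object named `data` was a function on the workers, which the thread does not establish. `not_a_matrix` and `is_plain_numeric_matrix` are hypothetical names introduced for illustration.

```r
# utils::data is an ordinary R function, i.e. a closure.
not_a_matrix <- utils::data
print(typeof(not_a_matrix))

# Passing a function where cor() expects a matrix triggers the same
# "supply both 'x' and 'y' or a matrix-like 'x'" error as in the report.
msg <- tryCatch(
  cor(not_a_matrix, method = "pearson"),
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical defensive check to run on an expression matrix before
# clustering: TRUE only for an ordinary numeric matrix.
is_plain_numeric_matrix <- function(x) is.matrix(x) && is.numeric(x)
print(is_plain_numeric_matrix(not_a_matrix))
print(is_plain_numeric_matrix(matrix(rnorm(6), nrow = 2)))
```

Running a check like this on the matrix extracted from the SCE would at least rule out the input itself as the source of the type mismatch.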

LuyiTian commented 5 years ago

I ran the same data with the same code on my Macs and it works, but sometimes SC3 just keeps running forever without any movement in the progress bar: |============= | 10%

The dataset I am running contains only ~300 cells, and I select 1000 highly variable genes, so it should not take too long.

It does not give any error message, so I have no idea how to debug it. It happens regardless of the dataset.

pati-ni commented 5 years ago

Hi @LuyiTian, can you post the sequence of commands you are running?

Going through the code, maybe you could try n_cores = 1.
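As a rough sketch of that suggestion, the call could be wrapped as below. `run_sc3_single_core` is a hypothetical helper name, not part of SC3, and `sce` is assumed to be a SingleCellExperiment that already has `logcounts` and a `feature_symbol` column in its `rowData`. Whether a single worker avoids the problem on this particular server is not confirmed anywhere in this thread.

```r
# Hypothetical helper (not part of SC3): run sc3() with one worker only.
run_sc3_single_core <- function(sce, ks) {
  if (!requireNamespace("SC3", quietly = TRUE)) {
    stop("The SC3 package is required for this sketch.")
  }
  # n_cores limits how many cores sc3() uses for its parallel steps.
  SC3::sc3(sce, ks = ks, n_cores = 1)
}

# Usage, assuming `sce` has been prepared beforehand:
# sce <- run_sc3_single_core(sce, ks = 5)
```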


wikiselev commented 5 years ago

Sorry, I couldn't reproduce. I've done the following and it worked ok on my Mac:

rowData(sce_9cells_qc)$feature_symbol <- rownames(sce_9cells_qc)
logcounts(sce_9cells_qc) <- log2(counts(sce_9cells_qc) + 1)
sce_9cells_qc <- sc3(sce_9cells_qc, ks = 2:4)

Here is the session info:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

Here is the screenshot of sc3_interactive(): [screenshot, 2019-01-22 at 11:38:05]

Can you share your script, including everything you ran before SC3?

LuyiTian commented 5 years ago

I realized the crash happens either when I supply a count matrix generated by KNN smoothing or when I use multiple cores. For the KNN case, I guess it is because KNN smoothing gives some cells identical expression values, which might cause a singularity in some matrix operation.
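A quick way to test that guess is to look for identical or constant cells before clustering. The sketch below uses a small made-up matrix (genes in rows, cells in columns) rather than real KNN-smoothed data. It shows that a cell with constant values yields `NA` correlations and that two identical cells correlate at exactly 1. Whether either condition is what makes SC3 fail is an assumption, not something confirmed here.

```r
# Toy expression matrix: cell4 duplicates cell1, cell3 is constant.
expr <- cbind(
  cell1 = c(1, 0, 3, 2, 5),
  cell2 = c(0, 2, 1, 4, 1),
  cell3 = c(2, 2, 2, 2, 2),
  cell4 = c(1, 0, 3, 2, 5)
)
rownames(expr) <- paste0("gene", 1:5)

# Cells whose full expression profile repeats an earlier cell.
dup_cells <- colnames(expr)[duplicated(t(expr))]

# Cells with zero variance across genes.
cell_sd <- apply(expr, 2, sd)
const_cells <- names(cell_sd)[cell_sd == 0]

# Correlations involving a constant cell are NA (cor() warns about zero sd).
cell_cor <- suppressWarnings(cor(expr, method = "pearson"))

print(dup_cells)
print(const_cells)
print(cell_cor)
```

If either check flags cells in the smoothed matrix, removing or jittering them before calling SC3 would be one way to test whether they are responsible.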

I have no idea why multi-core does not work. I did notice that after I stopped the R console, some R processes were still running in the background, apparently no longer under the console's control.
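Leftover R processes like these are typically worker processes from a parallel cluster that was never shut down. The sketch below is not SC3's internal code. It only shows the general base R pattern of tying `stopCluster()` to `on.exit()` so that workers are stopped even if the function exits early. `run_with_cleanup` is a hypothetical helper name.

```r
library(parallel)

# Hypothetical helper: apply `fun` to `xs` on a temporary cluster and
# always stop the workers when the function exits, even on error.
run_with_cleanup <- function(xs, fun, n_workers = 2) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)
  parLapply(cl, xs, fun)
}

res <- run_with_cleanup(1:4, function(x) x^2)
print(unlist(res))
```

When a long-running call is interrupted from the console, workers that were started without such a guard can remain alive and may have to be killed manually.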

wikiselev commented 5 years ago

Thanks for the update. On the server, SC3's parallelism logic may conflict with the system architecture. Sorry, I won't be able to help there.