hemberg-lab / SC3

A tool for the unsupervised clustering of cells from single cell RNA-Seq experiments
http://bioconductor.org/packages/SC3
GNU General Public License v3.0

SC3 error on linux server: Not compatible with requested type #86

Closed LuyiTian closed 5 years ago

LuyiTian commented 5 years ago

I get an error when running SC3 on a Linux server. The error message is:

Error in cor(data, method = "pearson") : supply both 'x' and 'y' or a matrix-like 'x'
Error in ED2(data) : Not compatible with requested type: [type=closure; target=double].
Error in cor(data, method = "spearman") : supply both 'x' and 'y' or a matrix-like 'x'
Error in checkForRemoteErrors(val) : 3 nodes produced errors; first error: Error in ED2(data) : Not compatible with requested type: [type=closure; target=double].

which looks similar to some previous issues:

https://github.com/hemberg-lab/SC3/issues/53
https://github.com/hemberg-lab/SC3/issues/74

but I tried all the suggested solutions and nothing works. The matrix in the SCE is not sparse.
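
For what it's worth, the `cor()` message can be reproduced in base R without SC3. The sketch below assumes only that a non-matrix object, here a function (type "closure"), reached `cor()`. The name `not_a_matrix` is a made-up stand-in, and this says nothing about where the wrong object comes from inside SC3.

```r
# Hypothetical stand-in for whatever object reached the workers: a function,
# not a numeric matrix.
not_a_matrix <- function() NULL

# R reports the type of a function as "closure", matching the ED2 message.
obj_type <- typeof(not_a_matrix)

# cor() rejects a lone 'x' that is not matrix-like; capture the message
# instead of stopping.
msg <- tryCatch(
  cor(not_a_matrix, method = "pearson"),
  error = function(e) conditionMessage(e)
)

print(obj_type)
print(msg)
```

The `type=closure; target=double` part of the ED2 error appears to say the same thing: a function arrived where a numeric matrix was expected.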

> traceback()
12: stop(count, " nodes produced errors; first error: ", firstmsg, 
        domain = NA)
11: checkForRemoteErrors(val)
10: dynamicClusterApply(cl, fun, length(x), argfun)
9: clusterApplyLB(cl, argsList, evalWrapper)
8: e$fun(obj, substitute(ex), parent.frame(), e$data)
7: list(args = distances(.doRNG.stream = list(c(407L, 1406993319L, 
   1691510716L, -746742387L, 1512329962L, 1513506659L, 1182413128L
   ), c(407L, 290479171L, 888000387L, -1619061888L, 1302775714L, 
   -180470159L, 2100026249L), c(407L, -1905729890L, -453851489L, 
   -2141928855L, 617900772L, 602329473L, 1060894696L))), argnames = c("i", 
   ".doRNG.stream"), evalenv = <environment>, specified = character(0), 
       combineInfo = list(fun = function (a, ...) 
       c(a, list(...)), in.order = TRUE, has.init = TRUE, init = list(), 
           final = NULL, multi.combine = TRUE, max.combine = 100), 
       errorHandling = "stop", packages = "doRNG", export = NULL, 
       noexport = NULL, options = list(), verbose = FALSE) %dopar% 
       {
           {
               rngtools::RNGseed(.doRNG.stream)
           }
           {
               try({
                   calculate_distance(dataset, i)
               })
           }
       }
6: do.call("%dopar%", list(obj, ex), envir = parent.frame())
5: foreach::foreach(i = distances) %dorng% {
       try({
           calculate_distance(dataset, i)
       })
   }
4: sc3_calc_dists(object)
3: sc3_calc_dists(object)
2: sc3(sce, ks = 5)
1: sc3(sce, ks = 5)

and the sessioninfo is:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.4 (Final)

Matrix products: default
BLAS: /wehisan/general/system/bioinf-software/bioinfsoftware/R/R-3.5.1/lib64/R/lib/libRblas.so
LAPACK: /wehisan/general/system/bioinf-software/bioinfsoftware/R/R-3.5.1/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SC3_1.10.0                  pheatmap_1.0.10             tidyr_0.8.1                 dplyr_0.7.8                 cluster_2.0.7-1            
 [6] mclust_5.4.2                WGCNA_1.66                  fastcluster_1.1.25          dynamicTreeCut_1.63-1       flashClust_1.01-2          
[11] RCA_1.0                     RaceID_0.1.2                Seurat_2.3.4                Matrix_1.2-14               cowplot_0.9.2              
[16] biomaRt_2.38.0              CellBench_0.1.0             tibble_1.4.2                magrittr_1.5                scater_1.8.0               
[21] ggplot2_3.1.0               scran_1.8.2                 SingleCellExperiment_1.4.1  SummarizedExperiment_1.12.0 DelayedArray_0.8.0         
[26] matrixStats_0.54.0          Biobase_2.42.0              GenomicRanges_1.34.0        GenomeInfoDb_1.18.1         IRanges_2.16.0             
[31] S4Vectors_0.20.1            BiocGenerics_0.28.0         BiocParallel_1.16.5      

It is running on my benchmark datasets, which are just ordinary CEL-seq2 and 10x datasets: https://github.com/LuyiTian/CellBench_data/tree/master/data. They are not very big.

LuyiTian commented 5 years ago

I ran the same data with the same code on my Macs and it works, but sometimes SC3 just keeps running forever without any progress in the progress bar: |============= | 10%

The dataset I am running contains only ~300 cells and I select 1000 highly variable genes, so it should not take too long.

It does not give any error message, so I have no idea how to debug it. It happens regardless of the dataset.

pati-ni commented 5 years ago

Hi @LuyiTian, can you post the sequence of commands you are running?

Going through the code, maybe you could try n_cores = 1.
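
Something like the following, as a rough sketch. `run_sc3_single_core` is a hypothetical wrapper, not part of SC3, and it assumes `sce` is a SingleCellExperiment already prepared the way SC3 expects (counts, logcounts and `rowData(sce)$feature_symbol` set). The only point is passing `n_cores = 1` to `sc3()` so that a single worker is used.

```r
# Hypothetical helper, not part of SC3: runs sc3() with a single worker.
# Assumes `sce` is a SingleCellExperiment prepared as SC3 expects.
run_sc3_single_core <- function(sce, ks) {
  SC3::sc3(sce, ks = ks, n_cores = 1)
}

# Usage with the object from the report (left commented out because it
# needs SC3 and the data to be available):
# sce <- run_sc3_single_core(sce, ks = 5)
```

If the error disappears with one core, that would point at the parallel backend rather than at the data.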

Cheers

wikiselev commented 5 years ago

Sorry, I couldn't reproduce this. I ran the following and it worked fine on my Mac:

library(SC3)
library(SingleCellExperiment)
load("9cellmix_qc.RData")
rowData(sce_9cells_qc)$feature_symbol <- rownames(sce_9cells_qc)
logcounts(sce_9cells_qc) <- log2(counts(sce_9cells_qc) + 1)
sce_9cells_qc <- sc3(sce_9cells_qc, ks = 2:4)

Here is the session info:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] shiny_1.2.0                 SingleCellExperiment_1.4.0
 [3] SummarizedExperiment_1.12.0 DelayedArray_0.8.0
 [5] BiocParallel_1.16.2         matrixStats_0.54.0
 [7] Biobase_2.42.0              GenomicRanges_1.34.0
 [9] GenomeInfoDb_1.18.1         IRanges_2.16.0
[11] S4Vectors_0.20.1            BiocGenerics_0.28.0
[13] SC3_1.10.0

loaded via a namespace (and not attached):
 [1] jsonlite_1.5           foreach_1.4.4          gtools_3.8.1
 [4] assertthat_0.2.0       doRNG_1.7.1            GenomeInfoDbData_1.2.0
 [7] robustbase_0.93-3      pillar_1.3.0           lattice_0.20-35
[10] glue_1.3.0             digest_0.6.18          promises_1.0.1
[13] RColorBrewer_1.1-2     XVector_0.22.0         colorspace_1.3-2
[16] htmltools_0.3.6        httpuv_1.4.5           Matrix_1.2-14
[19] plyr_1.8.4             pcaPP_1.9-73           WriteXLS_4.0.0
[22] pkgconfig_2.0.2        bibtex_0.4.2           pheatmap_1.0.10
[25] zlibbioc_1.28.0        purrr_0.2.5            xtable_1.8-3
[28] mvtnorm_1.0-8          scales_1.0.0           gdata_2.18.0
[31] later_0.7.5            tibble_1.4.2           pkgmaker_0.27
[34] ggplot2_3.1.0          withr_2.1.2            ROCR_1.0-7
[37] lazyeval_0.2.1         mime_0.6               magrittr_1.5
[40] crayon_1.3.4           doParallel_1.0.14      gplots_3.0.1
[43] class_7.3-14           tools_3.5.1            registry_0.5
[46] stringr_1.3.1          munsell_0.5.0          cluster_2.0.7-1
[49] rngtools_1.3.1         bindrcpp_0.2.2         compiler_3.5.1
[52] e1071_1.7-0            caTools_1.17.1.1       rlang_0.3.0.1
[55] grid_3.5.1             RCurl_1.95-4.11        iterators_1.0.10
[58] labeling_0.3           bitops_1.0-6           gtable_0.2.0
[61] codetools_0.2-15       rrcov_1.4-7            R6_2.3.0
[64] dplyr_0.7.8            bindr_0.1.1            KernSmooth_2.23-15
[67] stringi_1.2.4          Rcpp_1.0.0             DEoptimR_1.0-8
[70] tidyselect_0.2.5

Here is a screenshot of sc3_interactive(): [image: screen shot 2019-01-22 at 11 38 05]

Can you share your script, including everything you ran before SC3?

LuyiTian commented 5 years ago

I realized the crash happens either when I supply a count matrix generated by KNN smoothing or when I use multiple cores. For the KNN case, I guess it is because KNN smoothing gives some cells identical expression values, which might cause a singularity in some matrix operation.

I have no idea why multi-core does not work. I did notice that after I stopped the R console, some R processes were still running in the background, seemingly out of control.
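
One way to test the identical-values guess is to look for duplicated or constant cells before clustering. This is a minimal base R sketch on a toy matrix. `expr` and its values are made up for illustration; on real data it would be the genes x cells matrix that gets passed to SC3.

```r
# Toy genes x cells matrix (made-up values): cell2 duplicates cell1,
# and cell4 is constant across genes.
expr <- cbind(
  cell1 = c(1, 0, 3, 2),
  cell2 = c(1, 0, 3, 2),
  cell3 = c(0, 5, 1, 4),
  cell4 = c(2, 2, 2, 2)
)

# Cells whose whole expression profile repeats an earlier cell.
# duplicated() compares rows, so transpose to put cells in rows.
dup_cells <- colnames(expr)[duplicated(t(expr))]

# Cells with zero variance across genes; correlations involving them
# are undefined.
flat_cells <- colnames(expr)[apply(expr, 2, var) == 0]

print(dup_cells)
print(flat_cells)
```

If either vector is non-empty on the smoothed matrix, dropping or jittering those cells and rerunning would show whether this is really the cause.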

wikiselev commented 5 years ago

Thanks for the update. On the server, SC3's parallelism logic may conflict with the system architecture. Sorry, I won't be able to help there.