hemberg-lab / SC3

A tool for the unsupervised clustering of cells from single cell RNA-Seq experiments
http://bioconductor.org/packages/SC3
GNU General Public License v3.0

Sparse matrix causes errors #98

Closed RebekkaWegmann closed 4 years ago

RebekkaWegmann commented 4 years ago

Hello,

I'm trying to run SC3 on a dataset that was generated with the read10xCounts() function from DropletUtils. In this SingleCellExperiment object, counts() and logcounts() are sparse matrices (dgCMatrix class). SC3 throws some errors when I run it on this object. I think they are related to the sparse matrix, because if I convert counts() and logcounts() to a normal matrix, it runs without issues.

Here are the details:

The first error I get is during the gene filtering step:

sce = sc3(sce, ks = 5:8, biology = TRUE, n_cores = 8, gene_filter = T)
> traceback()
4: stop("'x' must be an array of at least two dimensions")
3: rowSums(counts(object) == 0)
2: sc3_prepare(sce_info)
1: sc3_prepare(sce_info)

If I skip this step by setting gene_filter=F, I instead get a different error:

sce= sc3(sce, ks = 5:8, biology = TRUE, n_cores = 8, gene_filter = F)
traceback()
12: stop(count, " nodes produced errors; first error: ", firstmsg, 
        domain = NA)
11: checkForRemoteErrors(val)
10: dynamicClusterApply(cl, fun, length(x), argfun)
9: clusterApplyLB(cl, argsList, evalWrapper)
8: e$fun(obj, substitute(ex), parent.frame(), e$data)
7: list(args = distances(.doRNG.stream = list(c(10407L, -1970029627L, 
   882597314L, 1643048539L, 2093645280L, -1268044703L, 393508526L
   ), c(10407L, 1801228726L, -149073843L, 297875283L, -1442366624L, 
   1968374390L, -786929706L), c(10407L, 1825261028L, -1372265977L, 
   -256311815L, -1951992180L, 1942083457L, -895959705L))), argnames = c("i", 
   ".doRNG.stream"), evalenv = <environment>, specified = character(0), 
       combineInfo = list(fun = function (a, ...) 
       c(a, list(...)), in.order = TRUE, has.init = TRUE, init = list(), 
           final = NULL, multi.combine = TRUE, max.combine = 100), 
       errorHandling = "stop", packages = "doRNG", export = NULL, 
       noexport = NULL, options = list(), verbose = FALSE) %dopar% 
       {
           {
               rngtools::RNGseed(.doRNG.stream)
           }
           {
               try({
                   calculate_distance(dataset, i)
               })
           }
       }
6: do.call("%dopar%", list(obj, ex), envir = parent.frame())
5: foreach::foreach(i = distances) %dorng% {
       try({
           calculate_distance(dataset, i)
       })
   }
4: sc3_calc_dists(object)
3: sc3_calc_dists(object)
2: sc3(sce_info, ks = 5:8, biology = TRUE, n_cores = 8, gene_filter = F)
1: sc3(sce_info, ks = 5:8, biology = TRUE, n_cores = 8, gene_filter = F)

Session info

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SC3_1.15.1                  RColorBrewer_1.1-2          scran_1.14.5               
 [4] scater_1.14.0               SingleCellExperiment_1.8.0  SummarizedExperiment_1.16.1
 [7] DelayedArray_0.12.2         BiocParallel_1.20.1         matrixStats_0.55.0         
[10] Biobase_2.46.0              GenomicRanges_1.38.0        GenomeInfoDb_1.22.0        
[13] IRanges_2.20.1              S4Vectors_0.24.1            BiocGenerics_0.32.0        
[16] data.table_1.12.8           ggplot2_3.2.1               Rtsne_0.15                 

loaded via a namespace (and not attached):
  [1] ggbeeswarm_0.6.0         colorspace_1.4-1         ellipsis_0.3.0           class_7.3-15            
  [5] rprojroot_1.3-2          XVector_0.26.0           BiocNeighbors_1.4.1      fs_1.3.1                
  [9] rstudioapi_0.10          remotes_2.1.0            mvtnorm_1.0-12           fansi_0.4.1             
 [13] codetools_0.2-16         doParallel_1.0.15        robustbase_0.93-5        knitr_1.26              
 [17] pkgload_1.0.2            cluster_2.1.0            pheatmap_1.0.12          shiny_1.4.0             
 [21] rrcov_1.4-9              compiler_3.6.2           dqrng_0.2.1              backports_1.1.5         
 [25] fastmap_1.0.1            assertthat_0.2.1         Matrix_1.2-18            lazyeval_0.2.2          
 [29] limma_3.42.0             cli_2.0.1                later_1.0.0              BiocSingular_1.2.1      
 [33] htmltools_0.4.0          prettyunits_1.0.2        tools_3.6.2              rsvd_1.0.2              
 [37] igraph_1.2.4.2           gtable_0.3.0             glue_1.3.1               GenomeInfoDbData_1.2.2  
 [41] dplyr_0.8.3              doRNG_1.7.1              Rcpp_1.0.3               gdata_2.18.0            
 [45] iterators_1.0.12         DelayedMatrixStats_1.8.0 xfun_0.11                stringr_1.4.0           
 [49] ps_1.3.0                 testthat_2.3.1           mime_0.8                 lifecycle_0.1.0         
 [53] irlba_2.3.3              rngtools_1.4             gtools_3.8.1             WriteXLS_5.0.0          
 [57] devtools_2.2.1           statmod_1.4.32           DEoptimR_1.0-8           edgeR_3.28.0            
 [61] zlibbioc_1.32.0          scales_1.1.0             promises_1.1.0           yaml_2.2.0              
 [65] curl_4.3                 memoise_1.1.0            gridExtra_2.3            pkgmaker_0.27           
 [69] stringi_1.4.5            desc_1.2.0               pcaPP_1.9-73             foreach_1.4.7           
 [73] e1071_1.7-3              caTools_1.17.1.4         pkgbuild_1.0.6           bibtex_0.4.2.2          
 [77] rlang_0.4.2              pkgconfig_2.0.3          bitops_1.0-6             lattice_0.20-38         
 [81] ROCR_1.0-7               purrr_0.3.3              processx_3.4.1           tidyselect_0.2.5        
 [85] magrittr_1.5             R6_2.4.1                 gplots_3.0.1.2           pillar_1.4.3            
 [89] withr_2.1.2              RCurl_1.95-4.12          tibble_2.1.3             crayon_1.3.4            
 [93] KernSmooth_2.23-16       viridis_0.5.1            usethis_1.5.1            locfit_1.5-9.1          
 [97] grid_3.6.2               callr_3.4.0              digest_0.6.23            xtable_1.8-4            
[101] httpuv_1.5.2             munsell_0.5.0            beeswarm_0.2.3           registry_0.5-1          
[105] viridisLite_0.3.0        vipor_0.4.5              sessioninfo_1.1.1

pati-ni commented 4 years ago

Hi @RebekkaWegmann ,

SC3 in its current version does not support sparse matrices. If memory is not an issue, try converting the sparse assays to dense matrices.
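
For reference, the conversion could look roughly like the sketch below. It uses a small toy dgCMatrix as a stand-in for counts(sce) (the data and the variable names sparse_counts, dense_counts and zeros_per_gene are made up for illustration), and the last two commented lines assume a SingleCellExperiment object named sce already exists.

```r
# Minimal sketch: turn a sparse count matrix into a dense one before running SC3.
library(Matrix)

# Toy stand-in for counts(sce); hypothetical data, 3 genes x 4 cells.
sparse_counts <- Matrix(
  c(0, 3, 0, 1,
    2, 0, 0, 0,
    0, 0, 5, 4),
  nrow = 3, byrow = TRUE, sparse = TRUE
)

# Cast to an ordinary dense matrix; every zero is now stored explicitly.
dense_counts <- as.matrix(sparse_counts)

# The gene-filter expression from the first traceback works on the dense matrix.
zeros_per_gene <- rowSums(dense_counts == 0)

# On a real object (assumes SingleCellExperiment is loaded and `sce` exists):
# counts(sce)    <- as.matrix(counts(sce))
# logcounts(sce) <- as.matrix(logcounts(sce))
```

Keep in mind that a dense numeric matrix needs about 8 bytes per entry, so for example 30,000 genes by 10,000 cells comes to roughly 2.4 GB per assay.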

RebekkaWegmann commented 4 years ago

Hi @pati-ni,

Thank you for your quick reply, that works.