hemberg-lab / SC3

A tool for the unsupervised clustering of cells from single cell RNA-Seq experiments
http://bioconductor.org/packages/SC3
GNU General Public License v3.0

sc3_run_svm has a prediction issue #110

Closed: jmuribes closed this issue 1 year ago

jmuribes commented 1 year ago

Describe the bug

I have a SingleCellExperiment object with 29,390 cells and want to perform SC3 clustering for k values from 2 to 12. The sc3() call with svm_num_cells = 5000 runs without any issue. However, when I then run the prediction step with sc3_run_svm(), I get the following error:

> sce <- sc3_run_svm(sce, ks = 2:12)
Error in UseMethod("predict") : 
  no applicable method for 'predict' applied to an object of class "logical"

I tried comparing the format of the sce object to the example from the SC3 manual and could not identify any difference.
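
The message itself can be reproduced in base R, which may help narrow down where to look. predict() is an S3 generic that dispatches on the class of its first argument, so this error appears whenever it receives a plain logical value (for example NA) where a fitted model is expected. The sketch below only illustrates that mechanism; `fitted_model` is a hypothetical stand-in, no SC3 objects are involved, and where such a value would originate inside sc3_run_svm() is not established in this thread.

```r
# Base R only; this does not call SC3 and does not diagnose the bug itself.
# `fitted_model` is a hypothetical stand-in for a slot that holds NA
# instead of a fitted classifier.
fitted_model <- NA

# predict() dispatches on class(object); for NA that class is "logical".
print(class(fitted_model))

# Capture the dispatch error as a string instead of stopping the script.
msg <- tryCatch(
    predict(fitted_model),
    error = function(e) conditionMessage(e)
)
print(msg)

# For contrast: a fitted model carries a class that has a predict method,
# so the same call dispatches normally.
fit <- lm(dist ~ speed, data = cars)
print(inherits(fit, "lm"))
print(head(predict(fit), 3))
```

If the same check is applied to a real object, inspecting the class of whatever is passed to predict() is usually the quickest way to confirm whether a model was actually stored.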

Any insight will be much appreciated!

To Reproduce

> sce <- SingleCellExperiment(
    assays = list(
        counts = as.matrix(all.data.combined@assays$RNA@data),
        logcounts = as.matrix(all.data.combined@assays$RNA@scale.data)
    ), 
    colData = all.data.combined@meta.data
)
> rowData(sce)$feature_symbol <- rownames(sce)
> sce
class: SingleCellExperiment 
dim: 27675 29390 
metadata(1): sc3
assays(2): counts logcounts
rownames(27675): slc35a5 ccdc80 ... CU367852.14 CU929094.2
rowData names(46): feature_symbol sc3_gene_filter ... sc3_11_de_padj
  sc3_12_de_padj
colnames(29390): pooled1_2_76_10__s1 pooled1_1_53_63__s1 ...
  control3_45_63_76__s2 control3_48_35_23__s2
colData names(29): orig.ident nCount_RNA ... sc3_11_log2_outlier_score
  sc3_12_log2_outlier_score
reducedDimNames(2): PCA UMAP
mainExpName: NULL
altExpNames(0):
> sce <- sc3(sce, ks = 2:12, biology = TRUE, n_cores= 10, svm_num_cells = 5000)
> sce <- sc3_run_svm(sce, ks = 2:12)
Error in UseMethod("predict") : 
  no applicable method for 'predict' applied to an object of class "logical"

Session Info:

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS/LAPACK: /share/dennislab/programs/dennis-miniconda/envs/r_scirnaseq/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] class_7.3-22                mlbench_2.1-3.1
 [3] e1071_1.7-13                clustree_0.5.0
 [5] ggraph_2.1.0                scater_1.26.1
 [7] SC3_1.26.2                  MAST_1.24.1
 [9] cluster_2.1.4               intrinsicDimension_1.2.0
[11] yaImpute_1.0-33             SingleR_2.0.0
[13] scDblFinder_1.12.0          ggridges_0.5.4
[15] gghalves_0.1.4              ggforce_0.4.1
[17] viridis_0.6.3               viridisLite_0.4.2
[19] scran_1.26.2                scuttle_1.8.4
[21] SingleCellExperiment_1.20.1 SummarizedExperiment_1.28.0
[23] Biobase_2.58.0              GenomicRanges_1.50.2
[25] GenomeInfoDb_1.34.9         IRanges_2.32.0
[27] S4Vectors_0.36.2            BiocGenerics_0.44.0
[29] MatrixGenerics_1.10.0       matrixStats_0.63.0
[31] reshape2_1.4.4              dunn.test_1.3.5
[33] speckle_0.0.3               magrittr_2.0.3
[35] data.table_1.14.8           cowplot_1.1.1
[37] lubridate_1.9.2             forcats_1.0.0
[39] stringr_1.5.0               purrr_1.0.1
[41] readr_2.1.4                 tidyr_1.3.0
[43] tibble_3.2.1                tidyverse_2.0.0
[45] patchwork_1.1.2             ggplot2_3.4.2
[47] Matrix_1.5-4                dplyr_1.1.2
[49] SeuratObject_4.1.3          Seurat_4.3.0

loaded via a namespace (and not attached):
  [1] rtracklayer_1.58.0        scattermore_1.0
  [3] ragg_1.2.4                bit64_4.0.5
  [5] irlba_2.3.5.1             DelayedArray_0.24.0
  [7] KEGGREST_1.38.0           RCurl_1.98-1.12
  [9] doParallel_1.0.17         generics_0.1.3
 [11] org.Mm.eg.db_3.16.0       ScaledMatrix_1.6.0
 [13] RSQLite_2.3.1             RANN_2.6.1
 [15] proxy_0.4-27              future_1.32.0
 [17] bit_4.0.5                 tzdb_0.3.0
 [19] spatstat.data_3.0-1       httpuv_1.6.10
 [21] hms_1.1.3                 promises_1.2.0.1
 [23] DEoptimR_1.0-13           fansi_1.0.4
 [25] restfulr_0.0.15           igraph_1.4.2
 [27] DBI_1.1.3                 htmlwidgets_1.6.2        
 [29] spatstat.geom_3.1-0       ellipsis_0.3.2           
 [31] backports_1.4.1           deldir_1.0-6             
 [33] sparseMatrixStats_1.10.0  vctrs_0.6.2              
 [35] ROCR_1.0-11               abind_1.4-5              
 [37] cachem_1.0.8              withr_2.5.0              
 [39] robustbase_0.95-1         progressr_0.13.0         
 [41] checkmate_2.2.0           sctransform_0.3.5        
 [43] GenomicAlignments_1.34.1  goftest_1.2-3            
 [45] lazyeval_0.2.2            crayon_1.5.2             
 [47] spatstat.explore_3.1-0    labeling_0.4.2           
 [49] edgeR_3.40.2              pkgconfig_2.0.3          
 [51] tweenr_2.0.2              nlme_3.1-162             
 [53] vipor_0.4.5               rlang_1.1.1              
 [55] globals_0.16.2            lifecycle_1.0.3          
 [57] miniUI_0.1.1.1            rsvd_1.0.5               
 [59] polyclip_1.10-4           lmtest_0.9-40            
 [61] rngtools_1.5.2            zoo_1.8-12               
 [63] beeswarm_0.4.0            pheatmap_1.0.12          
 [65] png_0.1-8                 rjson_0.2.21             
 [67] bitops_1.0-7              KernSmooth_2.23-21       
 [69] Biostrings_2.66.0         blob_1.2.4               
 [71] DelayedMatrixStats_1.20.0 doRNG_1.8.6              
 [73] parallelly_1.35.0         spatstat.random_3.1-4    
 [75] beachmat_2.14.2           scales_1.2.1             
 [77] memoise_2.0.1             plyr_1.8.8               
 [79] ica_1.0-3                 zlibbioc_1.44.0          
 [81] compiler_4.2.2            dqrng_0.3.0              
 [83] BiocIO_1.8.0              RColorBrewer_1.1-3       
 [85] rrcov_1.7-2               fitdistrplus_1.1-11      
 [87] Rsamtools_2.14.0          cli_3.6.1                
 [89] XVector_0.38.0            listenv_0.9.0            
 [91] pbapply_1.7-0             MASS_7.3-60              
 [93] tidyselect_1.2.0          stringi_1.7.12           
 [95] textshaping_0.3.6         yaml_2.3.7               
 [97] BiocSingular_1.14.0       locfit_1.5-9.7           
 [99] ggrepel_0.9.3             grid_4.2.2               
[101] tools_4.2.2               timechange_0.2.0         
[103] future.apply_1.10.0       parallel_4.2.2           
[105] bluster_1.8.0             foreach_1.5.2            
[107] metapod_1.6.0             gridExtra_2.3            
[109] farver_2.1.1              Rtsne_0.16               
[111] BiocManager_1.30.20       digest_0.6.31            
[113] shiny_1.7.4               Rcpp_1.0.10              
[115] later_1.3.1               RcppAnnoy_0.0.20         
[117] WriteXLS_6.4.0            org.Hs.eg.db_3.16.0      
[119] httr_1.4.6                AnnotationDbi_1.60.2     
[121] colorspace_2.1-0          XML_3.99-0.14            
[123] tensor_1.5                reticulate_1.28          
[125] splines_4.2.2             uwot_0.1.14              
[127] statmod_1.5.0             spatstat.utils_3.0-3     
[129] graphlayouts_1.0.0        sp_1.6-0                 
[131] xgboost_1.7.5.1           plotly_4.10.1            
[133] systemfonts_1.0.4         xtable_1.8-4             
[135] jsonlite_1.8.4            tidygraph_1.2.3          
[137] R6_2.5.1                  pillar_1.9.0             
[139] htmltools_0.5.5           mime_0.12                
[141] glue_1.6.2                fastmap_1.1.1            
[143] BiocParallel_1.32.6       BiocNeighbors_1.16.0     
[145] codetools_0.2-19          pcaPP_2.0-3              
[147] mvtnorm_1.1-3             utf8_1.2.3               
[149] lattice_0.21-8            spatstat.sparse_3.0-1    
[151] ggbeeswarm_0.7.2          leiden_0.4.3             
[153] survival_3.5-5            limma_3.54.2             
[155] munsell_0.5.0             GenomeInfoDbData_1.2.9   
[157] iterators_1.0.14          gtable_0.3.3

mhemberg commented 1 year ago

Thanks for raising this issue. Unfortunately, we are no longer able to provide detailed support for SC3 since the main developer (Vladimir Kiselev) is no longer in academia. Moreover, running SC3 on datasets this large is not optimal; we recommend using the newer and faster SC3s instead. In terms of accuracy, SC3s has been benchmarked to be similar to SC3. The main difference is that SC3s is implemented in Python, which I recognize could be either a good thing or a bad thing.

jmuribes commented 1 year ago

Thank you for the reply! I actually found SC3s while looking for a workaround and managed to use it successfully. I used sceasy (https://github.com/cellgeni/sceasy) to convert my original SCE into an AnnData object, loaded it into Python 3, ran SC3s, saved the result as an H5AD file, and used sceasy to load it back into R and convert it to an SCE. There is probably a simpler way, but this worked fine for me.