Closed yi6kim closed 2 years ago
Hi @e-junekim,
Thanks for posting this issue. SignacX trains neural networks on cell type markers derived from genome-wide RNA sequencing. The assumption is that one, two, or three genes are probably too few to identify nuanced cell types, whereas hundreds of genes used together with classifiers should work.
So the problem here is that your data have too few genes for Signac to classify the cell types: there are too few features to train the models. The method was intended for genome-wide panels, not very small subsets of genes.
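As a rough illustration of the point above, a pre-flight check on the number of features could look like the sketch below. The helper `check_feature_count` and its `min_features = 200` cutoff are hypothetical and chosen only for illustration; they are not part of SignacX and do not reflect a documented threshold. The count matrix is simulated.

```r
# Hypothetical pre-flight check; NOT part of SignacX.
# `min_features` is an illustrative assumption, not a documented threshold.
check_feature_count <- function(counts, min_features = 200) {
  n_features <- nrow(counts)
  ok <- n_features >= min_features
  if (!ok) {
    warning(sprintf(
      "Only %d features found; a genome-wide classifier may not have enough to work with.",
      n_features
    ))
  }
  invisible(ok)
}

set.seed(1)
# Simulated 47-marker x 100-cell count matrix (made-up data).
panel <- matrix(
  rpois(47 * 100, lambda = 2),
  nrow = 47,
  dimnames = list(paste0("marker", 1:47), paste0("cell", 1:100))
)

# Returns FALSE (and would normally warn) because 47 < 200.
panel_ok <- suppressWarnings(check_feature_count(panel))
```

Running a check like this before calling the classifier makes the limitation visible early, instead of surfacing later as an opaque error.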
Hope this helps!
Hello,
I have a few custom datasets consisting of (47 markers) x (1,000 ~ 10,000 cells) on which I want to run a supervised cell annotation.
Initially, running Signac on my dataset gave me some errors, so I wanted to figure out what the issue is and wondered if it could be stemming from the small number of rows (47 markers) that I used, perhaps causing some mathematical/linear algebraic issues.
So I tried to size down the given "pbmc" dataset from the vignette (https://cran.r-project.org/web/packages/SignacX/vignettes/signac-Seurat_CITE-seq.html) to contain the first 47 markers only. This process is shown in the code below. (FYI: the original "pbmc" dataset from the vignette contains 33538 markers x 7865 cells.)
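For readers who want to reproduce this kind of reduction, here is a minimal sketch of keeping only the first 47 features of a count matrix. It uses a small simulated matrix rather than the actual "pbmc" object, and every name in it (`counts`, `keep`, `counts_small`) is an assumption made for illustration, not the code used in this report.

```r
set.seed(42)

# Stand-in for a genome-wide count matrix (made-up data, far smaller than pbmc).
n_genes <- 2000
n_cells <- 50
counts <- matrix(
  rpois(n_genes * n_cells, lambda = 1),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("cell", seq_len(n_cells)))
)

# Keep only the first 47 features (rows); all cells (columns) are retained.
keep <- head(rownames(counts), 47)
counts_small <- counts[keep, , drop = FALSE]

dim(counts_small)
```

For a Seurat object, `subset(pbmc, features = keep)` would be the analogous call.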
After running SignacX on this "smaller" pbmc dataset, I noticed that the same types of error are produced. Specifically, these error messages pop up after "SCTransform" and "Signac" functions, and I commented the exact error messages below. For SCTransform, I eventually skipped this step and instead used the "NormalizeData, FindVariableFeatures and ScaleData" sequence, which did not produce any error.
As a side note, I tried adjusting parameters such as "npcs", "nfeatures.print", and "dims" in the "RunPCA", "RunUMAP" and "FindNeighbors" functions, wondering if these could be the issue, but to no avail. However, I'm not too familiar with the parameters of these functions, so my settings may still be wrong.
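One constraint worth keeping in mind when tuning these parameters: the number of principal components cannot exceed what a 47-feature matrix supports, and the default `npcs` in Seurat's `RunPCA` is 50. The sketch below shows the idea with base R's `prcomp` on simulated data. The helper `cap_npcs` is hypothetical, and this addresses only the dimensionality-reduction steps, not necessarily the Signac error itself.

```r
# Hypothetical helper: cap the requested number of PCs so it stays
# strictly below both the number of features and the number of cells.
cap_npcs <- function(n_features, n_cells, requested = 50) {
  max(1L, as.integer(min(requested, n_features - 1, n_cells - 1)))
}

set.seed(7)
# Simulated scaled expression matrix: 47 features x 200 cells (made-up data).
scaled <- matrix(rnorm(47 * 200), nrow = 47)

k <- cap_npcs(nrow(scaled), ncol(scaled))

# prcomp expects observations (cells) in rows, so transpose first.
pca <- prcomp(t(scaled), rank. = k)
ncol(pca$x)
```

In a Seurat workflow, the same value could be passed as `npcs = k` to `RunPCA`, with `dims` in `RunUMAP` and `FindNeighbors` kept within `1:k`.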
What could possibly be the issue here and how can I reduce these error messages?
Thank you!
R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] pbmc3k.SeuratData_3.1.4 SeuratData_0.2.2 SignacX_2.2.5 patchwork_1.1.1
[5] ggplot2_3.3.6 SeuratDisk_0.0.0.9020 sp_1.4-7 SeuratObject_4.1.0
[9] Seurat_4.1.1

loaded via a namespace (and not attached): [1] Rtsne_0.16 colorspace_2.0-3 deldir_1.0-6 ellipsis_0.3.2 [5] ggridges_0.5.3 rprojroot_2.0.3 fs_1.5.2 rstudioapi_0.13 [9] spatstat.data_2.2-0 farver_2.1.0 leiden_0.4.2 listenv_0.8.0 [13] remotes_2.4.2 ggrepel_0.9.1 bit64_4.0.5 RSpectra_0.16-1 [17] fansi_1.0.3 codetools_0.2-18 splines_4.1.3 cachem_1.0.6 [21] pkgload_1.2.4 polyclip_1.10-0 jsonlite_1.8.0 ica_1.0-2 [25] cluster_2.1.3 png_0.1-7 rgeos_0.5-9 uwot_0.1.11 [29] shiny_1.7.1 sctransform_0.3.3 spatstat.sparse_2.1-1 compiler_4.1.3 [33] httr_1.4.3 Matrix_1.4-1 fastmap_1.1.0 lazyeval_0.2.2 [37] cli_3.3.0 later_1.3.0 prettyunits_1.1.1 htmltools_0.5.2 [41] tools_4.1.3 igraph_1.3.1 gtable_0.3.0 glue_1.6.2 [45] RANN_2.6.1 reshape2_1.4.4 dplyr_1.0.9 rappdirs_0.3.3 [49] Rcpp_1.0.8.3 scattermore_0.8 vctrs_0.4.1 nlme_3.1-157 [53] progressr_0.10.0 lmtest_0.9-40 spatstat.random_2.2-0 stringr_1.4.0 [57] brio_1.1.3 ps_1.7.0 globals_0.15.0 testthat_3.1.4 [61] mime_0.12 miniUI_0.1.1.1 lifecycle_1.0.1 irlba_2.3.5 [65] devtools_2.4.3 goftest_1.2-3 future_1.26.1 MASS_7.3-57 [69] zoo_1.8-10 scales_1.2.0 spatstat.core_2.4-4 promises_1.2.0.1 [73] spatstat.utils_2.3-1 parallel_4.1.3 RColorBrewer_1.1-3 curl_4.3.2 [77] memoise_2.0.1 reticulate_1.25 pbapply_1.5-0 gridExtra_2.3 [81] rpart_4.1.16 stringi_1.7.6 desc_1.4.1 pkgbuild_1.3.1 [85] rlang_1.0.2 pkgconfig_2.0.3 matrixStats_0.62.0 lattice_0.20-45 [89] ROCR_1.0-11 purrr_0.3.4 tensor_1.5 htmlwidgets_1.5.4 [93] labeling_0.4.2 processx_3.5.3 cowplot_1.1.1 bit_4.0.4 [97] tidyselect_1.1.2 parallelly_1.31.1 RcppAnnoy_0.0.19 plyr_1.8.7 [101] magrittr_2.0.3 R6_2.5.1 generics_0.1.2 pillar_1.7.0 [105] withr_2.5.0 mgcv_1.8-40 fitdistrplus_1.1-8 survival_3.3-1 [109] abind_1.4-5 tibble_3.1.7 future.apply_1.9.0 crayon_1.5.1 [113] hdf5r_1.3.5 KernSmooth_2.23-20 utf8_1.2.2 spatstat.geom_2.4-0 [117] plotly_4.10.0 usethis_2.1.6 grid_4.1.3 data.table_1.14.2 [121] callr_3.7.0 digest_0.6.29 pbmcapply_1.5.1 xtable_1.8-4 [125] tidyr_1.2.0 httpuv_1.6.5 
munsell_0.5.0 viridisLite_0.4.0 [129] sessioninfo_1.2.2

Warning messages:
1: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps
2: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps
3: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps
4: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps