Using a small number of markers (n = 47) returns an error

Hello,

I have a few custom datasets, each consisting of 47 markers x 1,000 to 10,000 cells, on which I want to run supervised cell annotation.

Running Signac on my datasets produced errors. I wondered whether they stem from the small number of rows (47 markers), perhaps by causing a numerical or linear-algebra problem.

To test this, I subset the "pbmc" dataset from the vignette (https://cran.r-project.org/web/packages/SignacX/vignettes/signac-Seurat_CITE-seq.html) to its first 47 markers only, as shown in the code below. (For reference, the original "pbmc" dataset from the vignette contains 33,538 markers x 7,865 cells.)

Running SignacX on this smaller pbmc dataset produces the same kinds of errors as my own data. Specifically, the errors are raised by the "SCTransform" and "Signac" functions; the exact messages are included as comments in the code below. For SCTransform, I eventually skipped this step and used the "NormalizeData", "FindVariableFeatures", "ScaleData" sequence instead, which did not produce any error.
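One thing I could check for the SCTransform error is whether some cells have zero total counts once only 47 markers are kept. The sketch below uses a toy matrix (hypothetical data, not the pbmc file) and assumes that the "log_umi" attribute is a log10 of the per-cell count total, which I have not verified in the sctransform source. If that assumption holds, any cell with no counts would yield an infinite value.

```r
# Toy counts matrix: 47 markers x 6 cells (hypothetical data for illustration)
set.seed(1)
counts <- matrix(
  rpois(47 * 6, lambda = 0.5),
  nrow = 47,
  dimnames = list(paste0("marker", 1:47), paste0("cell", 1:6))
)
counts[, c(2, 5)] <- 0  # force two cells to have no counts at all

# Per-cell totals and their log10 (assumed to mirror "log_umi")
umi_per_cell <- colSums(counts)
log_umi <- log10(umi_per_cell)  # log10(0) is -Inf

# Cells that would produce a non-finite value
zero_cells <- names(umi_per_cell)[umi_per_cell == 0]
print(zero_cells)
print(any(!is.finite(log_umi)))

# Dropping the empty cells leaves only finite values
counts_kept <- counts[, umi_per_cell > 0, drop = FALSE]
print(all(is.finite(log10(colSums(counts_kept)))))
```

On a real Seurat object the same check would be applied to the counts matrix of the object before calling SCTransform.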

As a side note, I tried adjusting parameters such as "npcs", "nfeatures.print", and "dims" in "RunPCA", "RunUMAP", and "FindNeighbors", in case they were the problem, but to no avail. I'm not very familiar with these parameters, though, so my settings may still be wrong.
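For context on the parameter choice, here is a minimal sketch of how I would bound the number of principal components by the matrix size. The helper `safe_npcs` is hypothetical (it is not part of Seurat), and the rule that the number of components must stay below the smaller matrix dimension is my assumption about truncated PCA in general.

```r
# Hypothetical helper: cap the requested number of PCs at one less than
# the smaller dimension of the features x cells matrix
safe_npcs <- function(n_features, n_cells, requested = 50) {
  max_rank <- min(n_features, n_cells) - 1
  min(requested, max_rank)
}

npcs <- safe_npcs(n_features = 47, n_cells = 7865)  # capped by the 47 markers
dims <- seq_len(min(10, npcs))                      # dims never exceed npcs

print(npcs)
print(dims)
```

With 47 markers the cap is well above the 10 components used below, so by this reasoning "npcs" and "dims" are unlikely to be the cause.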

What could be causing these errors, and how can I resolve them?
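For the Signac error, the message itself can be reproduced in base R: `order()` fails this way when it is given `NULL`, and `rownames()` returns `NULL` for an object without row names, such as a matrix that has collapsed to a plain vector. Whether this is what happens to `Z` inside Signac is an assumption on my part; I have not traced the package source. The matrix and gene names below are made up for illustration.

```r
# Toy matrix with row names (hypothetical gene names)
m <- matrix(1:6, nrow = 2, dimnames = list(c("geneA", "geneB"), NULL))

# Selecting a single row without drop = FALSE collapses the matrix to a vector
one_row <- m["geneA", ]
rn <- rownames(one_row)  # a plain vector has no row names, so this is NULL
print(is.null(rn))

# order(NULL) raises an error; capture its message instead of stopping
msg <- tryCatch(order(rn), error = function(e) conditionMessage(e))
print(msg)

# With drop = FALSE the matrix structure and row names are preserved
kept <- m["geneA", , drop = FALSE]
print(rownames(kept))
print(order(rownames(kept)))
```

If this is the mechanism, it would suggest that too few of my 47 markers survive some internal filtering step, rather than a problem with the PCA or UMAP settings.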

Thank you!

library(Seurat)
library(SignacX)
# Minimal reproducible example
E = Read10X_h5(filename = "fls/pbmc_10k_protein_v3_filtered_feature_bc_matrix.h5")
# Keep only the first 47 markers (rows) of the gene expression matrix
E.small <- E$`Gene Expression`[1:47, ]

pbmc <- CreateSeuratObject(counts = E.small, project = "pbmc")

# SCTransform fails with error message # 1 below, so it is commented out
#pbmc <- SCTransform(pbmc) #, variable.features.n = 30

# Error message # 1:
#Calculating cell attributes from input UMI matrix: log_umi
#Error in make_cell_attr(umi, cell_attr, latent_var, batch_var, latent_var_nonreg,  : 
#                          cell attribute "log_umi" contains NA, NaN, or infinite value

pbmc <- NormalizeData(pbmc)
pbmc <- FindVariableFeatures(pbmc)
pbmc <- ScaleData(pbmc)

pbmc <- RunPCA(pbmc, npcs = 10, nfeatures.print = 10)
pbmc <- RunUMAP(pbmc, dims = 1:10)
pbmc <- FindNeighbors(pbmc, dims = 1:10)

labels <- Signac(pbmc, verbose = TRUE)

# Error message # 2:

#..........  Entry in Signac 
#..........  Running Signac on Seurat object :
#  nrow = 47
#  ncol = 7865
# |                                  |   0%, ETA NA

# Error in order(rownames(Z)) : argument 1 is not a vector

sessionInfo()

R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] pbmc3k.SeuratData_3.1.4 SeuratData_0.2.2 SignacX_2.2.5 patchwork_1.1.1
[5] ggplot2_3.3.6 SeuratDisk_0.0.0.9020 sp_1.4-7 SeuratObject_4.1.0
[9] Seurat_4.1.1

loaded via a namespace (and not attached): [1] Rtsne_0.16 colorspace_2.0-3 deldir_1.0-6 ellipsis_0.3.2 [5] ggridges_0.5.3 rprojroot_2.0.3 fs_1.5.2 rstudioapi_0.13 [9] spatstat.data_2.2-0 farver_2.1.0 leiden_0.4.2 listenv_0.8.0 [13] remotes_2.4.2 ggrepel_0.9.1 bit64_4.0.5 RSpectra_0.16-1 [17] fansi_1.0.3 codetools_0.2-18 splines_4.1.3 cachem_1.0.6 [21] pkgload_1.2.4 polyclip_1.10-0 jsonlite_1.8.0 ica_1.0-2 [25] cluster_2.1.3 png_0.1-7 rgeos_0.5-9 uwot_0.1.11 [29] shiny_1.7.1 sctransform_0.3.3 spatstat.sparse_2.1-1 compiler_4.1.3 [33] httr_1.4.3 Matrix_1.4-1 fastmap_1.1.0 lazyeval_0.2.2 [37] cli_3.3.0 later_1.3.0 prettyunits_1.1.1 htmltools_0.5.2 [41] tools_4.1.3 igraph_1.3.1 gtable_0.3.0 glue_1.6.2 [45] RANN_2.6.1 reshape2_1.4.4 dplyr_1.0.9 rappdirs_0.3.3 [49] Rcpp_1.0.8.3 scattermore_0.8 vctrs_0.4.1 nlme_3.1-157 [53] progressr_0.10.0 lmtest_0.9-40 spatstat.random_2.2-0 stringr_1.4.0 [57] brio_1.1.3 ps_1.7.0 globals_0.15.0 testthat_3.1.4 [61] mime_0.12 miniUI_0.1.1.1 lifecycle_1.0.1 irlba_2.3.5 [65] devtools_2.4.3 goftest_1.2-3 future_1.26.1 MASS_7.3-57 [69] zoo_1.8-10 scales_1.2.0 spatstat.core_2.4-4 promises_1.2.0.1 [73] spatstat.utils_2.3-1 parallel_4.1.3 RColorBrewer_1.1-3 curl_4.3.2 [77] memoise_2.0.1 reticulate_1.25 pbapply_1.5-0 gridExtra_2.3 [81] rpart_4.1.16 stringi_1.7.6 desc_1.4.1 pkgbuild_1.3.1 [85] rlang_1.0.2 pkgconfig_2.0.3 matrixStats_0.62.0 lattice_0.20-45 [89] ROCR_1.0-11 purrr_0.3.4 tensor_1.5 htmlwidgets_1.5.4 [93] labeling_0.4.2 processx_3.5.3 cowplot_1.1.1 bit_4.0.4 [97] tidyselect_1.1.2 parallelly_1.31.1 RcppAnnoy_0.0.19 plyr_1.8.7 [101] magrittr_2.0.3 R6_2.5.1 generics_0.1.2 pillar_1.7.0 [105] withr_2.5.0 mgcv_1.8-40 fitdistrplus_1.1-8 survival_3.3-1 [109] abind_1.4-5 tibble_3.1.7 future.apply_1.9.0 crayon_1.5.1 [113] hdf5r_1.3.5 KernSmooth_2.23-20 utf8_1.2.2 spatstat.geom_2.4-0 [117] plotly_4.10.0 usethis_2.1.6 grid_4.1.3 data.table_1.14.2 [121] callr_3.7.0 digest_0.6.29 pbmcapply_1.5.1 xtable_1.8-4 [125] tidyr_1.2.0 httpuv_1.6.5 
munsell_0.5.0 viridisLite_0.4.0 [129] sessioninfo_1.2.2

Warning messages:
1: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps
2: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps
3: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps
4: ggrepel: 5 unlabeled data points (too many overlaps). Consider increasing max.overlaps

mathewchamberlain / SignacX

Using small number of markers (n= 47) returns error #14