satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

`LabelClusters` showing wrong error message #6088

Closed Stfort52 closed 4 months ago

Stfort52 commented 2 years ago

Expected Behavior

LabelClusters should report the correct number of clusters being labeled.

When tested with the pbmc example, it should show:

r$> LabelClusters(plot, id='ident', labels = 1:1337)
Error in LabelClusters(plot, id = "ident", labels = 1:1337) : 
  Length of labels (1337) must be equal to the number of clusters being labeled (9).

Actual Outcome

LabelClusters reports the wrong number of clusters being labeled.

When tested with the pbmc example, it instead shows:

r$> LabelClusters(plot, id='ident', labels = 1:1337)
Error in LabelClusters(plot, id = "ident", labels = 1:1337) : 
  Length of labels (1337) must be equal to the number of clusters being labeled (4).

As a result, puzzling error messages like this one can occur:

r$> LabelClusters(plot, id='ident', labels = 1:4)
Error in LabelClusters(plot, id = "ident", labels = 1:4) : 
  Length of labels (4) must be equal to the number of clusters being labeled (4).

Possibly related code

  labels.loc <- do.call(what = 'rbind', args = labels.loc)
  labels.loc[, id] <- factor(x = labels.loc[, id], levels = levels(data[, id]))
  labels <- labels %||% groups
  if (length(x = unique(x = labels.loc[, id])) != length(x = labels)) {
    stop("Length of labels (", length(x = labels),  ") must be equal to the number of clusters being labeled (", length(x = labels.loc), ").")
  }

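length(x = labels.loc) in the stop() message seems to be causing the problem. If labels.loc is a data frame, length() returns its number of columns rather than the number of clusters, which would explain why the message always reports 4. Maybe you meant length(x = unique(x = labels.loc[, id])), the same expression used in the condition?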
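
The suspected mismatch can be reproduced without Seurat. The following is a minimal sketch, assuming labels.loc is a data frame with two coordinate columns, the id column, and a color column; the toy values and the column names UMAP_1, UMAP_2 and color are made up for illustration and are not taken from the Seurat source.

```r
# Toy stand-in for labels.loc: one row per cluster, four columns.
# Column names other than `ident` are hypothetical.
labels.loc <- data.frame(
  UMAP_1 = c(1.2, -3.4, 0.5, 2.2, -1.0, 4.1, -2.7, 3.3, 0.0),
  UMAP_2 = c(0.3, 1.1, -2.0, 3.5, -0.8, 2.4, -1.6, 0.9, 4.2),
  ident  = factor(0:8),
  color  = "black"
)
id <- "ident"
labels <- 1:4

# length() on a data frame counts columns, not rows or clusters.
n.cols <- length(x = labels.loc)
print(n.cols)      # 4

# Counting the unique values of the id column gives the number of clusters.
n.clusters <- length(x = unique(x = labels.loc[, id]))
print(n.clusters)  # 9

# Error message built from the cluster count instead of the column count.
msg <- paste0(
  "Length of labels (", length(x = labels),
  ") must be equal to the number of clusters being labeled (", n.clusters, ")."
)
print(msg)
```

With this toy table the column count happens to be 4, which matches the number in the confusing message, while the cluster count is 9, which matches the expected message.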

Environment Info

$ mamba list | grep ^r
r-base                    4.1.3                h06d3f91_1    conda-forge
r-littler                 0.3.15            r41hcfec24a_0    conda-forge
r-rlang                   1.0.2             r41h7525677_0    conda-forge
radian                    0.6.3              pyhd8ed1ab_0    conda-forge
...(Not related)...

$ uname -a 
Linux CENSORED 5.4.0-104-generic #118-Ubuntu SMP Wed Mar 2 19:02:41 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Session Info

r$> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS/LAPACK: CENSORED/stfort/.mambaforge/envs/messieR/lib/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sp_1.5-0           SeuratObject_4.1.0 Seurat_4.1.1       magrittr_2.0.3    

loaded via a namespace (and not attached):
  [1] nlme_3.1-158          matrixStats_0.62.0    spatstat.sparse_2.1-1 RcppAnnoy_0.0.19      RColorBrewer_1.1-3    httr_1.4.3            sctransform_0.3.3     tools_4.1.3           utf8_1.2.2           
 [10] R6_2.5.1              irlba_2.3.5           rpart_4.1.16          KernSmooth_2.23-20    uwot_0.1.11           mgcv_1.8-40           rgeos_0.5-9           DBI_1.1.2             lazyeval_0.2.2       
 [19] colorspace_2.0-3      tidyselect_1.1.2      gridExtra_2.3         compiler_4.1.3        progressr_0.10.1      cli_3.3.0             plotly_4.10.0         labeling_0.4.2        scales_1.2.0         
 [28] lmtest_0.9-40         spatstat.data_2.2-0   ggridges_0.5.3        pbapply_1.5-0         goftest_1.2-3         stringr_1.4.0         digest_0.6.29         spatstat.utils_2.3-1  pkgconfig_2.0.3      
 [37] htmltools_0.5.2       parallelly_1.32.0     fastmap_1.1.0         htmlwidgets_1.5.4     rlang_1.0.2           shiny_1.7.1           farver_2.1.0          generics_0.1.2        zoo_1.8-10           
 [46] jsonlite_1.8.0        spatstat.random_2.2-0 ica_1.0-2             dplyr_1.0.9           patchwork_1.1.1       Matrix_1.4-1          Rcpp_1.0.8.3          munsell_0.5.0         fansi_1.0.3          
 [55] abind_1.4-5           reticulate_1.25       lifecycle_1.0.1       stringi_1.7.6         MASS_7.3-57           Rtsne_0.16            plyr_1.8.7            grid_4.1.3            parallel_4.1.3       
 [64] listenv_0.8.0         promises_1.2.0.1      ggrepel_0.9.1         crayon_1.5.1          deldir_1.0-6          miniUI_0.1.1.1        lattice_0.20-45       cowplot_1.1.1         splines_4.1.3        
 [73] tensor_1.5            pillar_1.7.0          igraph_1.3.2          spatstat.geom_2.4-0   future.apply_1.9.0    reshape2_1.4.4        codetools_0.2-18      leiden_0.4.2          glue_1.6.2           
 [82] data.table_1.14.2     png_0.1-7             vctrs_0.4.1           httpuv_1.6.5          polyclip_1.10-0       gtable_0.3.0          RANN_2.6.1            purrr_0.3.4           spatstat.core_2.4-4  
 [91] tidyr_1.2.0           scattermore_0.8       future_1.26.1         assertthat_0.2.1      ggplot2_3.3.6         mime_0.12             xtable_1.8-4          RSpectra_0.16-1       later_1.3.0          
[100] survival_3.3-1        viridisLite_0.4.0     tibble_3.1.7          cluster_2.1.3         globals_0.15.0        fitdistrplus_1.1-8    ellipsis_0.3.2        ROCR_1.0-11

Reproducible example

```r
$ radian
R version 4.1.3 (2022-03-10) -- "One Push-Up"
Platform: x86_64-conda-linux-gnu (64-bit)

r$> library(magrittr)

r$> library(Seurat)
Attaching SeuratObject
Attaching sp

r$> pbmc.data <- Read10X(data.dir = 'filtered_gene_bc_matrices/hg19/')

r$> pbmc <- CreateSeuratObject(counts = pbmc.data, project = "pbmc3k", min.cells = 3, min.features = 200)
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')

r$> pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")

r$> pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)

r$> pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize", scale.factor = 10000)
Performing log-normalization
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|

r$> pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)
Calculating gene variances
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|

r$> all.genes <- rownames(pbmc)
    pbmc <- ScaleData(pbmc, features = all.genes)
Centering and scaling data matrix
  |======================================================================================================================================================================================================================| 100%

r$> pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))
PC_ 1
Positive:  CST3, TYROBP, LST1, AIF1, FTL, FTH1, LYZ, FCN1, S100A9, TYMP
           FCER1G, CFD, LGALS1, S100A8, CTSS, LGALS2, SERPINA1, IFITM3, SPI1, CFP
           PSAP, IFI30, SAT1, COTL1, S100A11, NPC2, GRN, LGALS3, GSTP1, PYCARD
Negative:  MALAT1, LTB, IL32, IL7R, CD2, B2M, ACAP1, CD27, STK17A, CTSW
           CD247, GIMAP5, AQP3, CCL5, SELL, TRAF3IP3, GZMA, MAL, CST7, ITM2A
           MYC, GIMAP7, HOPX, BEX2, LDLRAP1, GZMK, ETS1, ZAP70, TNFAIP8, RIC3
PC_ 2
Positive:  CD79A, MS4A1, TCL1A, HLA-DQA1, HLA-DQB1, HLA-DRA, LINC00926, CD79B, HLA-DRB1, CD74
           HLA-DMA, HLA-DPB1, HLA-DQA2, CD37, HLA-DRB5, HLA-DMB, HLA-DPA1, FCRLA, HVCN1, LTB
           BLNK, P2RX5, IGLL5, IRF8, SWAP70, ARHGAP24, FCGR2B, SMIM14, PPP1R14A, C16orf74
Negative:  NKG7, PRF1, CST7, GZMB, GZMA, FGFBP2, CTSW, GNLY, B2M, SPON2
           CCL4, GZMH, FCGR3A, CCL5, CD247, XCL2, CLIC3, AKR1C3, SRGN, HOPX
           TTC38, APMAP, CTSC, S100A4, IGFBP7, ANXA1, ID2, IL32, XCL1, RHOC
PC_ 3
Positive:  HLA-DQA1, CD79A, CD79B, HLA-DQB1, HLA-DPB1, HLA-DPA1, CD74, MS4A1, HLA-DRB1, HLA-DRA
           HLA-DRB5, HLA-DQA2, TCL1A, LINC00926, HLA-DMB, HLA-DMA, CD37, HVCN1, FCRLA, IRF8
           PLAC8, BLNK, MALAT1, SMIM14, PLD4, LAT2, IGLL5, P2RX5, SWAP70, FCGR2B
Negative:  PPBP, PF4, SDPR, SPARC, GNG11, NRGN, GP9, RGS18, TUBB1, CLU
           HIST1H2AC, AP001189.4, ITGA2B, CD9, TMEM40, PTCRA, CA2, ACRBP, MMD, TREML1
           NGFRAP1, F13A1, SEPT5, RUFY1, TSC22D1, MPP1, CMTM5, RP11-367G6.3, MYL9, GP1BA
PC_ 4
Positive:  HLA-DQA1, CD79B, CD79A, MS4A1, HLA-DQB1, CD74, HLA-DPB1, HIST1H2AC, PF4, TCL1A
           SDPR, HLA-DPA1, HLA-DRB1, HLA-DQA2, HLA-DRA, PPBP, LINC00926, GNG11, HLA-DRB5, SPARC
           GP9, AP001189.4, CA2, PTCRA, CD9, NRGN, RGS18, GZMB, CLU, TUBB1
Negative:  VIM, IL7R, S100A6, IL32, S100A8, S100A4, GIMAP7, S100A10, S100A9, MAL
           AQP3, CD2, CD14, FYB, LGALS2, GIMAP4, ANXA1, CD27, FCN1, RBP7
           LYZ, S100A11, GIMAP5, MS4A6A, S100A12, FOLR3, TRABD2A, AIF1, IL8, IFI6
PC_ 5
Positive:  GZMB, NKG7, S100A8, FGFBP2, GNLY, CCL4, CST7, PRF1, GZMA, SPON2
           GZMH, S100A9, LGALS2, CCL3, CTSW, XCL2, CD14, CLIC3, S100A12, CCL5
           RBP7, MS4A6A, GSTP1, FOLR3, IGFBP7, TYROBP, TTC38, AKR1C3, XCL1, HOPX
Negative:  LTB, IL7R, CKB, VIM, MS4A7, AQP3, CYTIP, RP11-290F20.3, SIGLEC10, HMOX1
           PTGES3, LILRB2, MAL, CD27, HN1, CD2, GDI2, ANXA5, CORO1B, TUBA1B
           FAM110A, ATP1A1, TRADD, PPA1, CCDC109B, ABRACL, CTD-2006K23.1, WARS, VMO1, FYB

r$> pbmc <- FindNeighbors(pbmc, dims = 1:10)
    pbmc <- FindClusters(pbmc, resolution = 0.5)
Computing nearest neighbor graph
Computing SNN
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 2638
Number of edges: 95965

Running Louvain algorithm...
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8723
Number of communities: 9
Elapsed time: 0 seconds

r$> pbmc <- RunUMAP(pbmc, dims = 1:10)
Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session
14:53:59 UMAP embedding parameters a = 0.9922 b = 1.112
14:53:59 Read 2638 rows and found 10 numeric columns
14:53:59 Using Annoy for neighbor search, n_neighbors = 30
14:53:59 Building Annoy index with metric = cosine, n_trees = 50
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
14:54:00 Writing NN index file to temp file /tmp/RtmpE4wsTH/file4c8fa3a69dfea
14:54:00 Searching Annoy index using 1 thread, search_k = 3000
14:54:02 Annoy recall = 100%
14:54:02 Commencing smooth kNN distance calibration using 1 thread
14:54:04 Initializing from normalized Laplacian + noise
14:54:04 Commencing optimization for 500 epochs, with 105124 positive edges
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
14:54:11 Optimization finished

r$> plot = DimPlot(pbmc, reduction = "umap")

r$> LabelClusters(plot, id='ident', labels = 1:4)
Error in LabelClusters(plot, id = "ident", labels = 1:4) :
  Length of labels (4) must be equal to the number of clusters being labeled (4).

r$>
```

vgastaldi commented 2 years ago

I found the same issue when FeaturePlot calls LabelClusters.

parkjooyoung99 commented 1 year ago

I had an 'ident' column in my meta.data, so I removed it. After that, the problem was solved!

dcollins15 commented 4 months ago

Thanks for using Seurat!

It appears that this issue has gone stale. In an effort to keep our Issues board from getting more unruly than it already is, we’re going to begin closing out issues that haven’t had any activity since the release of v4.4.0.

If this issue is still relevant we strongly encourage you to reopen or repost it, especially if you didn’t initially receive a response from us.