Cannot RUN PCA on CellPlex data with Seurat

SDJ7007 commented 2 years ago

Hello, Finally 10X Genomics has given a code for analysis of CELL Plex (CMO) in Seurat. The code is written in https://www.10xgenomics.com/cn/resources/analysis-guides/tag-assignment-of-10x-Genomics-cellplex-data-using-seurats-htodemux-function and they have linked the vignette of https://satijalab.org/seurat/articles/hashing_vignette.html

I cannot run PCA on my data. I am getting this error: seurat.singlet <- RunPCA(seurat.singlet, features = VariableFeatures(object = seurat.singlet)) Error in irlba(A = t(x = object), nv = npcs, ...) : max(nu, nv) must be positive

How can I overcome this please? I read the other erros but none of it worked for me.

The codes prior is as follows: library(Seurat) Attaching SeuratObject Attaching sp

setwd("E:/Cell_Ranger_10x_Seurat") data_dir <- "E:/Cell_Ranger_10x_Seurat" data <- Read10X(data.dir = data_dir) 10X data contains more than one type and is being returned as a list containing matrices of each type. seurat_object= CreateSeuratObject(counts = data$Gene Expression, project = "10X_Seurat") seurat_object[["CMO"]] = CreateAssayObject(counts = data$Multiplexing Capture) head(seurat_object@meta.data) orig.ident nCount_RNA nFeature_RNA nCount_CMO nFeature_CMO AAACCCAAGAAACACT-1 10X_Seurat 1 1 0 0 AAACCCAAGAAACCAT-1 10X_Seurat 1 1 0 0 AAACCCAAGAAACCCA-1 10X_Seurat 7 7 0 0 AAACCCAAGAAACCCG-1 10X_Seurat 0 0 1 1 AAACCCAAGAAACCTG-1 10X_Seurat 1 1 0 0 AAACCCAAGAAACTAC-1 10X_Seurat 3 3 0 0 library(data.table) data.table 1.14.2 using 8 threads (see ?getDTthreads). Latest news: r-datatable.com head(seurat_object@assays) $RNA Assay data with 32285 features for 2893103 cells First 10 features: Xkr4, Gm1992, Gm19938, Gm37381, Rp1, Sox17, Gm37587, Gm37323, Mrpl15, Lypla1

$CMO Assay data with 12 features for 2893103 cells First 10 features: CMO301, CMO302, CMO303, CMO304, CMO305, CMO306, CMO307, CMO308, CMO309, CMO310

cells <- fread("assignment_confidence_table.csv",select = c("Barcodes"))

cells[1:5] Barcodes 1: AAACCCAAGACTCAAA-1 2: AAACCCAAGATACATG-1 3: AAACCCAAGATAGCAT-1 4: AAACCCAAGCCTGTCG-1 5: AAACCCAAGTGGCCTC-1 seurat_object_use <- subset(seurat_object, cells = cells$Barcodes) DefaultAssay(seurat_object_use)<-"CMO" seurat_object_use <- subset(seurat_object_use, features = c("CMO309","CMO310", "CMO311", "CMO312")) seurat_object_use = subset(x = seurat_object_use, subset = nCount_CMO > 0) seurat_object_norm <- NormalizeData(seurat_object_use, assay = "CMO", normalization.method = "CLR") Normalizing across features |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s
seurat_object_demux <- HTODemux(seurat_object_norm, assay = "CMO", positive.quantile = 0.99) Cutoff for CMO309 : 1285 reads Cutoff for CMO310 : 966 reads Cutoff for CMO311 : 1028 reads Cutoff for CMO312 : 1056 reads table(seurat_object_demux$CMO_classification.global)

Doublet Negative Singlet 880 1599 9862

Idents(seurat_object_demux) <- "CMO_maxID"

RidgePlot(seurat_object_demux, assay = "CMO", features = rownames(seurat_object_demux[["CMO"]])

RidgePlot(seurat_object_demux, assay = "CMO", features = rownames(seurat_object_demux[["CMO"]])[1:4], ncol = 2) Picking joint bandwidth of 0.061 Picking joint bandwidth of 0.0577 Picking joint bandwidth of 0.0592 Picking joint bandwidth of 0.064 RidgePlot(seurat_object_demux, assay = "CMO", features = rownames(seurat_object_demux[["CMO"]])[1:2], ncol = 2) Picking joint bandwidth of 0.061 Picking joint bandwidth of 0.0577 FeatureScatter(seurat_object_demux, feature1 = "CMO309", feature2 = "CMO310") FeatureScatter(seurat_object_demux, feature1 = "CMO309", feature2 = "CMO310", label = TRUE) Error in FeatureScatter(seurat_object_demux, feature1 = "CMO309", feature2 = "CMO310", : unused argument (label = TRUE) FeatureScatter(seurat_object_demux, feature1 = "CMO309", feature2 = "CMO310") Idents(seurat_object_demux) <- "CMO_classification.global" VlnPlot(seurat_object_demux, features = "nCount_RNA", pt.size = 0.1, log = TRUE) seurat_object_demux_subset<- subset(seurat_object_demux, idents = "Negative", invert = TRUE)

cmo.dist.mtx <- as.matrix(dist(t(GetAssayData(object = seurat_object_demux_subset, assay = "CMO")))) seurat_object_demux_subset <- RunTSNE(seurat_object_demux_subset, distance.matrix = cmo.dist.mtx, perplexity = 100) Warning: Adding a command log without an assay associated with it DimPlot(seurat_object_demux_subset) HTOHeatmap(seurat_object_demux, assay = "CMO", ncells = 7000) Warning message: guides(<scale> = FALSE) is deprecated. Please use guides(<scale> = "none") instead. seurat.singlet <- subset(seurat_object_demux, idents = "Singlet") seurat.singlet <- FindVariableFeatures(seurat.singlet, selection.method = "mean.var.plot") Calculating gene means 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **| Calculating gene variance to mean ratios 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **| seurat.singlet <- ScaleData(seurat.singlet, features = VariableFeatures(seurat.singlet)) Centering and scaling data matrix |===========================================================================================================| 100% seurat.singlet <- RunPCA(seurat.singlet, features = VariableFeatures(seurat.singlet)) Error in irlba(A = t(x = object), nv = npcs, ...) : max(nu, nv) must be positive ElbowPlot(seurat.singlet) Error: Cannot find 'pca' in this Seurat object ElbowPlot(seurat_object_demux) Error: Cannot find 'pca' in this Seurat object View(seurat.singlet) seurat.singlet <- RunPCA(seurat.singlet, features = VariableFeatures(object = seurat.singlet)) Error in irlba(A = t(x = object), nv = npcs, ...) : max(nu, nv) must be positive print(seurat.singlet[["pca"]], dims = 1:5, nfeatures = 5) Error: Cannot find 'pca' in this Seurat object

Can anyone help please?

Thank you.

Kind Regards, Shweta Johari

SDJ7007 commented 2 years ago

sessionInfo() R version 4.2.0 (2022-04-22 ucrt) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale: [1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] data.table_1.14.2 sp_1.4-7 SeuratObject_4.1.0 Seurat_4.1.1

loaded via a namespace (and not attached): [1] nlme_3.1-157 matrixStats_0.62.0 spatstat.sparse_2.1-1 RcppAnnoy_0.0.19 RColorBrewer_1.1-3
[6] httr_1.4.3 sctransform_0.3.3 tools_4.2.0 utf8_1.2.2 R6_2.5.1
[11] irlba_2.3.5 rpart_4.1.16 KernSmooth_2.23-20 uwot_0.1.11 mgcv_1.8-40
[16] rgeos_0.5-9 DBI_1.1.2 lazyeval_0.2.2 colorspace_2.0-3 withr_2.5.0
[21] tidyselect_1.1.2 gridExtra_2.3 compiler_4.2.0 progressr_0.10.0 cli_3.3.0
[26] plotly_4.10.0 labeling_0.4.2 scales_1.2.0 lmtest_0.9-40 spatstat.data_2.2-0
[31] ggridges_0.5.3 pbapply_1.5-0 goftest_1.2-3 stringr_1.4.0 digest_0.6.29
[36] spatstat.utils_2.3-1 pkgconfig_2.0.3 htmltools_0.5.2 parallelly_1.31.1 fastmap_1.1.0
[41] htmlwidgets_1.5.4 rlang_1.0.2 rstudioapi_0.13 shiny_1.7.1 farver_2.1.0
[46] generics_0.1.2 zoo_1.8-10 jsonlite_1.8.0 spatstat.random_2.2-0 ica_1.0-2
[51] dplyr_1.0.9 magrittr_2.0.3 patchwork_1.1.1 Matrix_1.4-1 Rcpp_1.0.8.3
[56] munsell_0.5.0 fansi_1.0.3 abind_1.4-5 reticulate_1.25 lifecycle_1.0.1
[61] stringi_1.7.6 MASS_7.3-56 Rtsne_0.16 plyr_1.8.7 grid_4.2.0
[66] parallel_4.2.0 listenv_0.8.0 promises_1.2.0.1 ggrepel_0.9.1 crayon_1.5.1
[71] deldir_1.0-6 miniUI_0.1.1.1 lattice_0.20-45 cowplot_1.1.1 splines_4.2.0
[76] tensor_1.5 pillar_1.7.0 igraph_1.3.1 spatstat.geom_2.4-0 future.apply_1.9.0
[81] reshape2_1.4.4 codetools_0.2-18 leiden_0.4.2 glue_1.6.2 png_0.1-7
[86] vctrs_0.4.1 httpuv_1.6.5 polyclip_1.10-0 gtable_0.3.0 RANN_2.6.1
[91] purrr_0.3.4 spatstat.core_2.4-4 tidyr_1.2.0 scattermore_0.8 future_1.25.0
[96] assertthat_0.2.1 ggplot2_3.3.6 mime_0.12 xtable_1.8-4 later_1.3.0
[101] survival_3.3-1 viridisLite_0.4.0 tibble_3.1.7 cluster_2.1.3 globals_0.15.0
[106] fitdistrplus_1.1-8 ellipsis_0.3.2 ROCR_1.0-11

samuel-marsh commented 2 years ago

Hi,

Not member of dev team but hopefully can be helpful. Did you switch default assay back to RNA before running RunPCA?

Best, Sam

SDJ7007 commented 2 years ago

Hey Thank you so much for your response. So that is something I thought about too, So I figured this out: In the 10X genomics code written, there is a command, https://www.10xgenomics.com/cn/resources/analysis-guides/tag-assignment-of-10x-Genomics-cellplex-data-using-seurats-htodemux-function : DefaultAssay(seurat_object_use)<-"CMO" Then further on, all the work is done by this object and using this assay. Then even if I try to get back to RNA as my assay. It doesn't work. It says no RNA assay.

I also tried going back to the object with both assays and then it doesn't work because the normalization and scaling is done with the other objects and assay as CMO.

Thank you for your response though.

Kind Regards, Shweta Johari

samuel-marsh commented 2 years ago

Hi,

So you don't want to run the analysis with CMO as the assay. That page from 10X is just describing setting up the object not the downstream analysis which needs to be run on gene expression data. That page/process is just for demultiplexing the samples and if you try and run analysis on it's not going to work. The reason the RunPCA step failed is because it by default it runs 50 PCs but your number of features can't be less than PCs and if CMO is still default assay then it will be (hence it failed). The reason that the RNA assay isn't present anymore is due to running subset on the CMO assay (see here for reasoning and proposed solution: https://github.com/satijalab/seurat/issues/5978#issuecomment-1133342063).

Best, Sam

SDJ7007 commented 2 years ago

Hello Dr.Samuel,

Thank you so much for your time and Thank you so much for guiding me to the solution. I was perplexed by the code as it added the vignette of integration too.

To clarify, the solution of the https://github.com/satijalab/seurat/issues/5978#issuecomment-1133342063, means: After normalization of the CMO/ before make the CMO as the default assay: To use: object_filtered = subset(object, features = c(rownames(object[["RNA"]]), c("CMO310","CMO311", "CMO312"))

or: S22_23_24_cells_filtered_CMO_norm_tags <- subset(S22_23_24_cells_filtered_CMO_norm, features = c("CMO310","CMO311", "CMO312")) Instead of this, To use: object_filtered = subset(object, features = c(rownames(object[["RNA"]]), c("CMO310","CMO311", "CMO312"))

So to understand it better according to the script given in the question:

Load the dataset
Create the Seurat Object
Create the two assays
Normalize RNA data with log normalization
Find and scale variable features
Then add the cells document
Remove the mt
VlnPlot: nFeatures_nCountRNA, nCountRNA_mt plot1 + plot2
Then filter the data: S22_23_24_cells_filtered <- subset(S22_23_24_cells, subset = nFeature_RNA > 200 & percent.mt < 10)
Scale S22_23_24_cells_filtered <- ScaleData(S22_23_24_cells_filtered, features = VariableFeatures(S22_23_24_cells)) 12 Normalize CMO data, here we use centered log-ratio (CLR) transformation 13 DefaultAssay(S22_23_24_cells_filtered_CMO_norm)<-"CMO" a) S22_23_24_cells_filtered_CMO_norm_tags <- subset(S22_23_24_cells_filtered_CMO_norm, features = c("CMO310","CMO311", "CMO312")) b) S22_23_24_cells_filtered_CMO_norm_tags_nozeros = subset(x = S22_23_24_cells_filtered_CMO_norm_tags, subset = nCount_CMO > 0)
HTODemux
table
Idents
RidgePlot
FeatureScatter
Idents
First, subset singlets 21.FindVariableFeatures 22.Scale Data
Calculate a distance matrix using HTO
Heatmap
Calculate tSNE embeddings with a distance matrix
DimPlot 27.HTOHeatMap

satijalab / seurat

Cannot RUN PCA on CellPlex data with Seurat #6001