satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Question about BPcells and unimodal umap projection #8430

Closed: Flu09 closed this issue 8 months ago

Flu09 commented 8 months ago

Hello, I am following this vignette, https://satijalab.org/seurat/articles/seurat5_bpcells_interaction_vignette, because I have a large h5ad object:

AnnData object with n_obs × n_vars = 2480956 × 59357
    obs: 'ROIGroup', 'ROIGroupCoarse', 'ROIGroupFine', 'roi', 'organism_ontology_term_id', 'disease_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'sex_ontology_term_id', 'development_stage_ontology_term_id', 'donor_id', 'suspension_type', 'dissection', 'fraction_mitochondrial', 'fraction_unspliced', 'cell_cycle_score', 'total_genes', 'total_UMIs', 'sample_id', 'supercluster_term', 'cluster_id', 'subcluster_id', 'cell_type_ontology_term_id', 'tissue_ontology_term_id', 'is_primary_data', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'Biotype', 'Chromosome', 'End', 'Gene', 'Start', 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length'
    uns: 'batch_condition', 'citation', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_UMAP', 'X_tSNE'

which, after following the method in the vignette, gave me a Seurat object without any dimensional reduction:

seurat_object
An object of class Seurat
59357 features across 2480956 samples within 1 assay
Active assay: RNA (59357 features, 0 variable features)
 1 layer present: counts

Is there a way to keep the obsm slot, and also the other slots listed below, rather than only the metadata?
    var: 'Biotype', 'Chromosome', 'End', 'Gene', 'Start', 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length'
    uns: 'batch_condition', 'citation', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_UMAP', 'X_tSNE'

Another question: if I want to use this object as a reference to transfer cell annotations following https://satijalab.org/seurat/articles/integration_mapping, will I need the object's original UMAP, or do I perform normalization, scaling, clustering, etc. myself and then transfer the cell types?

igrabski commented 8 months ago

Hi, unfortunately we don't have functionality to support reading in dimension reduction information as part of this conversion process. You could try the LoadH5AD function from Azimuth, and if that doesn't help, you could try saving the dimension reduction information as a text file, and just load that into R and add that information to your Seurat object as a DimReduc.

Flu09 commented 7 months ago

Is there a tutorial somewhere showing how to save dimension reduction information as a text file and then add it as a DimReduc?

igrabski commented 7 months ago

We don't offer a specific tutorial for what you're doing since these are mostly non-standard steps, but hopefully the following information can be helpful. Your dimension reduction information is currently in an AnnData object, so while we don't have tutorials for that, there is information on the structure of these objects and how to access information in the AnnData tutorials. In this case, you just need the cell embeddings for your dimension reduction of interest. If the cell embeddings are stored as a Pandas dataframe, then you can use the to_csv function to save it as a CSV file.
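The export step described above could be sketched as follows. This is a minimal illustration, not an official Seurat or AnnData recipe: the helper name `write_embedding_csv`, the output file name, and the `UMAP_1`, `UMAP_2` column naming are assumptions made for the example. It uses Python's standard `csv` module in place of the pandas `to_csv` route mentioned above, so that it works whether the embedding is a NumPy array or a plain list of rows.

```python
import csv


def write_embedding_csv(cell_ids, embedding, path, key="UMAP"):
    """Write one row per cell: the cell ID followed by its coordinates.

    `cell_ids` and `embedding` must have the same length; `embedding` can be
    any sequence of rows (a NumPy array or a list of lists).
    """
    rows = [list(row) for row in embedding]
    if len(rows) != len(cell_ids):
        raise ValueError("embedding must have one row per cell ID")

    # Column names such as UMAP_1, UMAP_2 are a naming choice for this sketch.
    n_dims = len(rows[0]) if rows else 0
    header = ["cell"] + [f"{key}_{i + 1}" for i in range(n_dims)]

    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for cell_id, row in zip(cell_ids, rows):
            writer.writerow([cell_id] + [float(value) for value in row])
    return header


# With an AnnData object `adata` already loaded (not done here), the call
# would look something like:
#   write_embedding_csv(adata.obs_names, adata.obsm["X_UMAP"], "umap_embeddings.csv")
```

Keeping the cell IDs in the first column means the file can later be read in R with the cell names as row names, which is how the embeddings are matched to cells in the Seurat object.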

Then, going into R, you can read in CSV files using the read.csv function. Once you have the cell embeddings loaded into R in this way, then you can follow the steps from the section "Storing a custom dimensional reduction calculation" in this tutorial. (Note that in that tutorial, they start by computing MDS; in your case, you can jump right to the line using CreateDimReducObject since you already have your dimensional reduction.)

Flu09 commented 7 months ago

Thank you. I have one more question: how should I proceed with the unimodal UMAP projection after the UMAP has been added to the object through CreateDimReducObject()? Must I normalize, find variable features, scale, and then run PCA? And how can I make sure that the results reflect the imported UMAP in downstream analyses such as differential gene expression?

Flu09 commented 7 months ago

Hi, I am still wondering whether, after the UMAP has been added to the object through CreateDimReducObject(), I need to run NormalizeData, FindVariableFeatures, ScaleData, and then RunPCA. When I reached the RunPCA step I got the error below. I am using Slurm and a conda environment.

object of type 'S4' is not subsettable

Loading required package: SeuratObject
Loading required package: sp

Attaching package: ‘SeuratObject’

The following object is masked from ‘package:base’:

intersect

Error in methods::slot(object = object, name = "layers")[[layer]][features, :
  object of type 'S4' is not subsettable
Calls: RunPCA ... RunPCA.StdAssay -> PrepDR5 -> LayerData -> LayerData.Assay5
Execution halted

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Rocky Linux 9.1 (Blue Onyx)

Matrix products: default
BLAS/LAPACK: /path/conda-environments/Rtools/lib/libopenblasp-r0.3.26.so; LAPACK version 3.12.0

locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] BPCells_0.1.0 Seurat_5.0.1 SeuratObject_5.0.1 sp_2.1-3

loaded via a namespace (and not attached):
[1] deldir_2.0-4 pbapply_1.7-2 gridExtra_2.3
[4] rlang_1.1.3 magrittr_2.0.3 RcppAnnoy_0.0.22
[7] matrixStats_1.2.0 ggridges_0.5.6 compiler_4.3.2
[10] spatstat.geom_3.2-9 png_0.1-8 vctrs_0.6.5
[13] reshape2_1.4.4 stringr_1.5.1 pkgconfig_2.0.3
[16] fastmap_1.1.1 ellipsis_0.3.2 utf8_1.2.4
[19] promises_1.2.1 purrr_1.0.2 jsonlite_1.8.8
[22] goftest_1.2-3 later_1.3.2 spatstat.utils_3.0-4
[25] irlba_2.3.5.1 parallel_4.3.2 cluster_2.1.6
[28] R6_2.5.1 ica_1.0-3 stringi_1.8.3
[31] RColorBrewer_1.1-3 spatstat.data_3.0-4 reticulate_1.35.0
[34] parallelly_1.37.1 lmtest_0.9-40 scattermore_1.2
[37] Rcpp_1.0.12 tensor_1.5 future.apply_1.11.1
[40] zoo_1.8-12 sctransform_0.4.1 httpuv_1.6.14
[43] Matrix_1.6-3 splines_4.3.2 igraph_2.0.2
[46] tidyselect_1.2.0 abind_1.4-5 spatstat.random_3.2-3
[49] codetools_0.2-19 miniUI_0.1.1.1 spatstat.explore_3.2-6
[52] listenv_0.9.1 lattice_0.22-5 tibble_3.2.1
[55] plyr_1.8.9 shiny_1.8.0 ROCR_1.0-11
[58] Rtsne_0.17 future_1.33.1 fastDummies_1.7.3
[61] survival_3.5-8 polyclip_1.10-6 fitdistrplus_1.1-11
[64] pillar_1.9.0 KernSmooth_2.23-22 plotly_4.10.4
[67] generics_0.1.3 RcppHNSW_0.6.0 ggplot2_3.5.0
[70] munsell_0.5.0 scales_1.3.0 globals_0.16.3
[73] xtable_1.8-4 glue_1.7.0 lazyeval_0.2.2
[76] tools_4.3.2 data.table_1.15.2 RSpectra_0.16-1
[79] RANN_2.6.1 leiden_0.4.3.1 dotCall64_1.1-1
[82] cowplot_1.1.3 grid_4.3.2 tidyr_1.3.1
[85] colorspace_2.1-0 nlme_3.1-164 patchwork_1.2.0
[88] cli_3.6.2 spatstat.sparse_3.0-3 spam_2.10-0
[91] fansi_1.0.6 viridisLite_0.4.2 dplyr_1.1.4
[94] uwot_0.1.16 gtable_0.3.4 digest_0.6.34
[97] progressr_0.14.0 ggrepel_0.9.5 htmlwidgets_1.6.4
[100] htmltools_0.5.7 lifecycle_1.0.4 httr_1.4.7
[103] mime_0.12 MASS_7.3-60

Flu09 commented 7 months ago

The issue turns out not to be specific to RunPCA(); it already appears at NormalizeData(). I am using conda and Slurm, and I have submitted a separate issue here: https://github.com/satijalab/seurat/issues/8604