feiyoung / PRECAST_Analysis

Main results in PRECAST
GNU General Public License v3.0

Sharing the data conversion for Brain12 #1

Closed boyiguo1 closed 2 years ago

boyiguo1 commented 2 years ago

I recently came across your preprint "Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST". It is a novel method with great potential, and I would like to apply it to my data.

However, I'm having a hard time converting my data from the SpatialExperiment class into an input for createPRECASTObject. Since you have already analyzed the spatialDLPFC data, would you mind sharing the code you used to convert the data, in whatever form spatialDLPFC provides it, into an input for createPRECASTObject?

Thank you very much!

feiyoung commented 2 years ago

Thank you for your interest! For each DLPFC sample, you can create a Seurat object from the count matrix and spatial coordinates in the SpatialExperiment object; its meta.data has two columns, named 'row' and 'col', for the spatial coordinates. Then put the Seurat objects into a list (seuList), and finally pass that list to the createPRECASTObject function. More details can be found at https://feiyoung.github.io/PRECAST/articles/. I hope this solves your problem.
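The steps described above can be sketched roughly as follows. This is only a minimal sketch on simulated data, not the authors' code: the `make_toy_sample` helper, the toy counts, and the sample names are all hypothetical, and the Seurat step is skipped when Seurat is not installed. The resulting list would then be handed to PRECAST's object constructor as described in the package articles.

```r
# Minimal sketch on simulated data (helper and sample names are hypothetical).
set.seed(1)

# Hypothetical helper: simulate a count matrix and spot coordinates for one sample.
make_toy_sample <- function(n_genes = 50, n_spots = 30, prefix = "s") {
  counts <- matrix(rpois(n_genes * n_spots, lambda = 2),
                   nrow = n_genes, ncol = n_spots,
                   dimnames = list(paste0("gene", seq_len(n_genes)),
                                   paste0(prefix, "_spot", seq_len(n_spots))))
  # The coordinates go into meta.data columns named 'row' and 'col';
  # row names must match the spot (column) names of the count matrix.
  meta <- data.frame(row = sample(1:10, n_spots, replace = TRUE),
                     col = sample(1:10, n_spots, replace = TRUE),
                     row.names = colnames(counts))
  list(counts = counts, meta = meta)
}

samples <- list(sampleA = make_toy_sample(prefix = "A"),
                sampleB = make_toy_sample(prefix = "B"))

if (requireNamespace("Seurat", quietly = TRUE)) {
  # One Seurat object per sample, collected in a list (seuList).
  seuList <- lapply(samples, function(s) {
    Seurat::CreateSeuratObject(counts = s$counts, meta.data = s$meta)
  })
}
```

With real data, the simulated `counts` and `meta` would be replaced by the count matrix and coordinates taken from each sample's SpatialExperiment object.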

boyiguo1 commented 2 years ago

That's great! Thanks for sharing, and for your prompt reply. I totally overlooked the articles when browsing the website, so I am grateful that you pointed them out to me!

Yes, this solved my problem.

Just a side note, I am experimenting with the function SeuratObject::as.Seurat which seems to be another way to convert without manipulating the data matrices.

Overall, thank you very much!

boyiguo1 commented 2 years ago

One related problem with using SeuratObject::as.Seurat to convert a SpatialExperiment object to a Seurat object for use in PRECAST is that the assay name would not be RNA, which causes downstream problems when working with the createPRECASTObject function.

For example, the error message may read:

Filter spots and features from Raw count data...
Error in `[.data.frame`(seu@meta.data, , col_name) :
  undefined columns selected

This is because the filtering steps in the createPRECASTObject function require the assay name to be RNA by default (see https://github.com/feiyoung/PRECAST/blob/04c5e51d44299a81d9b71622d96671f7013f167e/R/SetClass.R#L222).
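To illustrate how such an error can arise, here is a small base-R sketch. It uses a toy data frame as a stand-in for seu@meta.data; the assay name "originalexp" and the exact column names are assumptions for illustration, not taken from the PRECAST source.

```r
# Toy stand-in for seu@meta.data when the assay is NOT named "RNA".
# The assay name "originalexp" is an illustrative assumption.
meta <- data.frame(nCount_originalexp   = c(120, 85),
                   nFeature_originalexp = c(40, 31),
                   row.names = c("spot1", "spot2"))

# Column name that a filter hard-coded to assay = "RNA" would look up
# (assumed here for illustration).
col_name <- paste0("nFeature_", "RNA")

# Selecting a column that does not exist raises the error; capture its message.
res <- tryCatch(meta[, col_name],
                error = function(e) conditionMessage(e))
print(res)

# A simple pre-flight check that avoids the error.
has_col <- col_name %in% colnames(meta)
print(has_col)
```

Checking for the expected columns before the call makes the mismatch in assay names visible before the filtering step fails.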

I wrote these wrapper functions to convert a SpatialExperiment object (possibly with multiple samples in the same object) to a list of Seurat objects (seuList), for anyone who wants to use them:

library(dplyr)
library(purrr)
library(Seurat)
library(SpatialExperiment)
library(PRECAST)

# Convert a single-sample SpatialExperiment to a Seurat object
spe_to_seurat <- function(spe){
  coords <- spatialCoords(spe)
  ret <- CreateSeuratObject(
    counts = assays(spe)$counts,
    # createPRECASTObject expects the coordinates in columns 'row' and 'col';
    # row names are set explicitly so they match the spot names in the counts
    meta.data = data.frame(row = coords[, 1],
                           col = coords[, 2],
                           row.names = colnames(spe))
  )

  ##### as.Seurat doesn't work well because of assay="RNA" in createPRECASTObject ######
  # sce <- SingleCellExperiment(list(counts=assays(spe)$counts,
  #                                  logcounts = assays(spe)$logcounts),
  #                             colData=DataFrame(row=colData(spe)$array_row,
  #                                               col=colData(spe)$array_col)
  #                             )
  #
  #   ret <- as.Seurat(sce, assay="RNA", project ="SingleCellExperiment")

  return(ret)
}

# Convert a (possibly multi-sample) SpatialExperiment to a list of Seurat objects
spe_to_seuratList <- function(spe){
  uniq_sample_id <- colData(spe)$sample_id |> unique()

  # Create a Seurat object for each unique sample_id
  map(uniq_sample_id,
      .f = function(smp_id, spe){
        ret_spe <- spe[, colData(spe)$sample_id == smp_id]
        spe_to_seurat(ret_spe)
      },
      spe = spe)
}

# Usage: call after both functions are defined; `spe` is a SpatialExperiment
# object whose colData has a sample_id column
seuList <- spe_to_seuratList(spe)

Session Info

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] DataPRECAST_0.1.0 PRECAST_1.2 gtools_3.9.3 sp_1.5-0
[5] SeuratObject_4.1.2 Seurat_4.2.0 forcats_0.5.2 stringr_1.4.1
[9] dplyr_1.0.10 purrr_0.3.4 readr_2.1.2 tidyr_1.2.1
[13] tibble_3.1.8 ggplot2_3.3.6 tidyverse_1.3.2 SpatialExperiment_1.6.1
[17] SingleCellExperiment_1.18.0 SummarizedExperiment_1.26.1 Biobase_2.56.0 GenomicRanges_1.48.0
[21] GenomeInfoDb_1.32.4 IRanges_2.30.1 S4Vectors_0.34.0 BiocGenerics_0.42.0
[25] MatrixGenerics_1.8.1 matrixStats_0.62.0

loaded via a namespace (and not attached):
[1] utf8_1.2.2 reticulate_1.26 R.utils_2.12.0 tidyselect_1.1.2
[5] htmlwidgets_1.5.4 grid_4.2.1 BiocParallel_1.30.3 Rtsne_0.16
[9] DropletUtils_1.16.0 ScaledMatrix_1.4.1 munsell_0.5.0 codetools_0.2-18
[13] ica_1.0-3 future_1.28.0 miniUI_0.1.1.1 withr_2.5.0
[17] spatstat.random_2.2-0 colorspace_2.0-3 progressr_0.11.0 rstudioapi_0.14
[21] ROCR_1.0-11 tensor_1.5 listenv_0.8.0 GenomeInfoDbData_1.2.8
[25] polyclip_1.10-0 rhdf5_2.40.0 parallelly_1.32.1 vctrs_0.4.1
[29] generics_0.1.3 ggthemes_4.2.4 R6_2.5.1 ggbeeswarm_0.6.0
[33] rsvd_1.0.5 locfit_1.5-9.6 bitops_1.0-7 rhdf5filters_1.8.0
[37] spatstat.utils_2.3-1 DelayedArray_0.22.0 assertthat_0.2.1 promises_1.2.0.1
[41] scales_1.2.1 googlesheets4_1.0.1 beeswarm_0.4.0 rgeos_0.5-9
[45] gtable_0.3.1 beachmat_2.12.0 globals_0.16.1 goftest_1.2-3
[49] rlang_1.0.6 splines_4.2.1 lazyeval_0.2.2 gargle_1.2.1
[53] spatstat.geom_2.4-0 broom_1.0.1 reshape2_1.4.4 abind_1.4-5
[57] modelr_0.1.9 backports_1.4.1 httpuv_1.6.6 tools_4.2.1
[61] ellipsis_0.3.2 spatstat.core_2.4-4 RColorBrewer_1.1-3 ggridges_0.5.4
[65] Rcpp_1.0.9 plyr_1.8.7 sparseMatrixStats_1.8.0 zlibbioc_1.42.0
[69] RCurl_1.98-1.8 rpart_4.1.16 deldir_1.0-6 viridis_0.6.2
[73] pbapply_1.5-0 cowplot_1.1.1 zoo_1.8-11 haven_2.5.1
[77] ggrepel_0.9.1 cluster_2.1.4 fs_1.5.2 GiRaF_1.0.1
[81] magrittr_2.0.3 data.table_1.14.2 magick_2.7.3 scattermore_0.8
[85] lmtest_0.9-40 reprex_2.0.2 RANN_2.6.1 googledrive_2.0.0
[89] fitdistrplus_1.1-8 hms_1.1.2 patchwork_1.1.2 mime_0.12
[93] xtable_1.8-4 mclust_5.4.10 readxl_1.4.1 gridExtra_2.3
[97] scater_1.24.0 compiler_4.2.1 KernSmooth_2.23-20 crayon_1.5.1
[101] R.oo_1.25.0 htmltools_0.5.3 mgcv_1.8-40 later_1.3.0
[105] tzdb_0.3.0 lubridate_1.8.0 DBI_1.1.3 dbplyr_2.2.1
[109] MASS_7.3-58.1 Matrix_1.5-1 cli_3.4.1 R.methodsS3_1.8.2
[113] igraph_1.3.5 DR.SC_2.9 pkgconfig_2.0.3 plotly_4.10.0
[117] scuttle_1.6.3 spatstat.sparse_2.1-1 xml2_1.3.3 vipor_0.4.5
[121] dqrng_0.3.0 XVector_0.36.0 CompQuadForm_1.4.3 rvest_1.0.3
[125] digest_0.6.29 sctransform_0.3.5 RcppAnnoy_0.0.19 spatstat.data_2.2-0
[129] cellranger_1.1.0 leiden_0.4.3 uwot_0.1.14 edgeR_3.38.4
[133] DelayedMatrixStats_1.18.0 curl_4.3.2 shiny_1.7.2 rjson_0.2.21
[137] lifecycle_1.0.2 nlme_3.1-159 jsonlite_1.8.0 Rhdf5lib_1.18.2
[141] BiocNeighbors_1.14.0 viridisLite_0.4.1 limma_3.52.3 fansi_1.0.3
[145] pillar_1.8.1 lattice_0.20-45 fastmap_1.1.0 httr_1.4.4
[149] survival_3.4-0 remotes_2.4.2 glue_1.6.2 png_0.1-7
[153] stringi_1.7.8 HDF5Array_1.24.2 BiocSingular_1.12.0 irlba_2.3.5
[157] future.apply_1.9.1

I also think a related problem is that every assay argument defaults to RNA when creating a PRECAST object; this should be highlighted in the vignettes or addressed in the next release of the PRECAST package.

Thanks.

feiyoung commented 2 years ago

Thank you, Boyi! We will update the package according to your valuable feedback.