lmweber / ggspavis

Visualization functions for spatial transcriptomics data in R
MIT License

plotSpots Overlays multiple samples #5

Closed boyiguo1 closed 2 years ago

boyiguo1 commented 2 years ago

Problem Statement: When calling plotSpots using a SpatialExperiment object that contains multiple samples, plotSpots will not create a panel of individual spot plots. Instead, it overlays all samples on top of each other within the same plot.

Minimal Reproducible Example:

library(SpatialExperiment)
#> Loading required package: SingleCellExperiment
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#> 
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#> 
#>     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#>     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#>     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#>     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#>     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#>     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#>     colWeightedMeans, colWeightedMedians, colWeightedSds,
#>     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#>     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#>     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#>     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#>     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#>     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#>     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#>     rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
#>     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
#>     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
#>     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#>     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
#>     union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> 
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:base':
#> 
#>     expand.grid, I, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#> 
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> 
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#> 
#>     rowMedians
#> The following objects are masked from 'package:matrixStats':
#> 
#>     anyMissing, rowMedians
library(STexampleData)
#> Loading required package: ExperimentHub
#> Loading required package: AnnotationHub
#> Loading required package: BiocFileCache
#> Loading required package: dbplyr
#> 
#> Attaching package: 'AnnotationHub'
#> The following object is masked from 'package:Biobase':
#> 
#>     cache
spe <- Visium_humanDLPFC()

spe2 <- Visium_humanDLPFC()
spe2$sample_id <- "sample_replicate"
spe2 <- spe2[, colData(spe2)$in_tissue==TRUE]

# Create a shift on the y-axis
spatialCoords(spe2)[,"pxl_row_in_fullres"] <- 
  spatialCoords(spe2)[,"pxl_row_in_fullres"] + 200

# Create a spe object with 2 tissue sections
spe_multi <- cbind(spe, spe2)

library(ggspavis)
#> Loading required package: ggplot2
#> Registered S3 method overwritten by 'ggside':
#>   method from   
#>   +.gg   ggplot2
plotSpots(spe_multi, annotate = "sample_id")

Created on 2022-09-15 with reprex v2.0.2

Proposed Solution: Use an iterator to subset each sample from the SpatialExperiment object and create an individual plot. Afterward, use a function to arrange the individual spot plots into a panel.

Temporary Solution

# Create Spot Plots for Multiple Samples ----------------------------------

  library(ggspavis)
  library(ggplot2)      # ggtitle()
  library(purrr)        # Iterator for sample_ids
  library(ggpubr)       # Create a grid of plots

  unique_ids <- unique(spe_multi$sample_id)
  if(length(unique_ids) > 1)
    message("Current version of ggspavis doesn't support plotting multiple samples; plotting each sample separately.")

  # Iterate over each sample_id and create one spot plot per sample
  plot_lst <- purrr::map(unique_ids, .f = function(tmp_id){
    # Subset sample based on their id
    spe_sub <- spe_multi[, spe_multi$sample_id == tmp_id]
    tmp_plot <- plotSpots(spe_sub) # TODO: passing in the other arguments
    # Title each panel with its own sample id and return the plot
    tmp_plot + ggtitle(tmp_id)
  })

  # Combine the individual plots into one panel
  output_plot <- ggarrange(plotlist = plot_lst)

  annotate_figure(output_plot, 
                  top = text_grob("Spatial Coordinates", 
                                  color = "black", face = "bold", size = 14))

Potential Optimization: It would be optimal to replace the current version of plotSpots with a version that is more robust to multiple-sample SpatialExperiment objects. To minimize package dependencies, it is possible to replace the iterator purrr::map with lapply, and ggpubr::ggarrange with some function from the package grid.
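As a rough sketch of that dependency-light variant, the snippet below uses lapply() and a grid layout in place of purrr::map() and ggpubr::ggarrange(). To keep it self-contained, a toy data frame (`spots`, with made-up coordinates and sample names) stands in for spe_multi, and a plain ggplot2 scatter plot stands in for plotSpots(spe_sub); both are assumptions for illustration, not part of ggspavis.

```r
library(ggplot2)
library(grid)

# Toy stand-in for the spot coordinates of a two-sample object (hypothetical data)
spots <- data.frame(
  x = rep(1:5, times = 2),
  y = rep(5:1, times = 2),
  sample_id = rep(c("sample_A", "sample_replicate"), each = 5)
)

# One plot per sample, using lapply() instead of purrr::map()
unique_ids <- unique(spots$sample_id)
plot_lst <- lapply(unique_ids, function(tmp_id) {
  spots_sub <- spots[spots$sample_id == tmp_id, ]
  # In the real workflow this call would be plotSpots(spe_sub)
  ggplot(spots_sub, aes(x = x, y = y)) +
    geom_point() +
    coord_fixed() +
    ggtitle(tmp_id)
})

# Arrange the plots side by side with a grid layout instead of ggpubr::ggarrange()
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow = 1, ncol = length(plot_lst))))
for (i in seq_along(plot_lst)) {
  print(plot_lst[[i]], vp = viewport(layout.pos.row = 1, layout.pos.col = i))
}
popViewport()
```

The grid package ships with base R, so this route adds no dependency beyond ggplot2, which ggspavis already requires.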

Session Info:

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] shiny_1.7.2 ggspavis_1.2.0
[3] ggplot2_3.3.6 STexampleData_1.4.5
[5] ExperimentHub_2.4.0 AnnotationHub_3.4.0
[7] BiocFileCache_2.4.0 dbplyr_2.2.1
[9] SpatialExperiment_1.6.1 SingleCellExperiment_1.18.0
[11] SummarizedExperiment_1.26.1 Biobase_2.56.0
[13] GenomicRanges_1.48.0 GenomeInfoDb_1.32.4
[15] IRanges_2.30.1 S4Vectors_0.34.0
[17] BiocGenerics_0.42.0 MatrixGenerics_1.8.1
[19] matrixStats_0.62.0

loaded via a namespace (and not attached):
[1] colorspace_2.0-3 rjson_0.2.21
[3] ellipsis_0.3.2 scuttle_1.6.3
[5] XVector_0.36.0 fs_1.5.2
[7] rstudioapi_0.14 farver_2.1.1
[9] bit64_4.0.5 interactiveDisplayBase_1.34.0
[11] AnnotationDbi_1.58.0 fansi_1.0.3
[13] codetools_0.2-18 R.methodsS3_1.8.2
[15] sparseMatrixStats_1.8.0 cachem_1.0.6
[17] knitr_1.40 jsonlite_1.8.0
[19] png_0.1-7 R.oo_1.25.0
[21] HDF5Array_1.24.2 BiocManager_1.30.18
[23] clipr_0.8.0 compiler_4.2.1
[25] httr_1.4.4 dqrng_0.3.0
[27] assertthat_0.2.1 Matrix_1.4-1
[29] fastmap_1.1.0 limma_3.52.2
[31] cli_3.4.0 later_1.3.0
[33] htmltools_0.5.3 tools_4.2.1
[35] gtable_0.3.1 glue_1.6.2
[37] GenomeInfoDbData_1.2.8 dplyr_1.0.10
[39] rappdirs_0.3.3 Rcpp_1.0.9
[41] jquerylib_0.1.4 vctrs_0.4.1
[43] Biostrings_2.64.1 rhdf5filters_1.8.0
[45] DelayedMatrixStats_1.18.0 xfun_0.32
[47] ps_1.7.1 beachmat_2.12.0
[49] mime_0.12 miniUI_0.1.1.1
[51] lifecycle_1.0.2 edgeR_3.38.4
[53] zlibbioc_1.42.0 scales_1.2.1
[55] promises_1.2.0.1 parallel_4.2.1
[57] rhdf5_2.40.0 yaml_2.3.5
[59] curl_4.3.2 memoise_2.0.1
[61] sass_0.4.2 RSQLite_2.2.16
[63] BiocVersion_3.15.2 highr_0.9
[65] filelock_1.0.2 BiocParallel_1.30.3
[67] ggside_0.2.1 rlang_1.0.5
[69] pkgconfig_2.0.3 bitops_1.0-7
[71] evaluate_0.16 lattice_0.20-45
[73] purrr_0.3.4 Rhdf5lib_1.18.2
[75] labeling_0.4.2 processx_3.7.0
[77] bit_4.0.4 tidyselect_1.1.2
[79] magrittr_2.0.3 R6_2.5.1
[81] magick_2.7.3 generics_0.1.3
[83] DelayedArray_0.22.0 DBI_1.1.3
[85] pillar_1.8.1 withr_2.5.0
[87] KEGGREST_1.36.3 RCurl_1.98-1.8
[89] tibble_3.1.8 crayon_1.5.1
[91] DropletUtils_1.16.0 utf8_1.2.2
[93] rmarkdown_2.16 locfit_1.5-9.6
[95] grid_4.2.1 callr_3.7.2
[97] blob_1.2.3 reprex_2.0.2
[99] digest_0.6.29 xtable_1.8-4
[101] httpuv_1.6.6 R.utils_2.12.0
[103] munsell_0.5.0 bslib_0.4.0

lmweber commented 2 years ago

Thank you! I will have a look at this. Yes, I think minimizing dependencies in general is a good idea.

MarcElosua commented 2 years ago

Same thing happens to me with plotMolecules but not with plotVisium. Would it make sense to standardize this?

plotMolecules(spe, molecule = "Cd3d", size = 1.5)

image

plotVisium(spe, fill = "nCount_Spatial", spots = TRUE)

image

lmweber commented 2 years ago

Thanks for the example code above for plotSpots(), @boyiguo1, and thanks also @MarcElosua for mentioning plotMolecules().

I have now fixed this with a simple solution: facet_wrap(), along with a default column name containing the sample IDs for faceting.

Addressed in commit 5c25378619c2b1f5bf27c9c1b416128cabba093c in ggspavis version 1.3.1.
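
For readers unfamiliar with faceting, the idea can be illustrated roughly as follows. This is a minimal sketch, not the code from the commit: the toy data frame `spots` and its sample names are assumptions made for illustration, and a plain ggplot2 scatter plot stands in for the plot that plotSpots() builds internally.

```r
library(ggplot2)

# Toy stand-in for spot coordinates from two samples (hypothetical data)
spots <- data.frame(
  x = rep(1:5, times = 2),
  y = rep(5:1, times = 2),
  sample_id = rep(c("sample_A", "sample_replicate"), each = 5)
)

# Without facet_wrap(), both samples would be drawn in the same panel;
# faceting on the sample ID column gives each sample its own panel
p <- ggplot(spots, aes(x = x, y = y, color = sample_id)) +
  geom_point() +
  coord_fixed() +
  facet_wrap(~ sample_id)

print(p)
```

Compared with building one plot per sample and arranging them afterward, faceting keeps a single ggplot object and needs no extra packages.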