JEFworks-Lab / MERINGUE

characterizing spatial gene expression heterogeneity in spatially resolved single-cell transcriptomics data with nonuniform cellular densities
https://jef.works/MERINGUE
GNU General Public License v3.0

Spatially-informed transcriptional clustering with MERINGUE tutorial #10

Closed orrzor closed 2 years ago

orrzor commented 2 years ago

Hi Jean and team,

Thanks for making this cool work available! I am working through your spatially-informed clustering tutorial, and I am not getting the expected results when I run the clustering, but everything before this step matches your tutorial outputs.

W <- getSpatialNeighbors(pos, filterDist = 2)
plotNetwork(pos, W)
com2 <- getSpatiallyInformedClusters(pcs, W=W, k=50)
table(com2)
plotEmbedding(pos, groups=com2, main='Spatially Aware Transcriptional Clusters', xlab='Spatial X', ylab='Spatial Y')

I am just copying and pasting the code exactly from your tutorial. [image: k50]

If I reduce k from 50 to 14, I get 3 clusters, but they are not spatially distinct. [image: k14]

Below is my session info. Thanks for any help!! Best- Orr

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] uwot_0.1.10  Matrix_1.3-4 MERINGUE_1.0

loaded via a namespace (and not attached):
 [1] mclust_5.4.7          ggrepel_0.9.1         Rcpp_1.0.7            riverplot_0.10       
 [5] lattice_0.20-45       ica_1.0-2             tidyr_1.1.4           FNN_1.1.3            
 [9] assertthat_0.2.1      digest_0.6.28         foreach_1.5.1         utf8_1.2.2           
[13] R6_2.5.1              plyr_1.8.6            magic_1.5-9           dynamicTreeCut_1.63-1
[17] evaluate_0.14         httr_1.4.2            ggplot2_3.3.5         pillar_1.6.3         
[21] rlang_0.4.12          irlba_2.3.3           hdf5r_1.3.4           rmarkdown_2.11       
[25] splines_4.0.3         Rtsne_0.15            rliger_1.0.0          igraph_1.2.7         
[29] bit_4.0.4             munsell_0.5.0         tinytex_0.34          compiler_4.0.3       
[33] xfun_0.26             pkgconfig_2.0.3       mgcv_1.8-37           htmltools_0.5.2      
[37] tidyselect_1.1.1      tibble_3.1.5          RANN_2.6.1            codetools_0.2-18     
[41] fansi_0.5.0           crayon_1.4.1          dplyr_1.0.7           grid_4.0.3           
[45] nlme_3.1-153          gtable_0.3.0          lifecycle_1.0.1       DBI_1.1.1            
[49] magrittr_2.0.1        scales_1.1.1          remotes_2.4.1         sp_1.4-5             
[53] doParallel_1.0.16     akima_0.6-2.2         geometry_0.4.5        ellipsis_0.3.2       
[57] generics_0.1.0        vctrs_0.3.8           cowplot_1.1.1         RColorBrewer_1.1-2   
[61] iterators_1.0.13      tools_4.0.3           bit64_4.0.5           glue_1.4.2           
[65] purrr_0.3.4           abind_1.4-5           parallel_4.0.3        fastmap_1.1.0        
[69] yaml_2.2.1            colorspace_2.0-2      knitr_1.36            patchwork_1.1.1      

JEFworks commented 2 years ago

Dear Orr,

Great running into you! Hope all is well!

Thanks for the well documented issue report.

As you know, getSpatiallyInformedClusters integrates spatially informed weights into transcriptional graph-based community detection. The weight is defined as weight <- 1/(alpha + as.vector(pweight)) + beta, where pweight is the Voronoi-graph distance. Note that there are two parameters, alpha and beta.

It looks like when we made the simulation, the default alpha and beta were both 0, but we have since changed the defaults to 1 (primarily to prevent 0 pweights seen in real data from causing crashes). So if you set alpha and beta back to 0, you should get the results you see in the tutorial.

com2 <- getSpatiallyInformedClusters(pcs, W=W, k=50, alpha=0, beta=0)
table(com2)
plotEmbedding(pos, groups=com2, main='Spatially Aware Transcriptional Clusters', xlab='Spatial X', ylab='Spatial Y')
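
To make the effect of the default change concrete, the weight formula quoted above can be evaluated directly in base R. This is only a minimal sketch: `spatial_weight` is a hypothetical helper name (not a MERINGUE function), and the pweight values are made up for illustration. It shows that a pweight of 0 gives an infinite weight when alpha = 0, which is presumably the kind of value the newer defaults guard against.

```r
# Hypothetical helper that mirrors the formula quoted in this thread.
# It is NOT part of the MERINGUE API.
spatial_weight <- function(pweight, alpha, beta) {
  1 / (alpha + as.vector(pweight)) + beta
}

# Made-up graph distances, including a 0 as might occur in real data.
pweight <- c(0, 1, 2)

w_old <- spatial_weight(pweight, alpha = 0, beta = 0)  # former defaults
w_new <- spatial_weight(pweight, alpha = 1, beta = 1)  # current defaults

print(w_old)             # first entry is Inf because 1/0 is Inf in R
print(w_new)             # all entries are finite
print(is.finite(w_old))
```

With alpha = 0 the formula divides by the raw distance, so any zero distance produces `Inf`; with alpha = 1 the denominator is always at least 1.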


Hope this helps!

Stay healthy and safe, Jean

orrzor commented 2 years ago

Hi Jean,

Thanks so much for your response! That completely fixes the issue, and it's helpful to know more about these weights. I'm looking forward to trying this on some Slide-Seq data. I hope you're doing great at Hopkins! Take care! -Orr

bmill3r commented 2 years ago

I explicitly added com2 <- getSpatiallyInformedClusters(pcs, W=W, k=50, alpha=0, beta=0) to the spatial_clustering tutorial to help with this confusion. Everything seems to be working now.

orrzor commented 2 years ago

Brendan, do you have any heuristics for how you set alpha and beta? I might try a range of values now that I know it makes a difference. Thanks for any thoughts!

bmill3r commented 2 years ago

Hi Orr,

Thanks for your question! It's actually an interesting one that deserves some deeper exploration. We haven't thoroughly tested how modulating alpha and beta affects the spatially informed weights and the formation of clusters, but ultimately it has to do with how these spatially informed edge weights are incorporated into the clustering algorithm. By default, getSpatiallyInformedClusters uses the igraph::cluster_louvain method, which might give some insight into this.

As mentioned above, pweights are Voronoi-graph distances between cells, in which two cells have a spatial distance of 1 if they are neighbors, 2 if they are neighbors of neighbors, etc. So given alpha = beta = 0, and distances of 1 and 2, the edge weights would be 1/1 = 1 and 1/2 = 0.5. Increasing alpha would make these smaller fractions and increasing beta would make these larger.
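
The arithmetic above can be tabulated for a few parameter settings. The following base-R sketch is illustrative only: `edge_weight` is a hypothetical helper that applies the formula from this thread, and the alpha and beta values are arbitrary choices rather than recommendations.

```r
# Hypothetical helper applying the thread's formula; not a MERINGUE function.
edge_weight <- function(d, alpha = 0, beta = 0) {
  1 / (alpha + d) + beta
}

# Arbitrary illustrative settings (alpha varies fastest in expand.grid).
params <- expand.grid(alpha = c(0, 1, 10), beta = c(0, 1))

# Weights for direct neighbors (d = 1) and neighbors of neighbors (d = 2).
params$w_d1 <- edge_weight(1, params$alpha, params$beta)
params$w_d2 <- edge_weight(2, params$alpha, params$beta)

# Relative contrast between the two distances.
params$ratio <- params$w_d1 / params$w_d2

print(params)
```

In this table, the ratio between the two weights is 2 at alpha = beta = 0 and moves toward 1 as either parameter grows, so larger values weight neighbors and neighbors of neighbors more similarly. How that translates into cluster assignments depends on the clustering step and is not shown here.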

In the spatial_clustering tutorial, it seems that the 3 spatial clusters are identified as long as beta = 0, and it doesn't seem to matter whether alpha is 1, 10, or even 100. To give you a sense of other values: for the spatially aware clustering of the ISH Drosophila melanogaster data, alpha = beta = 0.01; for the spatially aware clustering of the MERFISH mouse hypothalamic preoptic region data, alpha = beta = 1.

Hope this helps a little and please reach out if you have any other questions, Brendan