pachterlab / voyager

From geospatial to spatial -omics
https://pachterlab.github.io/voyager/
Artistic License 2.0

Plot rowGeometries #14

Closed: lambdamoses closed 1 month ago

lambdamoses commented 11 months ago

It can be part of plotSpatialFeature and plotLocalResult, with one argument for which row geometry and another for which genes to plot. Then the burning question: what palette to use to color the genes? I suppose I'll use point shape (and possibly line type, since I can't predict how row geometries are used) and throw an error when more genes are specified than there are shapes.
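A minimal sketch of the "error when there are too many genes" check, assuming ggplot2's default discrete shape palette, which handles at most 6 values. `check_n_shapes()` and its `max_shapes` argument are hypothetical names for illustration, not part of Voyager.

```r
# Hypothetical helper: stop early if more genes are requested than there are
# distinguishable point shapes. The default of 6 matches ggplot2's default
# discrete shape palette; it is an assumption, not a Voyager setting.
check_n_shapes <- function(genes, max_shapes = 6L) {
  genes <- unique(genes)
  if (length(genes) > max_shapes) {
    stop("Only ", max_shapes, " point shapes are available, but ",
         length(genes), " genes were specified.", call. = FALSE)
  }
  invisible(genes)
}

# Passes silently for two genes
check_n_shapes(c("Myh2", "Myh7"))
```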

I also feel like plotSpatialFeature has too many arguments. While that's acceptable in R (see pheatmap and Seurat's plotting functions), I get that it's a code smell in other languages. At present, which geometry goes on which layer is also pretty inflexible. I think I can introduce a syntax like in ggplot2, plotly, or tmap to build the plot layer by layer. Something like

plotSpatialFeature(sfe) +
    lyr_colGeometry(aes(fill = Myh2), data = cellSeg) +
    lyr_annotGeometry(data = tissueBoundary, fill = NA, color = "black", linewidth = 0.3) +
    lyr_rowGeometry(data = txSpots, aes(shape = gene), subset = c("Myh2", "Myh7"))

This way, customizing the plot is less confusing for those already familiar with ggplot2, and it's easier to arrange the layers. I think I'll also use tidyeval to be consistent with ggplot2. ggplot2 is based on the tidy data philosophy. While there is tidyomics, SFE is more complicated than tidy data, so maybe I need to think more carefully about the philosophy behind the data structure. Sounds like a lot of work, so I'm not sure if I can get it done for Bioc 3.18.

lambdamoses commented 11 months ago

Or maybe just another ggplot method for SFE, with no separate functions for plotSpatialFeature and plotLocalResult. Those are just different fields of the SFE data, like glorified data frame columns. But meanwhile, I can't really do something like geom_polygon because any type of geometry can be used in SFE. Maybe geom_sfe, kind of like geom_sf, which works for different types of geometries? Then within geom_sfe I can get gene expression, colData, geometries, etc. Like

ggplot(sfe) +
    geom_sfe(data = colGeometry, aes(fill = Myh2, geometry = cellSeg)) +
    geom_sfe(data = annotGeometry, aes(geometry = tissueBoundary)) +
    geom_sfe(data = rowGeometry, aes(geometry = txSpots, shape = gene), subset = c("Myh2", "Myh7"))

alikhuseynov commented 11 months ago

If I may add: it's probably best to keep everything within plotSpatialFeature(). It's probably a code smell, but Seurat::ImageFeaturePlot also has lots of args for gene and molecule plots. I think going with ggplot2 is better, since many users are familiar with it. plotly would be great for interactive plots; I sometimes use plotly::ggplotly().

Additionally, to customize plots, it would be useful to return the ggplot object or a list of plots (if there are multiple). E.g., add an arg such as combine = FALSE to plotSpatialFeature() or similar: if TRUE, patchwork is used to combine the plots; otherwise the ggplot object (with $data etc.) is returned.
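A minimal sketch of the suggested `combine` switch, using a plain data frame rather than an SFE object. `plot_features()` is a hypothetical stand-in, not Voyager's plotSpatialFeature(), and the toy data is made up.

```r
library(ggplot2)
library(patchwork)

# Hypothetical wrapper: one scatter plot per feature, either combined with
# patchwork or returned as a named list of ggplot objects.
plot_features <- function(df, features, combine = TRUE) {
  plots <- lapply(features, function(f) {
    ggplot(df, aes(x = x, y = y, color = .data[[f]])) +
      geom_point() +
      ggtitle(f)
  })
  names(plots) <- features
  if (combine) patchwork::wrap_plots(plots) else plots
}

# Toy data standing in for cell centroids and two gene columns
set.seed(1)
df <- data.frame(x = runif(20), y = runif(20),
                 Myh2 = rpois(20, 5), Myh7 = rpois(20, 2))

# With combine = FALSE each plot can be customized on its own
plots <- plot_features(df, c("Myh2", "Myh7"), combine = FALSE)
plots$Myh2 <- plots$Myh2 + theme_bw()
```

Returning the list keeps each ggplot object, including its `$data`, available for later changes, while `combine = TRUE` gives a single patchwork object.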

lambdamoses commented 11 months ago

Do you like Seurat's design with lots of arguments? Or do you find it confusing? I'd like to do better than Seurat. My problem here is the potential confusion when coloring colGeometry with one variable, and coloring annotGeometry with another variable. Adding rowGeometry makes the problem worse.

For transcript spots, I think another thing that can be done is to plot a kernel smoothed density heatmap for spots of one gene at a time. That should probably be a separate function.

alikhuseynov commented 11 months ago

> Do you like Seurat's design with lots of arguments? Or do you find it confusing? I'd like to do better than Seurat. My problem here is the potential confusion when coloring colGeometry with one variable, and coloring annotGeometry with another variable. Adding rowGeometry makes the problem worse.
>
> For transcript spots, I think another thing that can be done is to plot a kernel smoothed density heatmap for spots of one gene at a time. That should probably be a separate function.

Once you are familiar with Seurat's design, it's not that difficult, but yeah, it can be confusing or annoying for users to look up so many args. A Gaussian smoothing kernel or similar would work well, giving some kind of hotspots of molecules for one gene. By a separate function, do you mean it will return a ggplot object that can be overlaid with plotSpatialFeature()? E.g.: plotSpatialFeature() & plotMolecules()

lambdamoses commented 10 months ago

Maybe something like plotSpatialFeature() + plotMolecules(). Or maybe the kernel density heatmap can go into imgData, which SpatialExperiment authors probably didn't foresee. I can have it both ways, with both Seurat style and ggplot/Tidyomics style.

alikhuseynov commented 10 months ago

> Maybe something like plotSpatialFeature() + plotMolecules(). Or maybe the kernel density heatmap can go into imgData, which SpatialExperiment authors probably didn't foresee. I can have it both ways, with both Seurat style and ggplot/Tidyomics style.

Sounds good 👍 So the kernel density heatmap goes into imgData, and is then also available as an overlay?

alikhuseynov commented 8 months ago

For a future plotMolecules() function: the extent (bbox) of the transcripts in rowGeometry is larger than that of colGeometry and of imgData. Something to consider when overlaying them in plots.
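A minimal sketch of one way to handle the differing extents, using sf directly with made-up geometries in place of real colGeometry and rowGeometry data. The approach shown (restricting the plot window to the cells' bounding box) is just one option and is not taken from Voyager.

```r
library(sf)
library(ggplot2)

# Made-up stand-ins: one square "cell" and transcript spots that extend
# beyond it
cells <- st_sf(geometry = st_sfc(st_polygon(list(rbind(
  c(10, 10), c(90, 10), c(90, 90), c(10, 90), c(10, 10)
)))))
spots <- st_sf(geometry = st_sfc(
  st_point(c(0, 0)), st_point(c(50, 50)), st_point(c(100, 120))
))

bb_cells <- st_bbox(cells)
bb_spots <- st_bbox(spots)

# The spots' extent is larger, so limit the plot window to the cells' bbox
xlim <- c(bb_cells[["xmin"]], bb_cells[["xmax"]])
ylim <- c(bb_cells[["ymin"]], bb_cells[["ymax"]])

p <- ggplot() +
  geom_sf(data = cells, fill = NA) +
  geom_sf(data = spots) +
  coord_sf(xlim = xlim, ylim = ylim)
```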

alikhuseynov commented 8 months ago

An example of a potential plotMolecules() to overlay with plotSpatialFeature(): the Gaussian KDE per gene is calculated with MASS::kde2d; the resulting z is a matrix and is used for aes(fill = z). Alternatively, the fields package can be used as well -> https://github.com/NCAR/fields/blob/master/vignette/smooth.Rmd

The 1st plot is just points with densities; the 2nd is the KDE.

Example for 2 genes:

[Image: screenshot, 2023-12-01 11:39:26]
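The KDE approach described above can be sketched roughly as follows, for a single gene and with simulated spot coordinates. The data, grid size, and color scale are assumptions for illustration; this is not the code that produced the screenshot.

```r
library(MASS)
library(ggplot2)

set.seed(1)
# Simulated transcript spot coordinates for one gene (made-up data)
spots <- data.frame(x = rnorm(500, mean = 50, sd = 10),
                    y = rnorm(500, mean = 50, sd = 10))

# 2D Gaussian KDE on a 100 x 100 grid; kde$z is a 100 x 100 matrix whose
# rows correspond to kde$x and whose columns correspond to kde$y
kde <- MASS::kde2d(spots$x, spots$y, n = 100)

# Long format for ggplot2: expand.grid varies x fastest, which matches the
# column-major order of as.vector(kde$z)
kde_df <- expand.grid(x = kde$x, y = kde$y)
kde_df$z <- as.vector(kde$z)

p <- ggplot(kde_df, aes(x = x, y = y, fill = z)) +
  geom_raster() +
  scale_fill_viridis_c(name = "density") +
  coord_equal()
```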

lambdamoses commented 1 month ago

Already done in https://github.com/pachterlab/voyager/commit/06990b26eb4416814e298300a4004efa1e0d4a96, though not the KDE part yet