alanocallaghan / scater

Clone of the Bioconductor repository for the scater package.
https://bioconductor.org/packages/devel/bioc/html/scater.html

Quantile cutoffs in UMAP plots #197

Open jfreimer opened 1 year ago

jfreimer commented 1 year ago

Hi,

Both Seurat and Scanpy have a feature to pass a quantile cutoff when plotting gene expression on a UMAP. I was wondering if a similar feature could be added to scater? It helps keep a few cells with very low or very high expression from skewing the scale bar.

In Seurat: Calculate feature-specific contrast levels based on quantiles of non-zero expression.

FeaturePlot(pbmc3k.final, features = c("MS4A1", "PTPRCAP"), min.cutoff = "q10", max.cutoff = "q90")

In Scanpy: the maximum value plotted can be adjusted using vmax (similarly, vmin can be used for the minimum value). In this case we use p99, which means using the 99th percentile as the maximum value. The maximum can be a number, or a list of numbers if vmax should be set individually for multiple plots.

sc.pl.umap(pbmc, color=['MS4A1', 'PTPRCAP'], vmin='p10', vmax='p99')

Thank you

alanocallaghan commented 1 year ago

Can you provide an example output plot?

jfreimer commented 1 year ago

Sure, here is a UMAP with and without the quantile cutoff from the Seurat tutorial:

SeuratData::InstallData("pbmc3k")
library(Seurat)
library(SeuratData)
library(ggplot2)
library(patchwork)
data("pbmc3k.final")

FeaturePlot(pbmc3k.final, features = c("PTPRCAP"), min.cutoff = "q10", max.cutoff = "q90")

FeaturePlot(pbmc3k.final, features = c("PTPRCAP"))

umap_with_cutoff.pdf umap_without_cutoff.pdf

alanocallaghan commented 1 year ago

So just to be clear, the quantile cutoffs define the values at which you truncate the range of the data, i.e. for q10, anything below q10 is shown as being exactly q10?

jfreimer commented 1 year ago

Yes, I believe that is how both Seurat and Scanpy do it.
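To make the truncation concrete, here is a minimal sketch in plain Python. It assumes percentiles are computed with linear interpolation between ranks, which may not match exactly what Seurat or Scanpy do internally. The helper names `percentile` and `clamp_to_quantiles` are hypothetical and not part of Seurat, Scanpy, or scater.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) of values, using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def clamp_to_quantiles(values, q_min=10, q_max=90):
    """Truncate values so anything below/above the cutoffs is set to the cutoff."""
    lo = percentile(values, q_min)
    hi = percentile(values, q_max)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical expression values with one extreme outlier (25.0).
expression = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 25.0]
clamped = clamp_to_quantiles(expression, 10, 90)
print(min(clamped), max(clamped))
```

Only the values used for colouring are changed. No cells are dropped, so the outlier is drawn with the same colour as the upper cutoff and no longer stretches the scale.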

From the Scanpy docs (https://scanpy.readthedocs.io/en/stable/generated/scanpy.pl.umap.html#scanpy.pl.umap):

vmin : Union[str, float, Callable[[Sequence[float]], float], Sequence[Union[str, float, Callable[[Sequence[float]], float]]], None] (default: None)

The value representing the lower limit of the color scale. Values smaller than vmin are plotted with the same color as vmin. vmin can be a number, a string, a function or None. If vmin is a string and has the format pN, this is interpreted as vmin=percentile(N). For example, vmin='p1.5' is interpreted as the 1.5 percentile. If vmin is a function, then vmin is interpreted as the return value of the function over the list of values to plot. For example, to set vmin to the mean of the values to plot, define def my_vmin(values): return np.mean(values) and then set vmin=my_vmin. If vmin is None (default), an automatic minimum value is used as defined by the matplotlib scatter function. When making multiple plots, vmin can be a list of values, one for each plot. For example, vmin=[0.1, 'p1', None, my_vmin]
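The rules quoted above for a number, a "pN" string, a function, or None could be resolved along these lines. This is only a sketch of the documented behaviour, not Scanpy's actual code. `resolve_limit` and `percentile` are hypothetical helpers, and the percentile uses linear interpolation as an assumption.

```python
from statistics import mean


def percentile(values, q):
    """Return the q-th percentile (0-100) of values, using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def resolve_limit(spec, values):
    """Turn one vmin/vmax-style spec into a number, or None for 'automatic'."""
    if spec is None:
        return None                      # leave the limit to the plotting backend
    if callable(spec):
        return spec(values)              # e.g. a function returning the mean
    if isinstance(spec, str):
        if not spec.startswith("p"):
            raise ValueError(f"expected a string like 'p99', got {spec!r}")
        return percentile(values, float(spec[1:]))
    return float(spec)                   # a plain number is used as-is


def my_vmin(values):
    return mean(values)


values = [1.0, 2.0, 3.0, 4.0, 5.0]
# One spec per plot, mirroring the list form vmin=[0.1, 'p1', None, my_vmin].
limits = [resolve_limit(spec, values) for spec in [0.1, "p1", None, my_vmin]]
print(limits)
```

For scater, the same idea would presumably apply to the colour scale limits of the plot, with values outside the limits squished to the nearest limit and not dropped.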