Closed ghoshal closed 6 years ago
Hi,
You can decrease the size of your points, and use pch.use = 20
to get smaller dots.
Regarding the plotted data, FeaturePlot()
plots the log-normalized data, whereas FeatureHeatmap()
plots scaled data.
Best, Leon
Hi Leon,
Thanks for your response. I used the recommended parameters but the contrast is still very poor. I was thinking of putting a min.cutoff to remove the lowly expressed cells. I couldn't find the cutoff format in the documentation. Can you please help me with that?
I want to select cells with an expression of >2.
Thanks,
Biba
Also, is there a way to plot a group of genes belonging to one pathway in a single feature plot or heatmap based on their average/median expression?
Hi,
I wrote a script that tries to do two things at once. It allows either to add a transparency to all your points, which you can define by setting a value below 1 to alpha.use
. This option will help you keep all your cells in the plots, but colours become difficult to read in this case. Or, you can set a gene expression cutoff by passing the desired cutoff value to expression.threshold
. In this latter case, cells having expression values < the cutoff will be removed from the plots. (You can also combine both options.) Provided that log transformed (using log(x + 1)
) normalized gene expression levels are plotted, the script assumes that the cutoff value passed to expression.threshold
is not log transformed (is.log1p.transformed = F
), and the value is log transformed using the default settings of the function. If you want the cutoff value to be applied on the log transformed data, you simply have to set is.log1p.transformed = TRUE
.
# Function to pseudo customize FeaturePlots
customize_Seurat_FeaturePlot <- function(p, alpha.use = 1, gradient.use = c("yellow", "red"), expression.threshold = NULL, is.log1p.transformed = F) {
#### Main function ####
main_function <- function(p = p, alpha.use = alpha.use, gradient.use = gradient.use, expression.threshold = expression.threshold, is.log1p.transformed = is.log1p.transformed) {
# Remove cells having an expression below a certain threshold
if (!is.null(expression.threshold)) {
if (isTRUE(is.log1p.transformed)) {
p$data <- p$data[p$data$gene >= expression.threshold,]
} else {
p$data <- p$data[p$data$gene >= log1p(expression.threshold),]
}
}
# Fill points using the gene expression levels
p$layers[[1]]$mapping$fill <- p$layers[[1]]$mapping$colour
# Define transparency of points
p$layers[[1]]$mapping$alpha <- alpha.use
# Change fill and colour gradient values
p <- p + scale_colour_gradientn(colours = gradient.use, guide = F) +
scale_fill_gradientn(colours = gradient.use, name = expression(atop(Expression, (log)))) +
scale_alpha_continuous(range = alpha.use, guide = F)
}
#### Execution of main function ####
# Apply main function on all features
p <- lapply(X = p, alpha.use = alpha.use, gradient.use = gradient.use,
expression.threshold = expression.threshold, is.log1p.transformed = is.log1p.transformed,
FUN = main_function)
# Arrange all plots using cowplot
# Adapted from Seurat
# https://github.com/satijalab/seurat/blob/master/R/plotting.R#L1100
# ncol argument from Josh O'Brien
# https://stackoverflow.com/questions/10706753/how-do-i-arrange-a-variable-list-of-plots-using-grid-arrange
cowplot::plot_grid(plotlist = p, ncol = ceiling(sqrt(length(p))))
}
# Plot features using FeaturePlot, and return individual plots using 'do.return = T'
# Store output in 'plot'
plot <- FeaturePlot(object = object, features.plot = c("Actb", "Rps9"), cols.use = c("yellow", "red"),
pt.size = 2, pch.use = 16, no.legend = F, do.return = T)
# Remove from plots all cells having an expression level below 2 (cutoff not log1p transformed)
customize_Seurat_FeaturePlot(p = plot, expression.threshold = 2)
Hope that helps!
Best, Leon
Regarding your second question, I am not aware of a Seurat function that can do that for you. I don't think that it is also common and good practice to compute median expression values of multiple genes for a given cell. In this case, you might want to use the scaled data from Seurat, and compute the median scaled expression levels of multiple genes. However, this is a completely different topic than what this issue was about.
Best, Leon
Hi again,
Sorry but I forgot that you were using FeatureHeatmap()
. The previous function works fine for FeaturePlot()
, but not for FeatureHeatmap()
. You can use the function below instead. There might be other ways of doing it too, or probably just integrate some of these functionalities in later releases of Seurat. But I agree that this is quite a particular case.
# Function to pseudo customize FeatureHeatmap
customize_Seurat_FeatureHeatmap <- function(p, alpha.use = 1, gradient.use = c("yellow", "red"), expression.threshold = 0, is.log1p.transformed = F) {
# Remove cells having an expression below a certain threshold
if (isTRUE(is.log1p.transformed)) {
p$data <- p$data[p$data$expression >= expression.threshold,]
} else {
p$data <- p$data[p$data$expression >= log1p(expression.threshold),]
}
# Fill points using the gene expression levels
p$layers[[1]]$mapping$fill <- p$layers[[1]]$mapping$colour
# Define transparency of points
p$layers[[1]]$mapping$alpha <- alpha.use
# Change fill and colour gradient values
p <- p + scale_colour_gradientn(colours = gradient.use, guide = F) +
scale_fill_gradientn(colours = gradient.use, name = expression(atop(Scaled, expression))) +
scale_alpha_continuous(range = alpha.use, guide = F)
# Return plot
p
}
# Plot features using FeatureHeatmap, and return plot using 'do.return = T'
# Store output in 'plot'
plot <- FeatureHeatmap(object = crypts_hi_res, features.plot = c("Clu", "Lgr5", "Mex3a", "Yap1"), group.by = "class", sep.scale = F, pt.size = 1.5, cols.use = c("gray97", "blue"), pch.use = 1)
# Remove from plot all cells having an expression level below 2 (cutoff not log1p transformed)
# Filter is applied on log normalized data, but scaled data is plotted
customize_Seurat_FeatureHeatmap(plot, gradient.use = c("gray97", "blue"), expression.threshold = 2)
Best, Leon
Thanks a lot Leon. You have been a great help.
Hi Leon,
Both the customized plots are working well. The only thing is that the background cells (ones without any expression of the specified genes) are disappearing from the plots. It is important for me to keep the tSNE plot whole to make the comparisons between the clusters. Can you please tell me what I can do to plot the background as well?
Thanks,
Biba
Hello,
Could you please share the code you are using? Providing also a plot before and after the operation you are doing using the script I wrote would be also helpful for me to troubleshoot your issue.
Best, Leon
I understand. My bad! I wrote you a script that allows you to filter out the cells that have an expression level of the gene below the threshold that you specify using the expression.threshold
argument. In fact, what you want to do is far simpler than what I thought you want to do. In this case, you can simply reorder the data that is being plotted by the expression level of the genes, and plot the reordered data.
#### For FeatureHeatmap() ####
# Plot features using FeatureHeatmap, and return plot using 'do.return = T'
# Store output in 'plot'
plot <- FeatureHeatmap(object = crypts_hi_res, features.plot = c("Clu", "Lgr5", "Mex3a", "Yap1"), group.by = "class", sep.scale = F, pt.size = 1.5, cols.use = c("gray97", "blue"), pch.use = 1)
# Order data by gene expresion level
plot$data <- plot$data[order(plot$data$expression),]
# plot with reordered data
plot
#### For FeaturePlot() ####
# Plot features using FeaturePlot, and return individual plots using 'do.return = T'
# Store output in 'plot'
plot <- FeaturePlot(object = object, features.plot = c("Actb", "Rps9"), cols.use = c("yellow", "red"),
pt.size = 2, pch.use = 16, no.legend = F, do.return = T)
# For each element of plot, reorder the data and return plot
plot <- lapply(X = plot, function(p) { p$data <- p$data[order(p$data$gene),]; p})
# Arrange all plots using cowplot
# Adapted from Seurat
# https://github.com/satijalab/seurat/blob/master/R/plotting.R#L1100
# ncol argument adapted from Josh O'Brien
# https://stackoverflow.com/questions/10706753/how-do-i-arrange-a-variable-list-of-plots-using-grid-arrange
cowplot::plot_grid(plotlist = plot, ncol = ceiling(sqrt(length(plot))))
Hope that helps!
Best, Leon
Hi Leon,
The plots were working as needed by using the customized scripts that you had sent. I was intending to use an expression cutoff that I could do using your script. The ordering of the data did not particularly improve on what I wanted. Here is an example.
This is the original command. x <- FeatureHeatmap(object = crypts, features.plot = c("Clu", "Lgr5", "Mex3a", "Yap1"), group.by = "class", sep.scale = T, pt.size = 0.5, cols.use = c("gray98", "red"), pch.use = 20, do.return = T)
x$data <- x$data[order(x$data$expression),] x
Then, I run your expression.threshhold using the following command and get this.
customize_Seurat_FeatureHeatmap(x, alpha.use = 0.8,gradient.use = c("orangered", "red4"), expression.threshold = 1, is.log1p.transformed = T)
This is really and what I actually want to see at the gene level FeatureHeatmap. However, I would want the background tSNE plot to be there as well. So technically all cells with 0 expressions in the x$data will have a solid colour, so that the final plot looks something like this.
Thanks,
Biba
Hello,
Here you go.
#### For FeaturePlot() ####
# Function to pseudo customize FeaturePlots
customize_Seurat_FeaturePlot <- function(p, alpha.use = 1, gradient.use = c("yellow", "red"), expression.threshold = 0, is.log1p.transformed = F) {
#### Main function ####
main_function <- function(p = p, alpha.use = alpha.use, gradient.use = gradient.use, expression.threshold = expression.threshold, is.log1p.transformed = is.log1p.transformed) {
# Order data by gene expresion level
p$data <- p$data[order(p$data$gene),]
# Define lower limit of gene expression level
if (isTRUE(is.log1p.transformed)) {
expression.threshold <- expression.threshold
} else {
expression.threshold <- log1p(expression.threshold)
}
# Compute maximum value in gene expression
max.exp <- max(p$data$gene)
# Fill points using the gene expression levels
p$layers[[1]]$mapping$fill <- p$layers[[1]]$mapping$colour
# Define transparency of points
p$layers[[1]]$mapping$alpha <- alpha.use
# Change fill and colour gradient values
p <- p + scale_colour_gradientn(colours = gradient.use, guide = F, limits = c(expression.threshold, max.exp), na.value = "grey") +
scale_fill_gradientn(colours = gradient.use, name = expression(atop(Expression, (log))), limits = c(expression.threshold, max.exp), na.value = "grey") +
scale_alpha_continuous(range = alpha.use, guide = F)
}
#### Execution of main function ####
# Apply main function on all features
p <- lapply(X = p, alpha.use = alpha.use, gradient.use = gradient.use,
expression.threshold = expression.threshold, is.log1p.transformed = is.log1p.transformed,
FUN = main_function)
# Arrange all plots using cowplot
# Adapted from Seurat
# https://github.com/satijalab/seurat/blob/master/R/plotting.R#L1100
# ncol argument adapted from Josh O'Brien
# https://stackoverflow.com/questions/10706753/how-do-i-arrange-a-variable-list-of-plots-using-grid-arrange
cowplot::plot_grid(plotlist = p, ncol = ceiling(sqrt(length(p))))
}
#### For FeatureHeatmap() ####
# Function to pseudo customize FeatureHeatmap
customize_Seurat_FeatureHeatmap <- function(p, alpha.use = 1, gradient.use = c("yellow", "red"), scaled.expression.threshold = NULL, is.log1p.transformed = F) {
# Order data by gene expresion level
p$data <- p$data[order(p$data$scaled.expression),]
# Define lower limit of scaled gene expression level
if (!is.null(scaled.expression.threshold)) {
if (isTRUE(is.log1p.transformed)) {
scaled.expression.threshold <- scaled.expression.threshold
} else {
scaled.expression.threshold <- log1p(scaled.expression.threshold)
}
} else if (is.null(scaled.expression.threshold)) {
scaled.expression.threshold <- min(p$data$scaled.expression)
}
# Compute maximum value in gene expression
max.scaled.exp <- max(p$data$scaled.expression)
# Fill points using the gene expression levels
p$layers[[1]]$mapping$fill <- p$layers[[1]]$mapping$colour
# Define transparency of points
p$layers[[1]]$mapping$alpha <- alpha.use
# Change fill and colour gradient values
p <- p + scale_colour_gradientn(colours = gradient.use, guide = F, limits = c(scaled.expression.threshold, max.scaled.exp), na.value = "grey") +
scale_fill_gradientn(colours = gradient.use, name = expression(atop(Scaled, expression)), limits = c(scaled.expression.threshold, max.scaled.exp), na.value = "grey") +
scale_alpha_continuous(range = alpha.use, guide = F)
# Return plot
p
}
Note that for FeatureHeatmap()
you filter for scaled expression. Also, scaled.expression.threshold
can be NULL (in which case it will take the lowest scaled expression level), whereas expression.threshold
requires a minimum value of 0.
Hope that helps!
Best, Leon
My bad! For FeatureHeatmap()
, you no longer need the is.log1p.transformed
argument.
#### For FeatureHeatmap() ####
# Function to pseudo customize FeatureHeatmap
customize_Seurat_FeatureHeatmap <- function(p, alpha.use = 1, gradient.use = c("yellow", "red"), scaled.expression.threshold = NULL) {
# Order data by scaled gene expresion level
p$data <- p$data[order(p$data$scaled.expression),]
# Define lower limit of scaled gene expression level
if (!is.null(scaled.expression.threshold)) {
scaled.expression.threshold <- scaled.expression.threshold
} else if (is.null(scaled.expression.threshold)) {
scaled.expression.threshold <- min(p$data$scaled.expression)
}
# Compute maximum value in gene expression
max.scaled.exp <- max(p$data$scaled.expression)
# Fill points using the scaled gene expression levels
p$layers[[1]]$mapping$fill <- p$layers[[1]]$mapping$colour
# Define transparency of points
p$layers[[1]]$mapping$alpha <- alpha.use
# Change fill and colour gradient values
p <- p + scale_colour_gradientn(colours = gradient.use, guide = F, limits = c(scaled.expression.threshold, max.scaled.exp), na.value = "grey") +
scale_fill_gradientn(colours = gradient.use, name = expression(atop(Scaled, expression)), limits = c(scaled.expression.threshold, max.scaled.exp), na.value = "grey") +
scale_alpha_continuous(range = alpha.use, guide = F)
# Return plot
p
}
Thanks a lot Leon. They work like dream.
Hi @ghoshal,
I had a similar situation where I needed to bring the points that correspond to high expression to the front in a feature plot. I tried the solutions above in this and the other thread (issue #757), but ultimately, what worked best for me is to directly re-order the "$data" data frame in increasing order of gene expressions.
Here is an example code snippet:
# First, the usual command to make the feature plots and return the ggplot2 objects
fp <- Seurat::FeaturePlot(object=p, features.plot=GeneList, cols.use = c("gray75", "red"),
min.cutoff="q25",max.cutoff="q75", reduction.use = "tsne",
do.return=TRUE)
# Here, fp is a list of objects, one per feature plot
# Then, re-order the "data" dataframe in increasing order of gene expressions
for (i in 1:length(fp)){
fp[[i]]$data <- fp[[i]]$data[order(fp[[i]]$data$gene),]
}
This did the trick for me! That is, I got all the gray points relegated to the back. Among the non-gray ones, the brightest red points were at the very front, and other shades of red at intermediate positions. If a dull red point and a bright red point overlap, then the bright red one would be at the front, which is exactly what I wanted.
Hope this helps. Sample plots attached below.
BEFORE
AFTER
Hi @CodeInTheSkies,
The principle is exactly the same in both cases.
Best, Leon
Hi @leonfodoulian, Wondering about whether it's possible to tweak the function to continue even if gene isn't found in the dataset. I'm working with dataset that is just single cell type so many genes aren't expressed. Would like to query my dataset against gene list from another publication but function stops when gene in the list isn't found in my Seurat object.
Thanks! Sam
Hello,
Here you go.
#### For FeaturePlot() #### # Function to pseudo customize FeaturePlots customize_Seurat_FeaturePlot <- function(p, alpha.use = 1, gradient.use = c("yellow", "red"), expression.threshold = 0, is.log1p.transformed = F) { #### Main function #### main_function <- function(p = p, alpha.use = alpha.use, gradient.use = gradient.use, expression.threshold = expression.threshold, is.log1p.transformed = is.log1p.transformed) { # Order data by gene expresion level p$data <- p$data[order(p$data$gene),] # Define lower limit of gene expression level if (isTRUE(is.log1p.transformed)) { expression.threshold <- expression.threshold } else { expression.threshold <- log1p(expression.threshold) } # Compute maximum value in gene expression max.exp <- max(p$data$gene) # Fill points using the gene expression levels p$layers[[1]]$mapping$fill <- p$layers[[1]]$mapping$colour # Define transparency of points p$layers[[1]]$mapping$alpha <- alpha.use # Change fill and colour gradient values p <- p + scale_colour_gradientn(colours = gradient.use, guide = F, limits = c(expression.threshold, max.exp), na.value = "grey") + scale_fill_gradientn(colours = gradient.use, name = expression(atop(Expression, (log))), limits = c(expression.threshold, max.exp), na.value = "grey") + scale_alpha_continuous(range = alpha.use, guide = F) } #### Execution of main function #### # Apply main function on all features p <- lapply(X = p, alpha.use = alpha.use, gradient.use = gradient.use, expression.threshold = expression.threshold, is.log1p.transformed = is.log1p.transformed, FUN = main_function) # Arrange all plots using cowplot # Adapted from Seurat # https://github.com/satijalab/seurat/blob/master/R/plotting.R#L1100 # ncol argument adapted from Josh O'Brien # https://stackoverflow.com/questions/10706753/how-do-i-arrange-a-variable-list-of-plots-using-grid-arrange cowplot::plot_grid(plotlist = p, ncol = ceiling(sqrt(length(p)))) } #### For FeatureHeatmap() #### # Function to pseudo customize FeatureHeatmap customize_Seurat_FeatureHeatmap <- function(p, alpha.use = 1, gradient.use = c("yellow", "red"), scaled.expression.threshold = NULL, is.log1p.transformed = F) { # Order data by gene expresion level p$data <- p$data[order(p$data$scaled.expression),] # Define lower limit of scaled gene expression level if (!is.null(scaled.expression.threshold)) { if (isTRUE(is.log1p.transformed)) { scaled.expression.threshold <- scaled.expression.threshold } else { scaled.expression.threshold <- log1p(scaled.expression.threshold) } } else if (is.null(scaled.expression.threshold)) { scaled.expression.threshold <- min(p$data$scaled.expression) } # Compute maximum value in gene expression max.scaled.exp <- max(p$data$scaled.expression) # Fill points using the gene expression levels p$layers[[1]]$mapping$fill <- p$layers[[1]]$mapping$colour # Define transparency of points p$layers[[1]]$mapping$alpha <- alpha.use # Change fill and colour gradient values p <- p + scale_colour_gradientn(colours = gradient.use, guide = F, limits = c(scaled.expression.threshold, max.scaled.exp), na.value = "grey") + scale_fill_gradientn(colours = gradient.use, name = expression(atop(Scaled, expression)), limits = c(scaled.expression.threshold, max.scaled.exp), na.value = "grey") + scale_alpha_continuous(range = alpha.use, guide = F) # Return plot p }
Note that for
FeatureHeatmap()
you filter for scaled expression. Also,scaled.expression.threshold
can be NULL (in which case it will take the lowest scaled expression level), whereasexpression.threshold
requires a minimum value of 0.Hope that helps!
Best, Leon
with Seurat V3, this is not working.
where the data slot in the ggplot object now? p<- FeaturePlot(seurat, features = c("Vgf", "Tac1"))
p$data$
Thank you!
with Seurat V3, this is not working.
where the data slot in the ggplot object now? p<- FeaturePlot(seurat, features = c("Vgf", "Tac1"))
there is no p$data$gene
p$data$
Thank you!
In Seurat v3, FeaturePlot now has baked in tools for ordering and filtering (so you don't need these functions for these plotting features): order = FALSE (default) set as TRUE will plot higher values on top of lower values.. doing the ordering for you. min.cutoff = NA (default), set as a number to remove cells below a certain expression level. max.cutoff = NA (default), set as a number to remove cells above a certain expression level.
The only thing it doesn't implement now is ALPHA. Which would be awesome.
Hi,
I am using FeatureHeatmap to visualize gene expression between two groups. When I plot, the cells with no expression are masking the cells that have an expression as you can see on the left side. Is there a way to not let this happen. FeatureHeatmap(object = crypts_hi_res, features.plot = c("Clu", "Lgr5", "Mex3a", "Yap1"), group.by = "class", sep.scale = F, pt.size = 1.5, cols.use = c("gray97", "blue"), pch.use = 1)
Also, when we are plotting gene expression in either FeaturePlot or FeatureHeatmap, are we plotting the scaled log normalized data?
Thanks,
Bibaswan