satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Negative average expression value on Dotplot #2379

Closed trueth1206 closed 4 years ago

trueth1206 commented 4 years ago

Hi, thank you for creating this excellent tool for single-cell RNA sequencing analysis. I do not quite understand why the average expression scale on my dot plot starts at -1. Could anybody help me?

[Image: dot plot]

timoast commented 4 years ago

Are you plotting integrated data? There can be negative values in the integrated data; this is expected and has been noted in other closed issues. In general we recommend performing differential expression and visualization on the uncorrected data.

trueth1206 commented 4 years ago

Hi, thank you for your reply. I apologize for asking a similar question. Yes, I used integrated data, but I used the RNA assay for the dot plot. Will I still get negative values when using the RNA assay of an integrated object?

timoast commented 4 years ago

If you use the RNA assay containing uncorrected data there should not be negative average expression values. Can you post the full code you're running?

trueth1206 commented 4 years ago

Hi, following your suggestion, I went back to check my data before integration. I found that the negative values existed before the integration of the two conditions. Here is the code I am using to process a single condition with 4 biological replicates. It produces the plot below:

library("Seurat")
library("dplyr")
library("mclust")
library("cowplot")
D0_1.dgecounts <-
  readRDS("~/Desktop/snRNAseq_analysis/zUMIs/D0_1/expression/D0_1.dgecounts.rds")
D0_2.dgecounts <-
  readRDS("~/Desktop/snRNAseq_analysis/zUMIs/D0_2/expression/D0_2.dgecounts.rds")
D0_3.dgecounts <-
  readRDS("~/Desktop/snRNAseq_analysis/zUMIs/D0_3/expression/D0_3.dgecounts.rds")
D0_4.dgecounts <-
  readRDS("~/Desktop/snRNAseq_analysis/zUMIs/D0_4/expression/D0_4.dgecounts.rds")
rawCountsD0_1 <- D0_1.dgecounts[["umicount"]][["inex"]][["all"]]
seuratRawObjD0_1 <- CreateSeuratObject(rawCountsD0_1, min.cells = 3, min.features = 400)
seuratRawObjD0_1
rawCountsD0_2 <- D0_2.dgecounts[["umicount"]][["inex"]][["all"]]
seuratRawObjD0_2 <- CreateSeuratObject(rawCountsD0_2, min.cells = 3, min.features = 400)
seuratRawObjD0_2
rawCountsD0_3 <- D0_3.dgecounts[["umicount"]][["inex"]][["all"]]
seuratRawObjD0_3 <- CreateSeuratObject(rawCountsD0_3, min.cells = 3, min.features = 400)
seuratRawObjD0_3
rawCountsD0_4 <- D0_4.dgecounts[["umicount"]][["inex"]][["all"]]
seuratRawObjD0_4 <- CreateSeuratObject(rawCountsD0_4, min.cells = 3, min.features = 400)
seuratRawObjD0_4
seuratRawObjD0_1$D0 <- "D0_1"
seuratRawObjD0_2$D0 <- "D0_2"
seuratRawObjD0_3$D0 <- "D0_3"
seuratRawObjD0_4$D0 <- "D0_4"
D0_M1 <- merge(seuratRawObjD0_1, seuratRawObjD0_2,
               add.cell.ids = c("1", "2"), project = "SeuratProject")
D0_M2 <- merge(seuratRawObjD0_3, seuratRawObjD0_4,
               add.cell.ids = c("3", "4"), project = "SeuratProject")
seuratRawObj <- merge(D0_M1, D0_M2, project = "SeuratProject")
seuratRawObj
seuratRawObj[["percent.mt"]] <- PercentageFeatureSet(seuratRawObj, features = c(
  "ENSMUSG00000064336", "ENSMUSG00000064337", "ENSMUSG00000064339",
  "ENSMUSG00000064341", "ENSMUSG00000064343", "ENSMUSG00000064344",
  "ENSMUSG00000064345", "ENSMUSG00000064347", "ENSMUSG00000064348",
  "ENSMUSG00000064349", "ENSMUSG00000064351", "ENSMUSG00000064356",
  "ENSMUSG00000064360", "ENSMUSG00000065947", "ENSMUSG00000064365",
  "ENSMUSG00000064367", "ENSMUSG00000064368", "ENSMUSG00000064369",
  "ENSMUSG00000064370", "ENSMUSG00000064371", "ENSMUSG00000064372"
))
seuratFilteredObj <- subset(
  seuratRawObj,
  subset = nFeature_RNA > 400 & nFeature_RNA < 7500 & percent.mt < 10
)
seuratFilteredObj
seuratNormalisedObj <- NormalizeData(object = seuratFilteredObj,
  normalization.method = "LogNormalize", scale.factor = 10000)
seuratNormalised2Obj <- FindVariableFeatures(seuratNormalisedObj,
  selection.method = "vst", nfeatures = 2000)
top10 <- head(VariableFeatures(seuratNormalised2Obj), 10)
plot3 <- VariableFeaturePlot(seuratNormalised2Obj)
plot4 <- LabelPoints(plot = plot3, points = top10, repel = TRUE)
CombinePlots(plots = list(plot3, plot4))
all.genes <- rownames(seuratNormalised2Obj)
seuratScaledObj <- ScaleData(object = seuratNormalised2Obj, features = all.genes)
seuratScaled2Obj <- RunPCA(object = seuratScaledObj,
  features = VariableFeatures(object = seuratScaledObj), do.print = TRUE,
  pcs.print = 1:5, genes.print = 5)

print(seuratScaled2Obj[["pca"]], dims = 1:5, nfeatures = 5)

VizDimLoadings(seuratScaled2Obj, dims = 1:2, reduction = "pca")

DimPlot(seuratScaled2Obj, reduction = "pca")
seuratJackStraw <- JackStraw(seuratScaled2Obj, num.replicate = 100)
scoreJackStraw <- ScoreJackStraw(seuratJackStraw, dims = 1:20)
ElbowPlot(scoreJackStraw, ndims = 30)
seuratFindNeighbors <- FindNeighbors(
  scoreJackStraw,
  reduction = "pca",
  dims = 1:20,
  do.plot = FALSE,
  graph.name = NULL,
  k.param = 30
)
seuratFindClusters <- FindClusters(object = seuratFindNeighbors)
seusetTSNE <- RunTSNE(seuratFindClusters, reduction = "pca",
  dims = 1:20, tsne.method = "Rtsne")
TSNEPlot(object = seusetTSNE, label.size = 4, pt.size = 0.5, label = TRUE)
DefaultAssay(seuratFindClusters) <- "RNA"

markers.to.plot <- c(
  "ENSMUSG00000030781", "ENSMUSG00000018459", "ENSMUSG00000024650",
  "ENSMUSG00000021490", "ENSMUSG00000031441", "ENSMUSG00000041052",
  "ENSMUSG00000040405", "ENSMUSG00000027962", "ENSMUSG00000020914",
  "ENSMUSG00000031004", "ENSMUSG00000004655", "ENSMUSG00000029082",
  "ENSMUSG00000028289", "ENSMUSG00000027202", "ENSMUSG00000030963",
  "ENSMUSG00000031766", "ENSMUSG00000054640", "ENSMUSG00000023013",
  "ENSMUSG00000031891", "ENSMUSG00000104445", "ENSMUSG00000028238",
  "ENSMUSG00000006574", "ENSMUSG00000020651", "ENSMUSG00000029648",
  "ENSMUSG00000062960", "ENSMUSG00000026365", "ENSMUSG00000070645",
  "ENSMUSG00000026395", "ENSMUSG00000006649"
)

DotPlot(seuratFindClusters, features = rev(markers.to.plot),
  cols = c("blue", "red"), dot.scale = 8) + RotatedAxis()

[Image: dot plot]

Thank you again for your kind help.

timoast commented 4 years ago

> I found that the negative values existed before the integration of the two conditions

If you have negative values in your raw data, I think something odd has happened in your upstream processing.

In any case, if the negative values are present in the data, then this is not an issue with the DotPlot function, so I will close this now. I suggest you look at how the data is being processed upstream to produce negative values.
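
A quick way to test whether negative values are really present upstream is to inspect the count matrix directly. The following is only a minimal sketch in base R: `counts` is a made-up toy matrix standing in for the real genes x cells UMI matrix (in the code above that would presumably be something like `rawCountsD0_1`, which may be a sparse matrix).

```r
# Hypothetical stand-in for a genes x cells UMI count matrix.
# The values are invented for illustration only.
counts <- matrix(
  c(0, 3, 1,
    0, 7, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3"))
)

# Raw UMI counts should be non-negative whole numbers.
has_negative <- any(counts < 0)
is_integer_valued <- all(counts == round(counts))

print(range(counts))      # smallest and largest entry
print(has_negative)       # TRUE would point to an upstream problem
print(is_integer_valued)  # FALSE would suggest the matrix is not raw counts
```

If both checks pass on the real matrix, the raw data itself is probably not the source of the negative values.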

Tushar-87 commented 4 years ago

Could it be a problem with the slot used in the RNA assay? For a dot plot we just need the normalized data, with no additional scaling...
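
To illustrate the point about additional scaling: the sketch below, in base R only, shows that z-scoring per-cluster averages across clusters produces negative values even when every input value is non-negative. The matrix is toy data, and whether this is what happens in a given plot is an assumption. My understanding is that recent Seurat versions standardize each gene's average across groups in DotPlot by default (the `scale` argument), but please check `?DotPlot` for your installed version.

```r
# Toy per-cluster average expression (rows = genes, columns = clusters).
# The numbers are invented for illustration; all are non-negative.
avg_expr <- matrix(
  c(0.1, 0.5, 2.0,
    1.2, 1.0, 0.2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"),
                  c("cluster0", "cluster1", "cluster2"))
)

# Z-score each gene across clusters: subtract the gene's mean and
# divide by its standard deviation. scale() works column-wise, so
# transpose before and after.
avg_scaled <- t(scale(t(avg_expr)))

print(range(avg_expr))       # the input has no negative values
print(round(avg_scaled, 2))  # clusters below a gene's mean become negative
```

If that assumption holds, a color legend starting around -1 would reflect standardized averages rather than negative expression, and passing `scale = FALSE` to DotPlot (where that argument is available) should show the unscaled averages instead.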