Closed pbpayal closed 2 years ago
Hi @pbpayal ,
You can use the AverageExpression function on the Seurat object and then run cor
function from base R to generate the correlation matrix.
I have an integrated Seurat object with all the preprocessing done and I have annotated clusters. The weird thing is, when I want to extract the set of genes that i am interested in from AverageExpression, it says
Warning: The counts slot for the integrated assay is empty. Skipping assay.
I tried all - "counts","data","scale.data" slot. None of them work.
However, when I do DoHeatmap on "counts" slot for the same set of genes, it's able to plot a Heatmap. Do you have any idea, why that might be the case? For DoHeatmap, is the average expression of each gene getting plotted? Can I use that data for correlation analysis?
Can you share the code you are using to test the analysis?
#CTRL
sample_1 <- Read10X(data.dir = "../filtered_feature_bc_matrix")
sample_1_sobj <- CreateSeuratObject(counts = sample_1, project = "sample_1", min.cells = 3, min.features = 200)
sample_1_sobj$stim <- "CTRL"
sample_1_sobj[["percent.mt"]] <- PercentageFeatureSet(sample_1_sobj, pattern = "^MT-")
sample_1_sub <- subset(sample_1_sobj, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5 )
sample_1_sub <- NormalizeData(sample_1_sub, verbose = FALSE)
sample_1_sub <- FindVariableFeatures(sample_1_sub, selection.method = "vst", nfeatures = 2000)
# EXP
sample_2 <- Read10X(data.dir = "../filtered_feature_bc_matrix")
sample_2_sobj <- CreateSeuratObject(counts = sample_2, project = "sample_2", min.cells = 3, min.features = 200)
sample_2_sobj$stim <- "EXP"
sample_2_sobj[["percent.mt"]] <- PercentageFeatureSet(sample_2_sobj, pattern = "^MT-")
sample_2_sub <- subset(sample_2_sobj, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5 )
sample_2_sub <- NormalizeData(sample_2_sub, verbose = FALSE)
sample_2_sub <- FindVariableFeatures(sample_2_sub, selection.method = "vst", nfeatures = 2000)
I did this for 8 samples - 4 control and 4 experiment samples then integrated the CTRL and EXP samples
immune.anchors <- FindIntegrationAnchors(object.list = list(sample1_sub, sample2_sub,..,sample8_sub), dims = 1:20)
immune.combined.integrated <- IntegrateData(anchorset = immune.anchors.2, dims = 1:20)
DefaultAssay(immune.combined.integrated) <- "integrated"
# Run the standard workflow for visualization and clustering
immune.combined.integrated <- ScaleData(immune.combined.integrated, verbose = FALSE)
immune.combined.integrated <- RunPCA(immune.combined.integrated, npcs = 20, verbose = FALSE)
immune.combined.integrated <- RunUMAP(immune.combined.integrated, reduction = "pca", dims = 1:20)
immune.combined.integrated <- FindNeighbors(immune.combined.integrated, reduction = "pca", dims = 1:20)
# Resolution
immune.combined.integrated <- FindClusters(immune.combined.integrated, resolution = 0.2)
# Annotate clusters
new.cluster.ids <- c("Erythroid cells", "Memory Tcells/Tcells", "DC/Monocyte",
"NK" ,"Bcells", "CD8 Tcells","MAIT T cell",
"Plasma","Plasmacytoid DC", "RBC","Platelets")
names(new.cluster.ids) <- levels(immune.combined.integrated)
immune.combined.integrated_ <- RenameIdents(immune.combined.integrated, new.cluster.ids)
# Rename the seurat clusters in metadata
immune.combined.integrated[["seurat_clusters"]] <- Idents(object = immune.combined.integrated)
features_c1_subunit <- c("MT-ND1","MT-ND2","MT-ND3","MT-ND4","MT-ND4L","MT-ND5",
"MT-ND6","NDUFA1","NDUFA2","NDUFA6","NDUFA9","NDUFA10",
"NDUFA11","NDUFA12","NDUFA13","NDUFB3","NDUFB8","NDUFB9",
"NDUFB10","NDUFB11","NDUFS1","NDUFS2","NDUFS3","NDUFS4",
"NDUFS6","NDUFS7","NDUFS8","NDUFV1","NDUFV2")
DefaultAssay(immune.combined.integrated) <- "RNA"
heatmap <- DoHeatmap(immune.combined.integrated_1, features = features_c1_subunit, slot = "counts", size = 3.5)
avg_exp <- AverageExpression(immune.combined.integrated_1, slot = "data", features = features_c1_subunit)
Hello,
I want to create Pearson correlation matrix of a set of genes per cluster basis. I want to see how these set of genes are co-expressed in all clusters.
Something like this(Referenced from Deseq2 manual)
I know about FeatureScatter but that's only one gene against another. How to do for multiple genes? Also which data slot is appropriate for this kind of analysis?
Thanks, Payal