Open Carrey14 opened 2 months ago
Can you provide the steps of the analysis? Also, can you show where the subsetted cells of the second UMAP are located in the first one?
I'm sorry for the late reply. Here is the code I used for the analysis; the cluster circled in red in the figure is the T-cell cluster I extracted.
`# Full dataset: normalize, PCA, Harmony on orig.ident
HCC_harmony <- NormalizeData(HCC_all) %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA(npcs = 100, verbose = FALSE)
system.time({HCC_harmony2 <- RunHarmony(HCC_harmony, group.by.vars = "orig.ident")})
pdf("el.pdf", width = 10, height = 7)
ElbowPlot(HCC_harmony2, ndims = 100)
dev.off()
# Cluster and embed on the Harmony reduction
pc.num <- 1:39
HCC_harmony3 <- FindNeighbors(HCC_harmony2, reduction = "harmony", dims = pc.num) %>%
  FindClusters(resolution = 0.4)
HCC_harmony4 <- RunUMAP(HCC_harmony3, reduction = "harmony", dims = pc.num)
HCC_harmony5 <- RunTSNE(HCC_harmony4, reduction = "harmony", dims = pc.num)
# Extract the T cells and rebuild a fresh object from the raw counts
T_cell <- subset(HCC_harmony5, idents = "T cells")
sce <- CreateSeuratObject(counts = T_cell@assays$RNA@counts, meta.data = T_cell@meta.data)
names(sce@reductions)
# T-cell subset: repeat normalization, PCA and Harmony
T_cell2 <- NormalizeData(sce, normalization.method = "LogNormalize", scale.factor = 1e4)
GetAssay(T_cell2, assay = "RNA")
T_cell2 <- FindVariableFeatures(T_cell2, selection.method = "vst", nfeatures = 2000)
T_cell2 <- ScaleData(T_cell2)
T_cell2 <- RunPCA(object = T_cell2, npcs = 50, verbose = FALSE)
system.time({T_cell_harmony <- RunHarmony(T_cell2, group.by.vars = "orig.ident", project.dim = FALSE)})
# Cluster and embed the subset on its own Harmony reduction
dims <- 1:15
T_cell_harmony2 <- FindNeighbors(T_cell_harmony, reduction = "harmony", dims = dims)
T_cell_harmony2 <- FindClusters(T_cell_harmony2, resolution = 0.8)
table(T_cell_harmony2@meta.data$seurat_clusters)
T_cell_harmony3 <- RunUMAP(T_cell_harmony2, dims = dims, reduction = "harmony")
T_cell_harmony3 <- RunTSNE(T_cell_harmony3, dims = dims, reduction = "harmony")`
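For the question of where the subsetted cells sit in the first UMAP, a minimal base-R sketch of the idea is to match the subset's cell barcodes against the row names of the full embedding. The toy matrix and barcodes below are made up, and `flag_subset_cells` and `plot_subset` are hypothetical helper names; with the objects from this thread the inputs would presumably be `Embeddings(HCC_harmony5, "umap")` and `colnames(T_cell)`.

```r
# Toy stand-in for the full-object UMAP coordinates (rows = cell barcodes).
# Assumption: with real data this would be Embeddings(HCC_harmony5, "umap").
set.seed(1)
full_umap <- matrix(
  rnorm(20), ncol = 2,
  dimnames = list(paste0("cell", 1:10), c("UMAP_1", "UMAP_2"))
)

# Hypothetical barcodes of the extracted T cells (colnames(T_cell) in practice).
subset_cells <- c("cell2", "cell5", "cell9")

# Flag which rows of the full embedding belong to the subset.
flag_subset_cells <- function(embedding, cells) {
  missing <- setdiff(cells, rownames(embedding))
  if (length(missing) > 0) {
    stop("Cells not found in embedding: ", paste(missing, collapse = ", "))
  }
  rownames(embedding) %in% cells
}

in_subset <- flag_subset_cells(full_umap, subset_cells)

# Draw all cells in grey, then overlay the subset in red.
plot_subset <- function(embedding, flag) {
  plot(embedding, col = "grey80", pch = 16,
       xlab = colnames(embedding)[1], ylab = colnames(embedding)[2])
  points(embedding[flag, , drop = FALSE], col = "red", pch = 16)
}

if (interactive()) plot_subset(full_umap, in_subset)
```

Seurat's `DimPlot()` also accepts a `cells.highlight` argument, which should produce a comparable picture directly from the full object.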
Hello, thanks for developing an excellent tool for batch correction. When I used Harmony to correct batch effects between two datasets, it corrected them very well across the overall cell population.
However, when I extracted a small subset, such as the T cell population, and re-ran all the steps from scaling to Harmony on this subset, the samples within each dataset integrated well, but the T cells from the two datasets still showed a clear batch effect: they formed two distinct T cell clusters, one per original dataset. Why is this happening, and how can I solve it?
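One way to put a number on this separation is to tabulate, for each cluster, the fraction of cells coming from each dataset. The sketch below uses a made-up metadata table; the `dataset` column is hypothetical (the code in this thread only has `orig.ident`), and the 90% threshold is chosen purely for illustration. It is a diagnostic only, not a fix.

```r
# Toy metadata: one row per T cell, with its cluster and dataset of origin.
# Assumption: with real data this would come from the subset's meta.data,
# with a column recording which of the two datasets each cell belongs to.
meta <- data.frame(
  seurat_clusters = factor(c(0, 0, 0, 0, 1, 1, 1, 1)),
  dataset = c("A", "A", "B", "B", "A", "A", "A", "A")
)

# Rows = clusters, columns = datasets, values = fraction of each cluster.
composition <- prop.table(
  table(meta$seurat_clusters, meta$dataset),
  margin = 1
)
print(round(composition, 2))

# Clusters in which a single dataset contributes more than 90% of the cells.
dominated <- rownames(composition)[apply(composition, 1, max) > 0.9]
print(dominated)
```

Clusters that show up in `dominated` are made up almost entirely of one dataset, which matches the pattern described above.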
Samples from all belong to one data set, and II belongs to another. Thanks!