Inconsistencies with Louvain Clustering

rln0005 commented 3 months ago

Hi there, thank you for developing such a useful and well-documented tool.

While processing my data, I have noticed inconsistencies in the transcriptional clustering.

Running this exact same block of code (adapted from the vignette) back-to-back on the same sample has resulted in several different figures, some with different numbers of clusters. See figures below.

pcs.info <- stats::prcomp(t(log10(as.matrix(counts)+1)), center=TRUE)
nPcs <- 8 
pcs <- pcs.info$x[,1:nPcs]
emb <- Rtsne::Rtsne(pcs,
                    is_distance=FALSE,
                    perplexity=30,
                    num_threads=1,
                    verbose=FALSE)$Y
rownames(emb) <- rownames(pcs)
colnames(emb) <- c("x", "y")
k <- 25
com <- MERINGUE::getClusters(pcs, k, weight=TRUE, method = igraph::cluster_louvain)
tempCom <- com
dat <- data.frame("emb1" = ppos[,"x"],
                  "emb2" = ppos[,"y"],
                  "Cluster" = tempCom)
plt <- ggplot2::ggplot(data = dat) +
  ggplot2::geom_point(ggplot2::aes(x = emb1, y = emb2,
                                   color = Cluster), size = 1.2) +
  ggplot2::scale_color_manual(values = rainbow(n = 15)) +
  # ggplot2::scale_y_continuous(expand = c(0, 0), limits = c( min(dat$emb2)-1, max(dat$emb2)+1)) +
  # ggplot2::scale_x_continuous(expand = c(0, 0), limits = c( min(dat$emb1)-1, max(dat$emb1)+1) ) +
  ggplot2::labs(title = "",
                x = "x",
                y = "y") +
  ggplot2::theme_classic() +
  ggplot2::theme(axis.text.x = ggplot2::element_text(size=15, color = "black"),
                 axis.text.y = ggplot2::element_text(size=15, color = "black"),
                 axis.title.y = ggplot2::element_text(size=15),
                 axis.title.x = ggplot2::element_text(size=15),
                 axis.ticks.x = ggplot2::element_blank(),
                 plot.title = ggplot2::element_text(size=15),
                 legend.text = ggplot2::element_text(size = 12, colour = "black"),
                 legend.title = ggplot2::element_text(size = 15, colour = "black", angle = 0, hjust = 0.5),
                 panel.background = ggplot2::element_blank(),
                 plot.background = ggplot2::element_blank(),
                 panel.grid.major.y =  ggplot2::element_blank(),
                 axis.line = ggplot2::element_line(size = 1, colour = "black")
                 # legend.position="none"
  ) +
  ggplot2::guides(colour = ggplot2::guide_legend(override.aes = list(size=2), ncol = 2)
  ) +
  ggplot2::coord_equal()
plt

Figures produced (images not shown): Rplot3, Rplot2, Rplot1

I know tSNE/UMAP plots often vary slightly in their arrangement, but I haven't encountered a situation where the number of clusters changes. Can you clarify whether this is normal, a bug, or something I'm missing or doing incorrectly? (I apologize in advance if this is an issue I should have opened in the MERINGUE GitHub repository.)

Thank you!

szhorvat commented 2 months ago

Note that igraph::cluster_louvain implements a stochastic algorithm. The result is expected to be somewhat different between runs.
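
One way to get repeatable results is to seed R's random number generator right before clustering. The following is a minimal sketch, not the vignette's code: it uses Zachary's karate club graph as a stand-in for the kNN graph built from the PCs, and `run_louvain` is a hypothetical helper introduced only for illustration. It assumes igraph's R interface draws from R's own RNG, so that `set.seed()` controls the outcome.

```r
# Sketch: seeding R's RNG so that igraph::cluster_louvain is repeatable.
# The karate club graph is only a stand-in for the kNN graph built from the PCs.
g <- igraph::make_graph("Zachary")

# Hypothetical helper: seed, cluster, and return plain integer cluster labels.
run_louvain <- function(graph, seed) {
  set.seed(seed)  # assumption: igraph's R interface uses R's RNG
  as.integer(igraph::membership(igraph::cluster_louvain(graph)))
}

a <- run_louvain(g, seed = 1)
b <- run_louvain(g, seed = 1)
identical(a, b)  # same seed, so the two partitions should match

# Different seeds may give different partitions, possibly with
# different numbers of clusters; tabulate the counts over 20 seeds.
n_clusters <- sapply(1:20, function(s) length(unique(run_louvain(g, s))))
table(n_clusters)
```

In the code from the original post, calling `set.seed()` immediately before `Rtsne::Rtsne(...)` and again before `MERINGUE::getClusters(...)` should have the same effect, assuming neither function reseeds internally. Note that `Rtsne` is itself stochastic, so the embedding also changes between unseeded runs.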

bmill3r commented 2 months ago

@szhorvat - thanks for answering!

rln0005 commented 2 months ago

Gotcha - Thank you for the quick response!!

JEFworks-Lab / STdeconvolve

Inconsistencies with Louvain Clustering #54