Open nfancy opened 3 years ago
There are several ways. Can you share MEGENA.output object via dropbox or something? I can have a look.
Hi, Thanks for your quick response. I am attaching the link for the rds object here. My objective is to get smaller clusters because some of the clusters are really large. Thanks.
https://www.dropbox.com/s/l0cpvyzb64mjan0/MEGENA.output.rds?dl=0
Hi,
Thanks very much for offering to look at the data. Did you manage to take a look?
Hi nfancy,
I took a look, and it seems that the big modules are dictated by highly connected genes such as IL1RAPL1 (which has connectivity of 729). Presence of such highly connected hub, by its nature, holds the entire module so compact, thus the network itself does not have any meaningful substructure in it. And this makes sense that MEGENA is not detecting anymore substructures in them, because it should not when network connectivity is so skewed. While presence of such hub is certainly possible, I still suspect that there is a strong confounding factor that drives the overall correlation structure, and recommend for you to examine such confounding variables carefully. Below is the code I used to examine the data:
rm(list = ls())
library(MEGENA)
MEGENA.output = readRDS("MEGENA.output.rds")
alpha.val = sort(unique(MEGENA.output$module.output$module.alpha[-1])) res = data.frame() for (i in 1:length(alpha.val)) { mods = get.union.cut(module.output = MEGENA.output$module.output,alpha.cut = alpha.val[i],output.plot = F, plotfname = "validModules_alpha",module.pval = 0.05,remove.unsig = F) df = data.frame(alpha.val = rep(alpha.val[i],length(mods)),module.size = sapply(mods,length)) res = rbind.data.frame(res,df) rm(mods,df) }
library(ggplot2) library(ggbeeswarm)
pobj = ggplot(data = res,aes(x = alpha.val,y = module.size,group = alpha.val)) + geom_quasirandom() + geom_boxplot(alpha = 0.2) + scale_x_log10() + scale_y_log10() + theme_bw() + geom_vline(xintercept = 0.33,colour= "red") pobj
summary.res = MEGENA.ModuleSummary(MEGENA.output, mod.pvalue = 0.05,hub.pvalue = 0.05, min.size = 10,max.size = 2500, annot.table = NULL,symbol.col = NULL,id.col = NULL, output.sig = TRUE)
head(summary.res$module.table)
htbl= summary.res$module.table[c("module.parent","module.id")]
hg = graph.data.frame(htbl,directed = TRUE)
hobj.o = get_module_hierarchy_graph(htbl,max.depth = 5,anchor.mid = NULL,h.scale = NULL,is.circular = TRUE,layout = "dendrogram") print(hobj.o$pobj)
hobj.i = get_module_hierarchy_graph(htbl,max.depth = 5,anchor.mid = "c1_2",h.scale = NULL,is.circular = TRUE,layout = "dendrogram") print(hobj.i$pobj)
mids = names(ego(graph = hg,order = 1,"c1_13",mode = "out")[[1]]) summary.res$module.table[match(mids,summary.res$module.table$module.id),]
Thank you so much for such a detailed explanattion. Do you have any recommendations on how to check the effect of any confound on the correlation?
Hi,
Thank you very much for this great package. Is there any control over the multi-scale clustering? For example, in my analysis, I have ended up with a couple of large clusters which I would like to split further. Is there any option within the function to control that?
Thanks in advance. Nurun