smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/
Other
316 stars 31 forks source link

all module genes have negative kME values #10

Closed hongsilv closed 2 years ago

hongsilv commented 2 years ago

Hi, I first thank you for this wonderful library:)

During my analysis on snRNA-seq data (about 100k cells from 50 samples), I found that in some modules (8 out of 15), almost all module genes had negative kME values. (ex: all genes assigned to module 1 has negative kME value for module 1) For such modules, since all module genes are in negative correlation with ME, I couldn't identify any meaningful hub genes.

However, such genes had positive kME for other modules For example, a gene assigned in M1 = blue module had kME-M1 = -0.09376377 (negative) but, kME-M3= +0.201993572 (positive), kME-M2= +0.013326623 (positive) and so on.

I followed the basic vignette of scWGCNA for the analysis. I changed parameter in "ModuleConnectivity" function (slot: default counts -> data), but it did not make much difference. I attached my modules (gene-module assignment-kME values) file and gene dendrogram below. ser_modules.csv

dendrogram

According to this article (https://support.bioconductor.org/p/101579/, written by WGCNA author) "Module assignment is based on TOM results, and kME is based on correlation, so for some genes, assigned module may differ from the module with highest kME. But practically genes will have a high kME to their assigned module." But for my case, most of the genes have low/negative kME for their assigned module, but high/positive kME for some other unassigned modules.

Is this due to the sparsity of single cell dataset? or Difference between meta cells and real cells? Can you recommend a way to handle this discrepancy?

Thank you, Sung Eun Hon

smorabit commented 2 years ago

I can think of a few reasons why this might happen, but it would help me to understand if I could see the code that you are running.

hongsilv commented 2 years ago

Hi, this is the code I was running

Thank you so much Best, SungEun


devtools::install_github('smorabit/scWGCNA', ref='dev')
install.packages("harmony")
install.packages("corrplot")

library("WGCNA")
library("scWGCNA")
library(Seurat)
library(tidyverse)
library(cowplot)
library(patchwork)
theme_set(theme_cowplot())
library(igraph)
library(enrichR)
library(GeneOverlap)
library(harmony)

setwd("/mnt/data")
hep.ser<- readRDS("./ser_obj.rds") # approx 50 samples, 100k cells, 1 celltype

set.seed(12345)
hep.ser <- SetupForWGCNA(hep.ser, gene_select="fraction", fraction=0.05, wgcna_name="hep_wgcna")
hep.ser <- MetacellsByGroups(hep.ser, group.by=c("orig.ident", "predicted.celltype.l2", "disease_group2"), k=25, ident.group="disease_group2") 
# orig.ident = individual samples; approx 50 samples, # of cells per sample ranging from 137-6933 (median=1740)
predicted.celltype.l2 = all have the same value
# disease_group2 = total 4 groups
hep.ser <- NormalizeMetacells(hep.ser)

hep.ser <- SetDatExpr(hep.ser, group.by="predicted.celltype.l2", group_name = "hepatocyte", use_metacells=TRUE, slot='data') 
# select soft-power threshold
hep.ser <- TestSoftPowers(hep.ser, setDatExpr=FALSE)
plot_list <- PlotSoftPowers(hep.ser)
wrap_plots(plot_list, ncol=2)

power_table <- GetPowerTable(hep.ser)
head(power_table)

# construct co-expression network
soft_power_n=8
hep.ser <- ConstructNetwork(hep.ser, setDatExpr=FALSE, tom_outdir = "./scWGCNA_results/", soft_power=soft_power_n)
PlotDendrogram(hep.ser, main="Hepatocyte scWGCNA Dendogram")
png("./scWGCNA_results/Dendrogram.png", width=8, height=9, dpi=300)
PlotDendrogram(hep.ser, main="scWGCNA Dendogram")
dev.off()

# Module eigengenes and intramodular connectivity
hep.ser <- Seurat::ScaleData(hep.ser, features=GetWGCNAGenes(hep.ser), vars.to.regress = c('nCount_RNA'))
hep.ser <- ModuleEigengenes(hep.ser, group.by.vars="orig.ident") # orig.ident means each sample
hMEs <- GetMEs(hep.ser)
MEs <- GetMEs(hep.ser, harmonized=FALSE)

# compute module connectivity
hep.ser <- ModuleConnectivity(hep.ser)
hep.ser <- ResetModuleNames(hep.ser, new_name="Hep-M")
modules <- GetModules(hep.ser)
saveRDS(hep.ser, file="./scWGCNA_results/scWGCNA_obj.rds") 

hep.ser <- ModuleExprScore(hep.ser, n_genes=25, method ="Seurat")```
allakarpova commented 2 years ago

Hi, I'll join this thread because I'm getting the same results with low or negative kME.

allakarpova commented 2 years ago

Hi, I'm trying to figure out this discrepancy in my data and I hope my observations will be helpful to you. I observed that for some modules ME scores in meta cells are inverted compared to ME scores in single cells.

I plotted ME scores for meta cells by groups in this plot: ME_in_meta_cells_by_cluster

and ME scores for single cells by groups in this plot: ME_in_single_cell_by_cluster

For example scores for black module follow the same pattern in both meta and single cells, but scores for blue, brown, turquoise and many other modules have the opposite patterns. I'm guessing negative kME scores for genes are an artifact of ME calculation for single cells. I hope it helps!

Thanks, Alla

smorabit commented 2 years ago

Hi,

Two things. First, I updated the package, including a fix to an issue that I found in the ModuleEigengene function that was causing this problem. Second, I think it is best to compute the kMEs just in the same cell population that the network was constructed in. Since kME is the correlation between the gene's expression and the module eigengene, you can imagine that the correlation might differ between different cell populations. The basics tutorial now suggests to run the ModuleConnectivity function using the same cell population as in ConstructNetwork.

Just to check, you can use a new function PlotKMEs to see the genes in each module ranked by kME. In all of the datasets that I have tested, they have all been positive after using the fixed ModuleEigengene function and running ModuleConnectivity as suggested.

kME_distributions_update

Please re-install the package, and run the ModuleEigengene function and the ModuleConnectivity functions as outlined in the basics tutorial.

Let me know if it helps!

hongsilv commented 2 years ago

Thank you for sharing your observations @allakarpova ! I reanalyzed my data, and found that kME values are now mostly positive, and hub genes seems reasonable. Thank you for the fix and again, for this wonderful library!

ps. I installed hdWGCNA with devtools::install_github('smorabit/hdWGCNA', ref='dev') but PlotKMEs function is not available (Could not find function)

Best, SH

smorabit commented 2 years ago

PlotKMEs should be there now if you update again, I initially forgot to add documentation for that function which makes it just not show up at all in the package I guess!