fogellab / multiWGCNA

an R package for deep mining gene co-expression networks in multi-trait expression data

Interpretation of HFD-associated modules #21

Open shasabhi1 opened 1 week ago

shasabhi1 commented 1 week ago

Hello,

I would like some guidance on my multiWGCNA pipeline. My data have two factors: Diet (control and high fat diet) and Strain (1 and 2). I constructed networks using Diet as the primary trait and Strain as the second trait. I then computed overlaps and found high fat diet-associated modules that overlap only with Strain 1 modules or only with Strain 2 modules, and others that overlap with both Strain 1 and Strain 2 modules. Because the list is large, I selected the top 15 of these associations by adjusted p-value. As my sample size is n=5, I did not perform module preservation analysis to see whether any High Fat Diet modules are absent in Control diet.

Questions:

  1. Would it be appropriate to repeat the above steps in Control Diet, and then compare the results with the High Fat Diet-associated modules? Is this only possible descriptively? Or should a preservation analysis be done even with this sample size, to find High Fat Diet modules not found in Control diet?
  2. Can an interpretation be given for individual Trait modules? For example, what does it mean to have a High Fat Diet module: is the network constructed from all High Fat Diet animals regardless of strain? And if these broad modules across a diet have specific overlaps with a particular strain, can it be interpreted that these genes are unique to the strain in question?

-shasabhi1

shasabhi1 commented 1 week ago

Additionally, when trying to run module preservation analysis, I get the error Error in eval(expr, p): object 'name2' not found...

dariotommasini commented 1 day ago

Hi @shasabhi1

I would not do a preservation analysis unless you have at least 10 total samples in the network. So for your control network, if you have 5 control samples from strain 1 + 5 control samples from strain 2, that would be fine, because they sum to 10 samples. 10 samples is probably near the limit. We usually don't go below 12 samples, because with fewer you simply don't get robust correlations.
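To make the sample-count rule concrete, here is a minimal sketch in plain Python (not multiWGCNA code) that counts how many samples would end up in each network for a 2x2 design. The design of 5 animals per Diet x Strain group, the sample layout, and the names `count_network_samples`, `classify`, `MIN_SAMPLES`, and `PREFERRED_SAMPLES` are assumptions for illustration; only the thresholds 10 and 12 come from the numbers above.

```python
from collections import Counter
from itertools import product

MIN_SAMPLES = 10        # rough lower limit mentioned above
PREFERRED_SAMPLES = 12  # the usual minimum mentioned above


def count_network_samples(samples):
    """Count samples per network: combined, one per Diet level, one per Strain level."""
    counts = Counter()
    for diet, strain in samples:
        counts["combined"] += 1
        counts[f"Diet={diet}"] += 1
        counts[f"Strain={strain}"] += 1
    return dict(counts)


def classify(n):
    """Label a network size against the two thresholds."""
    if n < MIN_SAMPLES:
        return "too few"
    if n < PREFERRED_SAMPLES:
        return "borderline"
    return "ok"


# Hypothetical design: 5 animals in each Diet x Strain group (20 in total).
samples = [(diet, strain)
           for diet, strain in product(["control", "HFD"], ["1", "2"])
           for _ in range(5)]

network_sizes = count_network_samples(samples)
for name, n in network_sizes.items():
    print(f"{name}: {n} samples -> {classify(n)}")
```

Under this assumed design, each diet-level or strain-level network gets 10 samples (borderline) and the combined network gets 20.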

Yes, the high fat diet network would be constructed with all the high fat diet samples (both strain 1 and 2). I would interpret the modules as groups of co-expressed genes, and their absence in the control network would indicate that those genes are not being co-regulated in the control condition. We discuss an example of this in the paper (M13/dM15 is an EAE-specific network which corresponds to astrocyte reactivity).
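As a rough illustration of how an overlap between a high fat diet module and a strain-level module can be scored, the sketch below runs a one-sided hypergeometric test in plain Python. It shows the general technique under stated assumptions and is not a copy of multiWGCNA's implementation; the gene sets, the background size, and the function name `overlap_pvalue` are made up for the example.

```python
from math import comb


def overlap_pvalue(module_a, module_b, n_background):
    """One-sided hypergeometric test for the overlap of two gene sets.

    Returns (observed overlap, P(overlap >= observed)), assuming both
    modules are drawn from the same background of n_background genes.
    """
    a, b = set(module_a), set(module_b)
    k = len(a & b)
    upper = min(len(a), len(b))
    total = comb(n_background, len(b))
    tail = sum(comb(len(a), i) * comb(n_background - len(a), len(b) - i)
               for i in range(k, upper + 1))
    return k, tail / total


# Hypothetical gene IDs 0..999 as the shared background.
N_GENES = 1000
hfd_module = range(0, 50)          # made-up high fat diet module
strain1_module = range(30, 90)     # shares 20 genes with hfd_module
strain2_module = range(500, 560)   # shares no genes with hfd_module

k1, p1 = overlap_pvalue(hfd_module, strain1_module, N_GENES)
k2, p2 = overlap_pvalue(hfd_module, strain2_module, N_GENES)
print(f"HFD vs strain 1: overlap={k1}, p={p1:.3g}")
print(f"HFD vs strain 2: overlap={k2}, p={p2:.3g}")
```

In this toy setup the first pair shares 20 genes where about 3 would be expected by chance, so its p-value is very small; the second pair shares none, so its p-value is 1.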

dariotommasini commented 1 day ago

Can you show me the inputs, the outputs, and the full error message for the error you get? Does the preservation example in the autism vignette work for you?