stefpeschel / NetCoMi

Network construction, analysis, and comparison for microbial compositional data
GNU General Public License v3.0

Looking for guidance on MANs Inference with samples from different metagenomics/metatranscriptomics studies #117

Open sayalaruano opened 3 months ago

sayalaruano commented 3 months ago

Hello,

I'm Sebastian, and I'm working on my MSc thesis, which aims to create a knowledge graph of the wastewater treatment microbiome. We're using NetCoMi to construct Microbial Association Networks (MANs) from metagenomics and metatranscriptomics data in the MGnify database. Specifically, we are using the CCLasso method. However, we are facing these challenges and have some questions:

Your expertise and advice would be incredibly valuable. Thanks for your time and assistance.

Sebastian

stefpeschel commented 3 months ago

Hi Sebastian,

If you conduct network analysis using different sample sizes, you have to be aware that the research question changes. So you have to point out that you're analyzing the microbial associations across different data sets. But in principle, there is nothing against it.

Regarding heterogeneity: NetCoMi cannot directly take heterogeneity into account. However, one thing you could do is perform a regression analysis with the microbial count data as the response, and then do the network construction and analysis on the residuals.

Best, Stefanie

sayalaruano commented 3 months ago

Hello Stefanie,

Thanks a lot for the quick response. We will consider the fact that we are obtaining associations across studies to analyze the results and avoid making conclusions at the study level.

Also, your suggestion for handling data heterogeneity is great; we had not thought of it. But if we use the residuals to build the MANs, should we use the same normalization (fractions) and zero treatment (pseudoZO) methods we used for the abundance data?

Thanks again for your help.

Best,

Sebastian

stefpeschel commented 1 month ago

Sorry @sayalaruano, I completely forgot to answer. You should perform the normalization and zero replacement steps at the beginning of the workflow, that is, before the regression analysis. These steps are then no longer necessary in the network analysis itself.
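
For readers following the thread, the order of steps described above (zero replacement and normalization first, then regression, then associations on the residuals) could be sketched roughly as follows. This is only an illustration in Python with NumPy, not NetCoMi code. The function name `residualize_by_study`, the pseudocount of 1, the log transform, and the use of study membership as the only covariate are all assumptions made for the example, and a plain Pearson correlation stands in for CCLasso, which is not implemented here.

```python
import numpy as np

def residualize_by_study(counts, study_labels, pseudocount=1.0):
    """Hypothetical helper: return residuals of log-fractions after
    regressing out study membership.

    counts: array of shape (n_samples, n_taxa)
    study_labels: one study label per sample
    """
    counts = np.asarray(counts, dtype=float)
    # Zero replacement: add a pseudocount to zero entries only (assumed value).
    filled = np.where(counts == 0, pseudocount, counts)
    # Normalization: convert each sample to fractions of its total.
    fractions = filled / filled.sum(axis=1, keepdims=True)
    # Log transform (an assumption of this sketch, not stated in the thread).
    log_frac = np.log(fractions)
    # Design matrix with one indicator column per study.
    studies, idx = np.unique(np.asarray(study_labels), return_inverse=True)
    design = np.zeros((len(idx), len(studies)))
    design[np.arange(len(idx)), idx] = 1.0
    # Least-squares fit per taxon; residuals are what the study does not explain.
    coef, *_ = np.linalg.lstsq(design, log_frac, rcond=None)
    return log_frac - design @ coef

# Toy data: 12 samples, 5 taxa, two studies with a shift in the first taxon.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=20, size=(12, 5))
counts[6:, 0] += 50          # artificial study effect
counts[0, 1] = 0             # make sure a zero is present
study = np.array(["A"] * 6 + ["B"] * 6)

residuals = residualize_by_study(counts, study)
# Stand-in association measure on the residuals (taxa x taxa).
assoc = np.corrcoef(residuals, rowvar=False)
```

With study indicators as the only covariates, the residuals are simply each taxon's values centred within its study, so further covariates (sequencing type, for instance) would be added as extra columns of the design matrix. Since the residuals are no longer counts, the normalization and zero handling must not be repeated on them.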