jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

recommended procedure to explore differences between two conditions in metagenomic samples #567

Closed JuanmaMedina closed 1 year ago

JuanmaMedina commented 1 year ago

Good evening,

This is maybe more of a biological than a technical question, but I am asking here since I still have not found a proper answer.

I was wondering what the correct way is to run the SqueezeMeta pipeline when I have, e.g., 20 whole-metagenome samples, 10 with phenotype A and 10 with phenotype B. I have thought of two approaches:

A) I run the 20 samples in "co-assembly" mode, and use the realignment of the 20 individual samples against the resulting "average" metagenome to look for the signatures (if any) that could explain the differences in phenotype.

B) I run the two groups of samples as two separate "co-assembly" experiments, and then merge them in a later step to explore the differences between the two subsets. I think this could lead to biased, condition-specific assemblies that would end up confounding my final results.

Maybe this is just me asking late on a Friday night, but while I am still running some tests to answer this myself, I have not found a definitive answer in the issues or the literature.

Thanks a lot in advance, and thanks once more for developing the workflow.

jtamames commented 1 year ago

Hello Juanma,

Here on a late-night Friday, I would recommend approach A. Having a common reference will facilitate all subsequent analyses.

Best, J
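To illustrate why a common reference helps, here is a minimal Python sketch of the kind of comparison approach A allows. Because all 20 samples are mapped against the same co-assembly, every sample has counts for the same set of features, so the two phenotype groups can be compared directly in one table. The function names, the toy counts, and the feature names (`geneX`, `geneY`) are all hypothetical and are not part of SqueezeMeta or its output format. The log2 fold change is only a stand-in for a proper differential abundance test.

```python
from math import log2
from statistics import mean


def relative_abundance(counts):
    """Convert one sample's raw counts {feature: count} into fractions of the sample total."""
    total = sum(counts.values())
    return {feature: count / total for feature, count in counts.items()}


def log2_fold_changes(table, groups, pseudo=1e-6):
    """Compare mean relative abundance per feature between groups "A" and "B".

    table:  {sample: {feature: count}}, all counts taken against the SAME reference
    groups: {sample: "A" or "B"}
    Returns {feature: log2(mean in B / mean in A)}; the pseudocount avoids log(0).
    """
    rel = {sample: relative_abundance(counts) for sample, counts in table.items()}
    features = sorted({f for counts in table.values() for f in counts})
    result = {}
    for f in features:
        mean_a = mean(rel[s].get(f, 0.0) for s in rel if groups[s] == "A")
        mean_b = mean(rel[s].get(f, 0.0) for s in rel if groups[s] == "B")
        result[f] = log2((mean_b + pseudo) / (mean_a + pseudo))
    return result


# Hypothetical toy data: two samples per phenotype, two features from a shared reference.
table = {
    "A1": {"geneX": 10, "geneY": 90},
    "A2": {"geneX": 20, "geneY": 80},
    "B1": {"geneX": 60, "geneY": 40},
    "B2": {"geneX": 50, "geneY": 50},
}
groups = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

lfc = log2_fold_changes(table, groups)
```

In this toy table `geneX` comes out enriched in group B and `geneY` in group A. With approach B, the two assemblies would not share feature identifiers, so a table like this could not be built without an extra step to match features across assemblies. A real analysis would also replace the fold change with a statistical test and multiple-testing correction.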