nf-core / mag

Assembly and binning of metagenomes
https://nf-co.re/mag
MIT License

Run co-assembly on pooled samples #48

Closed Puumanamana closed 4 years ago

Puumanamana commented 4 years ago

Hi,

First, I want to thank you for implementing this pipeline with Nextflow. It's extremely convenient to use, and I was able to run it very easily on an HPC cluster.

When working on whole-genome metagenomic sequencing data, I usually do a co-assembly instead of assembling each metagenomic sample separately. In other words, I pool the reads from all samples and then run metaSPAdes, MEGAHIT, or another assembler on the pooled reads. This can provide more coverage for contigs that co-occur in multiple samples, and therefore produce more complete genomes. It also makes samples easier to compare directly, since they all stem from the same assembly. Pooling can sometimes hurt the assembly process, depending on the data. Still, having this option would be very valuable, or even running both approaches in parallel.
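
For illustration, here is a minimal sketch of the pooling step for MEGAHIT, which accepts comma-separated lists of paired-end files via `-1` and `-2`. This is not how the pipeline does it; the function name `build_coassembly_command`, the sample file names, and the output directory are hypothetical and only serve the example.

```python
from typing import List, Sequence, Tuple


def build_coassembly_command(
    samples: Sequence[Tuple[str, str]],
    out_dir: str = "coassembly",  # hypothetical output directory name
) -> List[str]:
    """Build a single MEGAHIT command that assembles all samples together.

    `samples` is a sequence of (forward_reads, reverse_reads) file paths,
    one pair per sample. The read files are passed to MEGAHIT as
    comma-separated lists, so the reads are pooled without concatenating
    the files on disk.
    """
    if not samples:
        raise ValueError("at least one sample is required for a co-assembly")

    # Keep forward and reverse files in the same sample order.
    forward = ",".join(r1 for r1, _ in samples)
    reverse = ",".join(r2 for _, r2 in samples)

    return ["megahit", "-1", forward, "-2", reverse, "-o", out_dir]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    pairs = [
        ("sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz"),
        ("sampleB_R1.fastq.gz", "sampleB_R2.fastq.gz"),
    ]
    # Print the command instead of running it; pass the list to
    # subprocess.run(..., check=True) to actually launch the assembly.
    print(" ".join(build_coassembly_command(pairs)))
```

Per-sample assembly would instead call the assembler once per pair, so both strategies could be run side by side from the same list of samples.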

Unless I'm mistaken, this pipeline seems to assemble each sample separately. Would it be possible to include an option to do the assembly with the merged reads as well?

Thank you, Cédric

d4straub commented 4 years ago

I think this is a duplicate of https://github.com/nf-core/mag/issues/21, so I'll close it. If you disagree, let me know. I agree with you on the feature itself; someone just needs to take the time to implement that option.