franciscozorrilla / metaGEM

:gem: An easy-to-use workflow for generating context specific genome-scale metabolic models and predicting metabolic interactions within microbial communities directly from metagenomic data
https://franciscozorrilla.github.io/metaGEM/
MIT License

bug: binReassemble implementation suboptimal without --parallel flag #55

Closed franciscozorrilla closed 3 years ago

franciscozorrilla commented 3 years ago

The binReassemble rule is implemented suboptimally without the --parallel flag. Although multiple threads are used, only one genome is assembled at a time. For large genome batches, e.g. 349 MAGs (from a single sample 😮), this creates a bottleneck, and it is much faster to use the --parallel flag. Note that with the --parallel implementation, some resources could go unused if the number of genomes is smaller than the number of threads; however, such a job is likely to finish quickly anyway.

https://github.com/franciscozorrilla/metaGEM/blob/64ffedbbb0946511008fa1b019a8339eaab71048/Snakefile#L958-L968
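To make the trade-off above concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the Snakefile. It assumes that --parallel runs one genome per thread and that all genomes take about the same time, so they are processed in lockstep "waves". The function name `parallel_schedule` and the thread count of 48 are hypothetical and chosen only for illustration.

```python
def parallel_schedule(n_genomes: int, n_threads: int) -> tuple[int, int]:
    """Toy model of one-thread-per-genome scheduling (an assumption, not
    the actual binReassemble implementation).

    Returns (waves, idle_threads_in_last_wave), where a "wave" is one round
    in which up to n_threads genomes are assembled side by side.
    """
    if n_genomes < 1 or n_threads < 1:
        raise ValueError("n_genomes and n_threads must both be >= 1")

    # Number of rounds needed to get through all genomes (ceiling division).
    waves = (n_genomes + n_threads - 1) // n_threads

    # Genomes running in the final round; a zero remainder means it is full.
    busy_in_last_wave = n_genomes % n_threads or n_threads

    return waves, n_threads - busy_in_last_wave


if __name__ == "__main__":
    # 349 MAGs is the batch size from this issue; 48 threads is hypothetical.
    print(parallel_schedule(349, 48))
    # Fewer genomes than threads: a single short wave with idle threads.
    print(parallel_schedule(5, 48))
```

Under these assumptions, a batch with fewer genomes than threads finishes in a single wave with some threads idle, which is the wasted-resources case mentioned above. Real assembly times vary per genome, so this only indicates the shape of the trade-off.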