franciscozorrilla / metaGEM

:gem: An easy-to-use workflow for generating context specific genome-scale metabolic models and predicting metabolic interactions within microbial communities directly from metagenomic data
https://franciscozorrilla.github.io/metaGEM/
MIT License

megahit (or other executable) in a background mode #105

Closed paristzou closed 2 years ago

paristzou commented 2 years ago

Dear Francisco, is it possible to run megahit in background mode? Should I modify this command in the Snakefile?

megahit -t {config[cores][megahit]} \
    --presets {config[params][assemblyPreset]} \
    --verbose \
    --min-contig-len {config[params][assemblyMin]} \
    -1 $(basename {input.R1}) \
    -2 $(basename {input.R2}) \
    -o tmp;

Thank you!!

franciscozorrilla commented 2 years ago

Hi @paristzou, could you tell me a bit about the setup you are using to run metaGEM? In your previous posts I saw you using the --local flag, which is not the recommended usage. metaGEM is intended to run on HPC nodes, so all jobs are submitted to the scheduler (e.g. slurm) and then run in "background mode" on the cluster. My recommendation is to run on the cluster without the --local flag. You probably don't have enough resources (i.e. cores + RAM) on a local machine to parallelize the assembly of a real-world metagenomic dataset in a reasonable amount of time.

A quick Google search suggests that you could try adding an & symbol at the end of the megahit command to get it running in the background, e.g.

megahit -t {config[cores][megahit]} --presets {config[params][assemblyPreset]} --verbose --min-contig-len {config[params][assemblyMin]} -1 $(basename {input.R1}) -2 $(basename {input.R2}) -o tmp &

paristzou commented 2 years ago

Dear Francisco, unfortunately I don't have an HPC cluster to run my work. I rely only on several of my own PCs, with 32 nodes. I am trying "&" and hope it works. I only hope that, once the megahit work finishes, the following commands will still run as written. Thanks so much. Paris
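
On whether the following commands run only after megahit finishes: in a shell, a trailing & returns control immediately, so the next command in the same script can start before the backgrounded job is done unless a wait comes in between. The sketch below is only an illustration of that shell behaviour, not metaGEM code. It assumes bash, uses sleep as a stand-in for megahit, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a long-running assembler such as megahit.
slow_step() {
    sleep 1
    echo "assembly finished"
}

# Variant 1: trailing & only. The next command runs immediately,
# before the background job has finished.
run_without_wait() {
    slow_step &
    echo "next step started"
    wait   # only here so the demo does not leave a stray job behind
}

# Variant 2: background the job, then block on it before continuing.
run_with_wait() {
    slow_step &
    wait "$!"   # $! holds the PID of the most recent background job
    echo "next step started"
}

run_without_wait
run_with_wait
```

In the first variant "next step started" is printed before "assembly finished"; in the second the order is reversed. If the goal is only to keep a long local run alive after logging out, an alternative that leaves the Snakefile untouched is to launch the whole workflow command under nohup, screen or tmux, so that the steps inside each rule still run in order.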