PedroMTQ / mantis

A package to annotate protein sequences
MIT License

Setup process running out of memory and spawning too many processes #34

Open VGalata opened 2 years ago

VGalata commented 2 years ago

Mantis seems to spawn more processes than there are available cores and to run out of memory during the setup step. I ran it with 5 cores and 20 GB of memory; during the metadata extraction with 5 workers, the job spawned additional sub-processes and crashed. The error message from the Slurm job was "out of memory".

I will try to run the job with 12 cores and 48 GB to see whether the memory issue appears again.
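
For reference, a minimal sketch of how a worker count could be capped at the cores actually allocated to a job. This is not Mantis's code: the function name `allowed_workers` is hypothetical, and the sketch assumes the job was submitted with `--cpus-per-task`, so that Slurm exports `SLURM_CPUS_PER_TASK`. Without that variable it falls back to the CPU affinity mask, and only then to the machine-wide core count.

```python
import os


def allowed_workers(requested=None, env=None):
    """Return a worker count capped at the cores allocated to this process.

    Hypothetical helper for illustration; not part of Mantis.
    """
    env = os.environ if env is None else env

    # Slurm exports SLURM_CPUS_PER_TASK when --cpus-per-task is given.
    slurm_cpus = env.get("SLURM_CPUS_PER_TASK", "")
    if slurm_cpus.isdigit() and int(slurm_cpus) > 0:
        available = int(slurm_cpus)
    elif hasattr(os, "sched_getaffinity"):
        # On Linux, the affinity mask reflects cgroup/taskset restrictions.
        available = len(os.sched_getaffinity(0))
    else:
        # Last resort: every core on the node, which may exceed the allocation.
        available = os.cpu_count() or 1

    if requested is None:
        return available
    # Never go above the allocation, never below one worker.
    return max(1, min(requested, available))
```

With a 5-core allocation, `allowed_workers(12, env={"SLURM_CPUS_PER_TASK": "5"})` returns 5, so a pool sized this way would not oversubscribe the job.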

Used version: 14f75ac

CMD:

python submodules/mantis/ setup_databases --mantis_config mantis/mantis.config

Config:

nog_dmnd_ref_folder=/work/projects/ecosystem_biology/data/mantis_references/NOG/
pfam_ref_folder=/work/projects/ecosystem_biology/data/mantis_references/pfam/
kofam_ref_folder=/work/projects/ecosystem_biology/data/mantis_references/kofam/
ncbi_ref_folder=/work/projects/ecosystem_biology/data/mantis_references/NCBI/
tcdb_ref_folder=/work/projects/ecosystem_biology/data/mantis_references/tcdb/
ncbi_weight=0.9
nog_weight=0.8
pfam_weight=0.9
uniprot_ec_weight=0.9

Conda YAML:

channels:
  - anaconda
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - cython=0.29.21
  - hmmer=3.3.1
  - nltk=3.5
  - numpy=1.19.1
  - psutil=5.7.2
  - python=3.8.5
  - requests=2.24.0
  - sqlite=3.33.0

Attachments: log file; screenshots from 2021-12-07 16-10-12 and 2021-12-07 16-14-37

VGalata commented 2 years ago

Update

The setup step worked with 12 cores and 48 GB. Here is the Slurm output for the job:

Job ID: 2564274
Cluster: iris
User/Group: vgalata/clusterusers
State: TIMEOUT (exit code 0)
Nodes: 1
Cores per node: 12
CPU Utilized: 06:37:48
CPU Efficiency: 55.13% of 12:01:36 core-walltime
Job Wall-clock time: 01:00:08
Memory Utilized: 23.96 GB
Memory Efficiency: 49.92% of 48.00 GB
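
The efficiency figures in this report can be reproduced from the other fields. The sketch below only redoes the arithmetic on the numbers shown above; the variable and function names are illustrative.

```python
def hms_to_seconds(hms):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


cores = 12
wall_s = hms_to_seconds("01:00:08")      # job wall-clock time
cpu_used_s = hms_to_seconds("06:37:48")  # CPU time summed over all cores

# Core-walltime is the wall-clock time multiplied by the allocated cores.
core_walltime_s = cores * wall_s         # 43296 s, i.e. 12:01:36

cpu_eff = 100 * cpu_used_s / core_walltime_s
mem_eff = 100 * 23.96 / 48.00
avg_busy_cores = cpu_used_s / wall_s

print(f"CPU efficiency: {cpu_eff:.2f}%")          # 55.13%
print(f"Memory efficiency: {mem_eff:.2f}%")       # 49.92%
print(f"Average busy cores: {avg_busy_cores:.1f}")  # 6.6
```

In other words, on average about 6.6 of the 12 allocated cores were busy, and peak memory use (23.96 GB) was above the 20 GB limit of the first run.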

The actual runtime:

This Assembler process finished running at 2021-12-07 17:15:13 and took 3400 seconds to complete.

The log output was almost the same as before. Here is the part that was missing from the previous run because of the crash:

Metadata will be extracted with 12 workers!
Concatenating files into  /mnt/irisgpfs/users/vgalata/projects/imp3/submodules/mantis/References/NOG/NOGG/metadata.tsv
#  Will now split data into chunks!
Checking which HMMs need to be split, this may take a while...
Will split:  []
Database will be split with 0 workers!
Checking which custom hmms need to be pressed
HMMs will be pressed with 0 workers!
Preparing NLP Resources!
#  Finished setting up databases!
#  This Assembler process finished running at 2021-12-07 17:15:13 and took 3400 seconds to complete.
##########################################################################################################################
# Thank you for using Mantis, please make sure you cite the respective paper https://doi.org/10.1093/gigascience/giab042 #
##########################################################################################################################

PedroMTQ commented 2 years ago

Thank you, @VGalata. I will look into this. Regards, Pedro