soedinglab / metaeuk

MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics
GNU General Public License v3.0

[Question] What are the recommended parameters to adjust for decreasing memory requirements? #59

Closed jolespin closed 1 year ago

jolespin commented 1 year ago

My MMseqs2 database is 17 GB. I allocated ~100 GB of memory with SLURM, but gene calling on a microeukaryote failed because of memory issues. Is there a way to lower the memory requirements of metaeuk easy-predict? I know it uses MMseqs2 in the backend, which may be able to do this with --split, so I wasn't sure whether that carries over.

I've seen this issue but I'm not sure if it was resolved: https://github.com/soedinglab/metaeuk/issues/37

milot-mirdita commented 1 year ago

--split-memory-limit 70G is likely what you need.

Set it to about 70% to 80% of the actual memory you want to make available to MetaEuk/MMseqs2, since it still needs per-thread memory that this parameter does not account for. The parameter automatically adjusts the number of splits to stay within the memory threshold.
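As a rough illustration of the 70% to 80% rule, here is a minimal Python sketch that derives a --split-memory-limit value from a memory allocation and assembles an easy-predict command line. The helper names (split_memory_limit, build_easy_predict_command) and the file names are hypothetical placeholders, not part of MetaEuk; the positional argument order (contigs, protein database, output prefix, temporary directory) is assumed from the MetaEuk README. The sketch only builds the argument list and does not run anything.

```python
def split_memory_limit(allocated_gb: int, percent: int = 70) -> str:
    """Return a value for --split-memory-limit, e.g. "70G".

    allocated_gb: memory granted to the job (for example by SLURM), in GB.
    percent: share of that memory to hand to the split logic; the thread
             suggests roughly 70 to 80 to leave room for per-thread memory.
    """
    if allocated_gb <= 0:
        raise ValueError("allocated_gb must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return f"{allocated_gb * percent // 100}G"


def build_easy_predict_command(contigs, proteins_db, out_prefix, tmp_dir,
                               allocated_gb, percent=70):
    """Assemble (but do not execute) a metaeuk easy-predict command."""
    return [
        "metaeuk", "easy-predict",
        contigs, proteins_db, out_prefix, tmp_dir,
        "--split-memory-limit", split_memory_limit(allocated_gb, percent),
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute your own inputs.
    cmd = build_easy_predict_command(
        "contigs.fasta", "proteinsDB", "preds", "tmp", allocated_gb=100
    )
    print(" ".join(cmd))
```

Since memory use is driven mainly by the database, the same percentage can be reused across runs that share a database and a memory allocation.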

jolespin commented 1 year ago

Is this mostly dependent on the database size or also the contigs going in as well?

milot-mirdita commented 1 year ago

Memory use is based on the database. The contigs/queries should play at most a minor role, and only in some modules.

jolespin commented 1 year ago

Perfect, so this should be easy to standardize for all my runs. Thanks for your help.