Open EmilieBruun opened 2 years ago
Hi @EmilieBruun ,
could you please share the specific emapper version and command used?
Thank you.
Best, Carlos
The command I used was:
emapper.py --mp_start_method forkserver -o $out_file --output_dir $outDir --override -m diamond --dmnd_ignore_warnings --dmnd_algo ctg -i $input_file --evalue 0.001 --score 60 --pident 40 --query_cover 20 --subject_cover 20 --itype proteins --tax_scope auto --target_orthologs all --go_evidence non-electronic --pfam_realign none --cpu 8
The emapper version is 2.1.6:
emapper-2.1.6 / Expected eggNOG DB version: 5.0.2 / Installed eggNOG DB version: 5.0.2 / Local diamond version: diamond version 2.0.11
Hi,
If you can reserve a bit more memory (maybe 44-48 GB, see https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.7#other-requirements), you could use the --dbmem option, which makes the annotation step much faster, especially for medium to large input data sets. Check https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.7#annotation-options
Besides that, you could check specific diamond options which are wrapped by eggnog-mapper, to try to make the search step faster. Check https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.7#diamond-search-options.
However, I guess that the --dbmem option is going to have more impact on running time.
I hope this is of help.
Best, Carlos
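For reference, the suggestion above could look like the following sketch: the command posted in this thread with --dbmem appended. The values assigned to out_file, outDir and input_file are hypothetical placeholders, and the memory request itself (44-48 GB per job, as suggested above) still has to be made through whatever scheduler is in use. The script only prints the command unless emapper.py is installed and the input file exists.

```shell
#!/usr/bin/env bash
# Sketch only: same options as the command in this thread, plus --dbmem.
# The three paths below are hypothetical placeholders, not values from the thread.
out_file="sample1"
outDir="./emapper_out"
input_file="./sample1.faa"

# Build the command as an array so that each argument is quoted safely.
cmd=(
  emapper.py --mp_start_method forkserver
  -o "$out_file" --output_dir "$outDir" --override
  -m diamond --dmnd_ignore_warnings --dmnd_algo ctg
  -i "$input_file" --itype proteins
  --evalue 0.001 --score 60 --pident 40
  --query_cover 20 --subject_cover 20
  --tax_scope auto --target_orthologs all
  --go_evidence non-electronic --pfam_realign none
  --cpu 8
  --dbmem   # load the eggNOG annotation DB into memory (needs the extra RAM)
)

# Print the command for review.
printf '%s ' "${cmd[@]}"; echo

# Run it only if emapper.py is on PATH and the input file exists.
if command -v emapper.py >/dev/null 2>&1 && [ -f "$input_file" ]; then
  "${cmd[@]}"
fi
```

Note that the memory requirement applies per job: five concurrent jobs at 44-48 GB each would need roughly 220-240 GB on the node in total.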
Thank you so much, this sped up the process a lot, and the jobs finished in around 10 hours instead of 7 days.
Glad to hear that! Thank you!
Hi,
I have outputs from prodigal (protein) that I would like to annotate with eggNOG. Each input file has about 1.5 M proteins. I have tried running 5 jobs on 1 node, requesting 38 GB of memory and 8 CPUs per job. However, this takes a long time to finish (~7 days), and only about 5% of the allocated CPU was used for each job. Do you have any idea why this is the case?
Thanks :) Best, Emilie