Closed liamfriar closed 9 months ago
Hi @liamfriar ,
The default parameters differ between the web and standalone versions. We made the web version more stringent to avoid false positives (FPs), whereas in the standalone version we expect users to adapt the parameters to their goals. Therefore, the local version has most of the filtering parameters disabled (or set to a minimum of 0).
I hope this makes sense and sorry for replying so late.
Best, Carlos
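One way to approximate stricter filtering on local results after the fact is to post-filter the annotation table. The following is only a minimal sketch: the column names (`evalue`, `score`) and the threshold values are placeholders chosen for illustration, not the web server's actual defaults, and `filter_hits` is a hypothetical helper rather than part of eggnog-mapper.

```python
import csv
import io


def filter_hits(tsv_text, evalue_col, score_col, max_evalue, min_score):
    """Keep rows whose hit passes both thresholds (hypothetical helper)."""
    # Drop blank lines and '##' metadata lines; the remaining first line is
    # treated as the header, with its leading '#' stripped.
    lines = [ln for ln in tsv_text.splitlines()
             if ln.strip() and not ln.startswith("##")]
    if not lines:
        return []
    lines[0] = lines[0].lstrip("#")
    reader = csv.DictReader(io.StringIO("\n".join(lines)),
                            delimiter="\t", quoting=csv.QUOTE_NONE)
    kept = []
    for row in reader:
        if (float(row[evalue_col]) <= max_evalue
                and float(row[score_col]) >= min_score):
            kept.append(row)
    return kept


# Made-up example table; column names and values are assumptions.
example = (
    "## hypothetical metadata line\n"
    "#query\tevalue\tscore\tPreferred_name\n"
    "seqA\t1e-50\t200.0\tpsbA\n"
    "seqB\t0.5\t25.0\t-\n"
)

# Placeholder thresholds, NOT the web version's settings.
kept = filter_hits(example, "evalue", "score", max_evalue=1e-5, min_score=60.0)
print([row["query"] for row in kept])
```

The column names should be adjusted to whatever header the installed emapper version writes, since the output layout has changed between releases.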
Thanks.
Hi,
Thank you for creating and maintaining this wonderful tool.
I have about 300k protein sequences from ~50 cyanobacteria that I annotated using the online resource (http://eggnog-mapper.embl.de/) and also a local implementation of eggnog-mapper. I was hoping that one would clearly outperform the other, or that they would be essentially the same. Unfortunately, they are quite different, and after looking around I am not confident which is preferable. Do you have any thoughts on when one might be preferable or why they might give different results?
Some comparisons: Online gave a "preferred_name" to 26% of sequences, whereas the local instance of emapper.py gave a "preferred_name" to 40% of sequences. While 40% would certainly be better than 26%, when I poked around at some specific genes of interest and compared them to expected results for gene copy number from related reference genomes, each appears to be more "accurate" for different genes.
Online I used all default parameters:
I think all of the parameters were the same when I ran emapper.py locally, although the version is different and some of the argument flags have changed.
./emapper.py --data_dir /path/to/miniconda3/envs/eggnog-mapper/data -m diamond -i $infile -o $short_prefix
Version: /path/to/miniconda3/envs/eggnog-mapper/lib/python2.7/site-packages emapper-2.0.1
Installed on July 18, 2023:
Any general thoughts on the local vs. online implementations or on the specific information I have given above would be hugely appreciated.
Thank you!
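For anyone wanting to reproduce the kind of comparison described in this post, here is a rough sketch that tallies, per query sequence, whether the online and local runs agree on the preferred name. It assumes both outputs are tab-separated tables with a `#`-prefixed header line, and that unnamed sequences carry an empty field or `-`. The column names and the helper functions are hypothetical and would need adapting to the actual files.

```python
import csv
import io
from collections import Counter

# Assumption: a sequence without a preferred name has "" or "-" in that column.
MISSING = {"", "-"}


def preferred_names(tsv_text, query_col, name_col):
    """Map each query ID to its preferred name, or None if unnamed."""
    lines = [ln for ln in tsv_text.splitlines()
             if ln.strip() and not ln.startswith("##")]
    if not lines:
        return {}
    lines[0] = lines[0].lstrip("#")  # header line such as "#query\t..."
    reader = csv.DictReader(io.StringIO("\n".join(lines)),
                            delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row[query_col]: (None if row[name_col] in MISSING else row[name_col])
            for row in reader}


def compare(online, local):
    """Count agreement categories over all queries seen in either run."""
    tally = Counter()
    for query in set(online) | set(local):
        a, b = online.get(query), local.get(query)
        if a and b:
            tally["same_name" if a == b else "different_name"] += 1
        elif a:
            tally["online_only"] += 1
        elif b:
            tally["local_only"] += 1
        else:
            tally["unnamed_in_both"] += 1
    return tally


# Made-up example data, not real eggnog-mapper output.
online_tsv = "#query\tPreferred_name\nseq1\tpsbA\nseq2\t-\nseq3\trbcL\n"
local_tsv = "#query\tPreferred_name\nseq1\tpsbA\nseq2\tkaiC\nseq3\trbcS\nseq4\t-\n"

tally = compare(preferred_names(online_tsv, "query", "Preferred_name"),
                preferred_names(local_tsv, "query", "Preferred_name"))
print(dict(tally))
```

The `different_name` and `local_only` groups are probably the most informative ones to inspect by hand against reference genomes, since they show where the less stringent local defaults add or change annotations.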