eggnogdb / eggnog-mapper

Fast genome-wide functional annotation through orthology assignment
http://eggnog-mapper.embl.de
GNU Affero General Public License v3.0

Changing headers, versions, etc. #302

Open nextgenusfs opened 3 years ago

nextgenusfs commented 3 years ago

My genome annotation software (funannotate) can parse eggNOG annotation data -- this is an optional input, and I have supported the output for several years. I see now that in v2.1.2 the headers changed again from v2.1.0 -- it is difficult for me to keep supporting it if the format/headers are going to change constantly. I would also appreciate a larger version bump when the format changes, i.e. breaks backwards compatibility; perhaps this should have been v2.2.0 -- but that isn't really all that important, it would just make it a little easier to understand how drastic the changes are.

I also had a user who is getting headers that have a hash as the version. That is impossible to parse, so I can't tell which version of the annotation file format I should be parsing. For example:

$ emapper.py --version
emapper-e6ac7f2 / Expected eggNOG DB version: 5.0.2 / Installed eggNOG DB version: 5.0.2 / Local diamond version: diamond version 2.0.4 / Local MMseqs2 version: 113e3212c137d026e297c7540e1fcd039f6812b1

And:

## Fri May 14 17:34:25 2021
## emapper-e6ac7f2
## funannotate/bin/emapper.py -m diamond -i genome.proteins.fasta -o eggnog --cpu 8
##
#query  seed_ortholog   evalue  score   eggNOG_OGs  max_annot_lvl   COG_category    Description Preferred_name  GOs EC  KEGG_ko KEGG_Pathway    KEGG_Module KEGG_Reaction   KEGG_rclass BRITE   KEGG_TC CAZy    BiGG_Reaction   PFAMs
FUN_000001-T1   64363.EME43602  3.5e-244    685.0   KOG0254@1|root,KOG0254@2759|Eukaryota,39TDK@33154|Opisthokonta,3NWTD@4751|Fungi,3QM8J@4890|Ascomycota,1ZZI9@147541|Dothideomycetes,3MH72@451867|Dothideomycetidae   4751|Fungi  U   Major Facilitator Superfamily   -   GO:0003674,GO:0005215,GO:0005575,GO:0005623,GO:0005886,GO:0006810,GO:0006811,GO:0006812,GO:0008150,GO:0008324,GO:0008519,GO:0015075,GO:0015696,GO:0016020,GO:0016021,GO:0022857,GO:0031224,GO:0034220,GO:0044425,GO:0044464,GO:0051179,GO:0051234,GO:0055085,GO:0071705,GO:0071944,GO:0072488,GO:0098655    -   -   -   -   -   -   -   -   -   -   MFS_1,Pkinase

I'm not entirely sure how this user installed this version or if this only shows up if you install master from git or something of that nature.
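For illustration, here is a minimal sketch of how a downstream parser might classify the version line in the `##` header. The function name and regular expressions are hypothetical and not part of eggnog-mapper or funannotate. It assumes a tagged release writes something like `## emapper-2.1.2`, and it treats anything else after `emapper-` (such as the hash above) as unknown.

```python
import re

# Hypothetical helper for illustration only.
# Assumption: the version header line has the form "## emapper-<token>".
VERSION_LINE = re.compile(r"^##\s*emapper-(\S+)")
# Assumption: a release token looks like "2.1.2" or "v2.1.2".
RELEASE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_emapper_version(lines):
    """Classify the emapper version found in the leading '##' header lines.

    Returns ("release", (major, minor, patch)) for a dotted version,
    ("unknown", raw_token) for anything else (e.g. a commit hash),
    or (None, None) if no version line is present.
    """
    for line in lines:
        if not line.startswith("##"):
            break  # the leading header comments are over
        match = VERSION_LINE.match(line)
        if match:
            raw = match.group(1)
            release = RELEASE.match(raw)
            if release:
                return "release", tuple(int(part) for part in release.groups())
            return "unknown", raw
    return None, None


header = [
    "## Fri May 14 17:34:25 2021",
    "## emapper-e6ac7f2",
    "##",
]
kind, value = parse_emapper_version(header)
print(kind, value)
```

With a hash-only header the caller only learns that the version is unknown, so it would still have to fall back on inspecting the column names.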

So my real question is: are there additional plans to modify the columns/output of the annotations file?
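Since the `#query` line names every column, one way a downstream parser could tolerate reordered or added columns is to key each row by those names rather than by position. The sketch below is hypothetical (`read_annotations` is not an existing funannotate or eggnog-mapper function) and assumes the file is tab-separated, with `##` comment lines and a single `#`-prefixed header line as in the sample above.

```python
import io


def read_annotations(handle):
    """Yield one dict per annotation row, keyed by the file's own column names.

    Assumptions: fields are tab-separated, lines starting with '##' are
    comments, and the single line starting with one '#' is the column header.
    """
    header = None
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and comment lines
        if line.startswith("#"):
            header = line.lstrip("#").split("\t")
            continue
        if header is None:
            raise ValueError("annotation row found before the '#query' header line")
        yield dict(zip(header, line.split("\t")))


# Shortened, made-up subset of the columns shown above.
sample = (
    "## emapper-e6ac7f2\n"
    "#query\tseed_ortholog\tPFAMs\n"
    "FUN_000001-T1\t64363.EME43602\tMFS_1,Pkinase\n"
)
for row in read_annotations(io.StringIO(sample)):
    print(row["query"], row.get("PFAMs", "-"))
```

Looking fields up with `row.get(name, "-")` lets the caller treat a column that is absent in a given version the same way as an empty annotation.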

Cantalapiedra commented 3 years ago

Hi @nextgenusfs ,

We are sorry for all the trouble. As we are in a development cycle, trying to address users' and reviewers' comments, we cannot guarantee that there will be no changes in the output format. Hopefully, the current one will be kept at least for the short/mid term. This is something that we also want, since it simplifies documentation, support, and also the maintenance and design of our own downstream scripts, besides of course making things easier for other developers and users.

To try to minimize the hassle, we are making an effort to track the changes for each version in the eggnog-mapper wiki. Regarding the hash version, yes, I guess that this is from a cloned version. We will check whether we can also show the tag version, which would definitely help a lot.

Thank you for your patience.

Best, Carlos

Cantalapiedra commented 3 years ago

After looking into the version issue, it looks like the hash-only version reported would be shown when no tags are available.

nextgenusfs commented 3 years ago

Thanks @Cantalapiedra -- didn't mean to sound like an annoying user above -- I was just frustrated. I know you guys are putting in lots of work and have many different users with a thousand different opinions, and it is impossible to satisfy everybody.

So the hash only is a result of somebody not running git fetch or something on the repo?

Cantalapiedra commented 3 years ago

It is completely understandable @nextgenusfs . Let's hope that we can manage to keep the formats stable. For us it is also fantastic that you are supporting the tool on your side.

To be honest, I was reviewing the code, and given the current ways to install the tool I am not sure in which cases the "only-hash" version is being shown. With the last commit I hope that, if it happens again in the future, it will also contain the actual version (v2.1.3 for instance), but it would be nice to identify when this happens.

Best, Carlos

spock commented 3 years ago

@Cantalapiedra, this happened when eggnog-mapper was auto-installed by conda as a dependency of funannotate. (To be more specific it was mamba and not conda, but that should not have any effect on the outcome.)

Cantalapiedra commented 3 years ago

Ah ok @spock , thank you very much! I guess it is being installed from bioconda and not from PyPI. We need to test this, since we have no control over the bioconda distribution, and thus we never tested it. So far, I don't understand why it shows the only-hash version after being installed this way. Maybe we should remove it completely, and only show either the tag+hash or the hardcoded version.
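As a rough sketch of that last idea (not the actual eggnog-mapper code), the version string could be built from `git describe --tags --always`, which falls back to a bare abbreviated hash when no tag is reachable, and switch to a hardcoded version in that case. The function names and the placeholder version value below are hypothetical.

```python
import re
import subprocess

HARDCODED_VERSION = "2.1.3"  # placeholder value for illustration


def describe_checkout(repo_dir):
    """Return `git describe --tags --always` output, or None if git cannot be used."""
    try:
        result = subprocess.run(
            ["git", "describe", "--tags", "--always"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, bad directory, or not a git checkout
    return result.stdout.strip() or None


def version_string(described, hardcoded=HARDCODED_VERSION):
    """Prefer a tag-based description; never report a bare commit hash."""
    if described is None or re.fullmatch(r"[0-9a-f]{4,40}", described):
        # No git information, or only a hash (no tags available).
        return f"emapper-{hardcoded}"
    return f"emapper-{described}"


print(version_string("e6ac7f2"))
print(version_string("v2.1.2-3-ge6ac7f2"))
```

Keeping `version_string` separate from the `git` call means the fallback logic can be checked without a repository, which matters for installs (such as a packaged distribution) where no `.git` directory or tags may be present.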

spock commented 3 years ago

exactly, bioconda channel.

installing (previous version) from pypi had the expected "2.1.1" version string.

Cantalapiedra commented 3 years ago

Thank you @spock !