eggnogdb / eggnog-mapper

Fast genome-wide functional annotation through orthology assignment
http://eggnog-mapper.embl.de
GNU Affero General Public License v3.0

Truncated header in annotation output and new OGs names #135

Closed nigiord closed 5 years ago

nigiord commented 5 years ago

Thank you for improving eggnog-mapper. I noticed that the output of the annotation phase has changed considerably.

In the current version of master (1.0.3-4), the header is now:

But according to the new documentation (https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2#Whats_new_in_eggNOGmapper_v2), we are missing:

In fact, when we look at the result file in detail, the columns seem to be present; only the header is truncated.

https://github.com/eggnogdb/eggnog-mapper/blob/41a84980536d189d6f6325107ad39d8fd7625fef/eggnogmapper/common.py#L20

On a related topic: I'm interested in the COG IDs in the NOG database. Previously (in version 1.0.3-3) the IDs were written like this: 08PBW@bactNOG,0QNWK@gproNOG,176HX@proNOG,COG1167@NOG so I was parsing for COGXXXX@NOG in the OG field. Now the database names seem to have been replaced by numbers. For instance: 22VXN@171551,2FM1W@200643,4NEWZ@976,COG0088@1,COG0088@2

What are those numbers and what should I parse for in this field?

Cheers, Nils

jhcepas commented 5 years ago

Thanks for the report, Nils.

  1. True, the header line is incomplete in the current output. We will fix it as soon as possible.
  2. The taxonomic levels in eggNOG v5 are now identified by NCBI taxid rather than by a code (e.g. euNOG). @2 stands for Bacteria (bactNOG) and @1 means LUCA (previously called NOG).
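
For anyone adapting a parser to the new identifiers, here is a minimal Python sketch of splitting the OG field into (OG ID, level) pairs and keeping the COGs at a given level. The function names `parse_ogs` and `cogs_at_level` are hypothetical and not part of eggnog-mapper; the only level meanings assumed are the two stated above (@1 for LUCA, @2 for Bacteria), with the old `NOG` label accepted as well so that older output still parses.

```python
def parse_ogs(field):
    """Split an OG field like 'COG0088@1,COG0088@2' into (og_id, level) pairs."""
    pairs = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        # Everything before the first '@' is the OG ID, the rest is the level.
        og_id, _, level = item.partition("@")
        pairs.append((og_id, level))
    return pairs


def cogs_at_level(field, levels=("1", "NOG")):
    """Return COG IDs annotated at any of the given levels.

    The default covers the new root label ('1', LUCA) and the old one ('NOG').
    """
    return [og for og, lvl in parse_ogs(field)
            if lvl in levels and og.startswith("COG")]


new_style = "22VXN@171551,2FM1W@200643,4NEWZ@976,COG0088@1,COG0088@2"
old_style = "08PBW@bactNOG,0QNWK@gproNOG,176HX@proNOG,COG1167@NOG"

print(cogs_at_level(new_style))            # ['COG0088']
print(cogs_at_level(old_style))            # ['COG1167']
print(cogs_at_level(new_style, ("2",)))    # ['COG0088']
```

Matching on the level after the `@` rather than on a fixed string such as `COGXXXX@NOG` keeps the same code working for both the old and the new output.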
jhcepas commented 5 years ago

Now solved in release 2.0.0.

matrs commented 3 years ago

Just wanted to say that I ran a job here: http://eggnog-mapper.embl.de/ and I understood it to be version 2, because that is what the website says under "Method overview". Checking the output annotation file, I noticed that it has the problem described here, and the same file reads: # emapper version: emapper-1.0.3-35-g63c274b emapper DB: 2.0
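
To check which versions a given annotation file reports, the metadata line quoted above can be picked apart with a regular expression. This is only a sketch: it assumes the line has exactly the shape shown in this comment, and the helper name `parse_version_line` is hypothetical. Other eggnog-mapper versions may format this line differently.

```python
import re

# Pattern modelled on the single example line quoted in this thread.
VERSION_RE = re.compile(r"emapper version:\s*(\S+)\s+emapper DB:\s*(\S+)")


def parse_version_line(line):
    """Return (emapper_version, db_version), or None if the line does not match."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


line = "# emapper version: emapper-1.0.3-35-g63c274b emapper DB: 2.0"
print(parse_version_line(line))
# ('emapper-1.0.3-35-g63c274b', '2.0')
```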

Cantalapiedra commented 3 years ago

Hi @matrs,

Thank you for reporting. There has been some confusion with emapper and eggnog DB versions, due to this metadata included in the outputs. I would say the web version uses eggnog-mapper 2.0 and eggnog DB 5.0.

Best, Carlos

matrs commented 3 years ago

Hello, but my point is that the output from the web server has this problem, an incomplete header, so I'm guessing that the version the server is running predates this fix (17 fields in the header, 22 in the results).

awk -F '\t' 'NR<5 {print NF}' query_seqs.fa.emapper.annotations.tsv 
17
22
22
22
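
The same check can be scripted in Python, for example as a guard before parsing. The sketch below counts tab-separated fields on the first lines of a file, like the awk command above, and flags a header that is shorter than the data rows. The function names are hypothetical and the sample data is synthetic, built only to reproduce the 17 versus 22 field counts reported here.

```python
import io


def field_counts(handle, max_lines=4):
    """Count tab-separated fields on the first max_lines lines of a file."""
    counts = []
    for line_number, line in enumerate(handle, start=1):
        if line_number > max_lines:
            break
        stripped = line.rstrip("\r\n")
        # Like awk's NF, report 0 fields for an empty line.
        counts.append(len(stripped.split("\t")) if stripped else 0)
    return counts


def header_is_truncated(counts):
    """True if any data row has more fields than the first (header) line."""
    return len(counts) > 1 and any(n > counts[0] for n in counts[1:])


# Synthetic stand-in for an annotations file: 17 header fields, 22 per row.
sample = io.StringIO(
    "\t".join(["h"] * 17) + "\n"
    + ("\t".join(["x"] * 22) + "\n") * 3
)
counts = field_counts(sample)
print(counts)                       # [17, 22, 22, 22]
print(header_is_truncated(counts))  # True
```

With a real file, pass an open handle instead of the `io.StringIO` object, for example `open("query_seqs.fa.emapper.annotations.tsv")`.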
Cantalapiedra commented 3 years ago

Ah sorry, I didn't understand. Yes, the server version will be updated soon (hopefully), although it is likely that both the header and the fields will be more similar to those of the current "refactor" version.

Thank you again for reporting this.

Best, Carlos

athulmenon commented 3 years ago

Hi,

I think the headers still have this issue in the web version. Will it be fixed in this update? Could you tell me what the header names are likely to be? I would like to parse the output as input to another program.

Thanks, Athul

Cantalapiedra commented 3 years ago

Hi,

We hope to update the web app soon, fixing this problem. I don't know the header names yet, but the columns will likely be similar to what you can find here:

https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.0.2-v2.0.6#Annotations_file

Sorry for any inconvenience.

Best, Carlos