nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Empty eggnogg.annotations.txt - EggNog Parse ERROR #892

Closed CesarBertinetti closed 1 year ago

CesarBertinetti commented 1 year ago

Are you using the latest release? Yes

Describe the bug I am missing annotations for certain genes, which do not seem to be parsed properly in the EggNog step. Any advice would be helpful; I'm almost there, just missing some important genes.

What command did you issue?

funannotate annotate -i ${tmp_dir} --cpus $NSLOTS --busco_db actinopterygii --iprscan bluegill1.proteins.fa.xml \
    -s "bluegill"

Interproscan is run externally using:

./interproscan.sh -i /scratch365/cbertine/funanotate/BLG_EYE/fun/predict_results/bluegill1.proteins.fa \
    -f XML -goterms -pa --cpu $NSLOTS

Logfiles - Annotate command seems fine:

[Mar 13 12:15 PM]: OS: Red Hat Enterprise Linux 8.7, 24 cores, ~ 263 GB RAM. Python: 3.8.15
[Mar 13 12:15 PM]: Running 1.8.14
[Mar 13 12:15 PM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Mar 13 12:15 PM]: Parsing input files
[Mar 13 12:15 PM]: Existing tbl found: /tmp/185292.1.long/BLG_EYETRANS_ANNOT_V1/update_results/Lepomis_machrochirus.tbl
[Mar 13 12:19 PM]: Adding Functional Annotation to Lepomis machrochirus, NCBI accession: None
[Mar 13 12:19 PM]: Annotation consists of: 44,391 gene models
[Mar 13 12:19 PM]: 44,064 protein records loaded
[Mar 13 12:19 PM]: Running HMMer search of PFAM version 35.0
[Mar 13 12:32 PM]: 44,578 annotations added
[Mar 13 12:32 PM]: Running Diamond blastp search of UniProt DB version 2022_05
[Mar 13 12:33 PM]: 9,394 valid gene/product annotations from 12,206 total
[Mar 13 12:33 PM]: Running Eggnog-mapper
[Mar 13 01:17 PM]: Parsing EggNog Annotations
[Mar 13 01:17 PM]: EggNog version parsed as 2.1.10
[Mar 13 01:17 PM]: Combining UniProt/EggNog gene and product names using Gene2Product version 1.88
[Mar 13 01:17 PM]: 9,394 gene name and product description annotations added
[Mar 13 01:17 PM]: Running Diamond blastp search of MEROPS version 12.0
[Mar 13 01:18 PM]: 1,413 annotations added
[Mar 13 01:18 PM]: Annotating CAZYmes using HMMer search of dbCAN version 11.0
[Mar 13 01:26 PM]: 364 annotations added
[Mar 13 01:26 PM]: Annotating proteins with BUSCO actinopterygii models
[Mar 13 01:36 PM]: 4,482 annotations added
[Mar 13 01:36 PM]: Skipping phobius predictions, try funannotate remote -m phobius
[Mar 13 01:36 PM]: Skipping secretome: neither SignalP nor Phobius searches were run
[Mar 13 01:36 PM]: 0 secretome and 0 transmembane annotations added
[Mar 13 01:36 PM]: Parsing InterProScan5 XML file
[Mar 13 01:57 PM]: Found 0 duplicated annotations, adding 999,840 valid annotations
[Mar 13 01:58 PM]: Converting to final Genbank format, good luck!
[Mar 13 02:03 PM]: Creating AGP file and corresponding contigs file
[Mar 13 02:04 PM]: Writing genome annotation table.
[Mar 13 02:22 PM]: Funannotate annotate has completed successfully!

    We need YOUR help to improve gene names/product descriptions:
       0 gene/products names MUST be fixed, see /tmp/185292.1.long/BLG_EYETRANS_ANNOT_V1/annotate_results/Gene2Products.must-fix.txt
       0 gene/product names need to be curated, see /tmp/185292.1.long/BLG_EYETRANS_ANNOT_V1/annotate_results/Gene2Products.need-curating.txt
       749 gene/product names passed but are not in Database, see /tmp/185292.1.long/BLG_EYETRANS_ANNOT_V1/annotate_results/Gene2Products.new-names-passed.txt

    Please consider contributing a PR at https://github.com/nextgenusfs/gene2product

- But many genes show the EggNog Parse ERROR (just one example):

[03/28/23 11:40:11]: EggNog Parse ERROR: FUN_057458-T1 215358.XP_010734348.1 2.11e-289 792.0 COG2126@1|root,KOG3713@2759|Eukaryota,38HR0@33154|Opisthokonta,3BC0S@33208|Metazoa,3CX9N@33213|Bilateria,48437@7711|Chordata,497IX@7742|Vertebrata,4A0YR@7898|Actinopterygii 33208|Metazoa P Potassium voltage-gated channel, subfamily G, member KCNG3 GO:0003674,GO:0005215,GO:0005216,GO:0005244,GO:0005249,GO:0005251,GO:0005261,GO:0005267,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005783,GO:0005886,GO:0006810,GO:0006811,GO:0006812,GO:0006813,GO:0008150,GO:0008324,GO:0009987,GO:0012505,GO:0015075,GO:0015077,GO:0015079,GO:0015267,GO:0015318,GO:0015672,GO:0016020,GO:0022803,GO:0022832,GO:0022836,GO:0022838,GO:0022839,GO:0022843,GO:0022857,GO:0022890,GO:0030001,GO:0034220,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0044424,GO:0044444,GO:0044464,GO:0046873,GO:0051179,GO:0051234,GO:0055085,GO:0071804,GO:0071805,GO:0071944,GO:0098655,GO:0098660,GO:0098662 - ko:K04902 - - - - ko00000,ko04040 1.A.1.2.17 - - BTB_2,Ion_trans

- The eggnog.annotations.txt in predict_misc is empty

OS/Install Information

Checking dependencies for 1.8.14

You are running Python v 3.8.15. Now checking python packages...
biopython: 1.76
goatools: 1.2.3
matplotlib: 3.4.3
natsort: 8.2.0
numpy: 1.24.2
pandas: 1.5.3
psutil: 5.7.0
requests: 2.28.2
scikit-learn: 1.2.1
scipy: 1.10.0
seaborn: 0.12.2
All 11 python packages installed

You are running Perl v b'5.032001'. Now checking perl modules...
Carp: 1.50
Clone: 0.46
DBD::SQLite: 1.72
DBD::mysql: 4.046
DBI: 1.643
DB_File: 1.855
Data::Dumper: 2.183
File::Basename: 2.85
File::Which: 1.24
Getopt::Long: 2.54
Hash::Merge: 0.302
JSON: 4.10
LWP::UserAgent: 6.67
Logger::Simple: 2.0
POSIX: 1.94
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.14
Tie::File: 1.06
URI::Escape: 5.12
YAML: 1.30
threads: 2.25
threads::shared: 1.61
ERROR: local::lib not installed, install with cpanm local::lib

Checking Environmental Variables...
$FUNANNOTATE_DB=/scratch365/cbertine/funanotate/funannotate_db
$PASAHOME=/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/opt/pasa-2.5.2
$TRINITY_HOME=/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/config/
ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir

Checking external dependencies...
ERROR: pslDnaFiler found but error running: pslCDnaFilter: error while loading shared libraries: libssl.so.1.0.0: cannot open shared object file: No such file or directory

PASA: 2.5.2
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.5.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.0.15
emapper.py: 2.1.1
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: 36.3.8g
glimmerhmm: 3.0.4
gmap: 2021-08-25
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 17.0.3-internal
kallisto: 0.46.1
mafft: v7.515 (2023/Jan/15)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
pigz: 2.6
proteinortho: 6.1.7
salmon: salmon 0.14.1
samtools: samtools 1.16.1
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.11 (Oct 2022)
tantan: tantan 40
tbl2asn: 25.8
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: gmes_petap.pl not installed
ERROR: pslCDnaFilter not installed
ERROR: signalp not installed

nextgenusfs commented 1 year ago

Thanks for the detailed report. Eggnog-mapper must have changed formats again... sigh.

CesarBertinetti commented 1 year ago

That's what I imagined from going through previous issues. I'm just not versed enough to figure out how to reformat it...

nextgenusfs commented 1 year ago

We probably have to write yet another parser for the new format. Can you include the header and the first 10 records from the eggnog annotations output so we can see what needs to change in the parser?
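
As a rough illustration of what a format-tolerant parser could look like (a sketch only, not funannotate's actual code), the column positions can be read from the header line instead of being hard-coded per emapper version. The function name `read_emapper_annotations` and the shortened sample text are hypothetical; the sketch assumes a tab-separated file whose metadata lines start with `##` and whose header starts with `query` or `#query`.

```python
import csv
import io


def read_emapper_annotations(handle):
    """Yield one dict per record, keyed by the column names found in the header.

    Assumption: tab-separated input, '##' metadata lines, and a header line
    whose first field is 'query' or '#query'.
    """
    header = None
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row or row[0].startswith("##"):
            continue  # skip blank lines and '##' metadata lines
        if header is None:
            first = row[0].lstrip("#")
            if first != "query":
                raise ValueError("header line not found before first record")
            header = [first] + row[1:]
            continue
        if len(row) != len(header):
            raise ValueError(
                f"{row[0]}: expected {len(header)} columns, got {len(row)}"
            )
        yield dict(zip(header, row))


# Shortened, made-up sample with only four of the real columns.
sample = (
    "## emapper-2.1.10\n"
    "#query\tseed_ortholog\tPreferred_name\tPFAMs\n"
    "FUN_000024-T1\t215358.XP_010753035.1\tNAT16\tAcetyltransf_1\n"
)
records = list(read_emapper_annotations(io.StringIO(sample)))
print(records[0]["Preferred_name"])
```

Looking fields up by name means a column added or moved in a later release would not shift the values silently, and a record with the wrong number of fields fails with a message that names the query.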

CesarBertinetti commented 1 year ago

That would be awesome. Thanks for your quick reply.

eggnog.emapper.annotations-10records.txt

## Tue Mar 28 11:30:07 2023
## emapper-2.1.10
## /afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/bin/emapper.py -m diamond -i /tmp/219324.1.long/annotate_misc/genome.proteins.fasta -o eggnog --cpu 10 --scratch_dir /tmp/emapper-43affc2b --temp_dir /tmp --dbmem

query seed_ortholog evalue score eggNOG_OGs max_annot_lvl COG_category Description Preferred_name GOs EC KEGG_ko KEGG_Pathway KEGG_Module KEGG_Reaction KEGG_rclass BRITE KEGG_TC CAZy BiGG_Reaction PFAMs

FUN_000008-T1 215358.XP_010754240.1 1.73e-117 350.0 COG1960@1|root,KOG0135@2759|Eukaryota,39M88@33154|Opisthokonta,3B9Y8@33208|Metazoa,3CV4R@33213|Bilateria,483B5@7711|Chordata,4915E@7742|Vertebrata,49RG3@7898|Actinopterygii 33208|Metazoa IQ acyl-coenzyme A oxidase 3 ACOX3 GO:0000166,GO:0003674,GO:0003824,GO:0003995,GO:0003997,GO:0005102,GO:0005488,GO:0005504,GO:0005515,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005777,GO:0005782,GO:0005829,GO:0006082,GO:0006605,GO:0006625,GO:0006629,GO:0006631,GO:0006635,GO:0006810,GO:0006886,GO:0006996,GO:0007031,GO:0008104,GO:0008150,GO:0008152,GO:0008289,GO:0009056,GO:0009062,GO:0009987,GO:0015031,GO:0015833,GO:0016042,GO:0016043,GO:0016054,GO:0016402,GO:0016491,GO:0016627,GO:0016634,GO:0019395,GO:0019752,GO:0030258,GO:0031406,GO:0031907,GO:0031974,GO:0032787,GO:0033036,GO:0033293,GO:0033365,GO:0033539,GO:0033540,GO:0034440,GO:0034613,GO:0036094,GO:0042579,GO:0042886,GO:0043167,GO:0043168,GO:0043177,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0043233,GO:0043436,GO:0043574,GO:0044237,GO:0044238,GO:0044242,GO:0044248,GO:0044255,GO:0044281,GO:0044282,GO:0044422,GO:0044424,GO:0044438,GO:0044439,GO:0044444,GO:0044446,GO:0044464,GO:0045184,GO:0046395,GO:0046907,GO:0048037,GO:0050660,GO:0050662,GO:0051179,GO:0051234,GO:0051641,GO:0051649,GO:0055114,GO:0070013,GO:0070727,GO:0071702,GO:0071704,GO:0071705,GO:0071840,GO:0072329,GO:0072594,GO:0072662,GO:0072663,GO:0097159,GO:1901265,GO:1901363,GO:1901575 1.3.3.ko:K00232 ko00071,ko00592,ko01040,ko01100,ko01110,ko01212,ko03320,ko04024,ko04146,map00071,map00592,map01040,map01100,map01110,map01212,map03320,map04024,map04146 M00087,M00113 R01175,R01279,R03777,R03857,R03990,R04751,R04754,R07888,R07892,R07896,R07934,R07950 RC00052,RC00076 ko00000,ko00001,ko00002,ko01000 - - - ACOX,Acyl-CoA_dh_1,Acyl-CoA_dh_M,Acyl-CoA_ox_N FUN_000010-T1 69293.ENSGACP00000000988 2.41e-103 318.0 
COG1960@1|root,KOG0135@2759|Eukaryota,39M88@33154|Opisthokonta,3B9Y8@33208|Metazoa,3CV4R@33213|Bilateria,483B5@7711|Chordata,4915E@7742|Vertebrata,49RG3@7898|Actinopterygii 33208|Metazoa IQ acyl-coenzyme A oxidase 3 ACOX3 GO:0000166,GO:0003674,GO:0003824,GO:0003995,GO:0003997,GO:0005102,GO:0005488,GO:0005504,GO:0005515,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005777,GO:0005782,GO:0005829,GO:0006082,GO:0006605,GO:0006625,GO:0006629,GO:0006631,GO:0006635,GO:0006810,GO:0006886,GO:0006996,GO:0007031,GO:0008104,GO:0008150,GO:0008152,GO:0008289,GO:0009056,GO:0009062,GO:0009987,GO:0015031,GO:0015833,GO:0016042,GO:0016043,GO:0016054,GO:0016402,GO:0016491,GO:0016627,GO:0016634,GO:0019395,GO:0019752,GO:0030258,GO:0031406,GO:0031907,GO:0031974,GO:0032787,GO:0033036,GO:0033293,GO:0033365,GO:0033539,GO:0033540,GO:0034440,GO:0034613,GO:0036094,GO:0042579,GO:0042886,GO:0043167,GO:0043168,GO:0043177,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0043233,GO:0043436,GO:0043574,GO:0044237,GO:0044238,GO:0044242,GO:0044248,GO:0044255,GO:0044281,GO:0044282,GO:0044422,GO:0044424,GO:0044438,GO:0044439,GO:0044444,GO:0044446,GO:0044464,GO:0045184,GO:0046395,GO:0046907,GO:0048037,GO:0050660,GO:0050662,GO:0051179,GO:0051234,GO:0051641,GO:0051649,GO:0055114,GO:0070013,GO:0070727,GO:0071702,GO:0071704,GO:0071705,GO:0071840,GO:0072329,GO:0072594,GO:0072662,GO:0072663,GO:0097159,GO:1901265,GO:1901363,GO:1901575 1.3.3.6 ko:K00232 ko00071,ko00592,ko01040,ko01100,ko01110,ko01212,ko03320,ko04024,ko04146,map00071,map00592,map01040,map01100,map01110,map01212,map03320,map04024,map04146 M00087,M00113 R01175,R01279,R03777,R03857,R03990,R04751,R04754,R07888,R07892,R07896,R07934,R07950 RC00052,RC00076 ko00000,ko00001,ko00002,ko01000 - - - ACOX,Acyl-CoA_dh_1,Acyl-CoA_dh_M,Acyl-CoA_ox_N FUN_000016-T1 69293.ENSGACP00000000988 1.08e-13 72.4 
COG1960@1|root,KOG0135@2759|Eukaryota,39M88@33154|Opisthokonta,3B9Y8@33208|Metazoa,3CV4R@33213|Bilateria,483B5@7711|Chordata,4915E@7742|Vertebrata,49RG3@7898|Actinopterygii 33208|Metazoa IQ acyl-coenzyme A oxidase 3 ACOX3 GO:0000166,GO:0003674,GO:0003824,GO:0003995,GO:0003997,GO:0005102,GO:0005488,GO:0005504,GO:0005515,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005777,GO:0005782,GO:0005829,GO:0006082,GO:0006605,GO:0006625,GO:0006629,GO:0006631,GO:0006635,GO:0006810,GO:0006886,GO:0006996,GO:0007031,GO:0008104,GO:0008150,GO:0008152,GO:0008289,GO:0009056,GO:0009062,GO:0009987,GO:0015031,GO:0015833,GO:0016042,GO:0016043,GO:0016054,GO:0016402,GO:0016491,GO:0016627,GO:0016634,GO:0019395,GO:0019752,GO:0030258,GO:0031406,GO:0031907,GO:0031974,GO:0032787,GO:0033036,GO:0033293,GO:0033365,GO:0033539,GO:0033540,GO:0034440,GO:0034613,GO:0036094,GO:0042579,GO:0042886,GO:0043167,GO:0043168,GO:0043177,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0043233,GO:0043436,GO:0043574,GO:0044237,GO:0044238,GO:0044242,GO:0044248,GO:0044255,GO:0044281,GO:0044282,GO:0044422,GO:0044424,GO:0044438,GO:0044439,GO:0044444,GO:0044446,GO:0044464,GO:0045184,GO:0046395,GO:0046907,GO:0048037,GO:0050660,GO:0050662,GO:0051179,GO:0051234,GO:0051641,GO:0051649,GO:0055114,GO:0070013,GO:0070727,GO:0071702,GO:0071704,GO:0071705,GO:0071840,GO:0072329,GO:0072594,GO:0072662,GO:0072663,GO:0097159,GO:1901265,GO:1901363,GO:1901575 1.3.3.6 ko:K00232 ko00071,ko00592,ko01040,ko01100,ko01110,ko01212,ko03320,ko04024,ko04146,map00071,map00592,map01040,map01100,map01110,map01212,map03320,map04024,map04146 M00087,M00113 R01175,R01279,R03777,R03857,R03990,R04751,R04754,R07888,R07892,R07896,R07934,R07950 RC00052,RC00076 ko00000,ko00001,ko00002,ko01000 - - - ACOX,Acyl-CoA_dh_1,Acyl-CoA_dh_M,Acyl-CoA_ox_N FUN_000017-T1 69293.ENSGACP00000000988 9.96e-77 247.0 
COG1960@1|root,KOG0135@2759|Eukaryota,39M88@33154|Opisthokonta,3B9Y8@33208|Metazoa,3CV4R@33213|Bilateria,483B5@7711|Chordata,4915E@7742|Vertebrata,49RG3@7898|Actinopterygii 33208|Metazoa IQ acyl-coenzyme A oxidase 3 ACOX3 GO:0000166,GO:0003674,GO:0003824,GO:0003995,GO:0003997,GO:0005102,GO:0005488,GO:0005504,GO:0005515,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005777,GO:0005782,GO:0005829,GO:0006082,GO:0006605,GO:0006625,GO:0006629,GO:0006631,GO:0006635,GO:0006810,GO:0006886,GO:0006996,GO:0007031,GO:0008104,GO:0008150,GO:0008152,GO:0008289,GO:0009056,GO:0009062,GO:0009987,GO:0015031,GO:0015833,GO:0016042,GO:0016043,GO:0016054,GO:0016402,GO:0016491,GO:0016627,GO:0016634,GO:0019395,GO:0019752,GO:0030258,GO:0031406,GO:0031907,GO:0031974,GO:0032787,GO:0033036,GO:0033293,GO:0033365,GO:0033539,GO:0033540,GO:0034440,GO:0034613,GO:0036094,GO:0042579,GO:0042886,GO:0043167,GO:0043168,GO:0043177,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0043233,GO:0043436,GO:0043574,GO:0044237,GO:0044238,GO:0044242,GO:0044248,GO:0044255,GO:0044281,GO:0044282,GO:0044422,GO:0044424,GO:0044438,GO:0044439,GO:0044444,GO:0044446,GO:0044464,GO:0045184,GO:0046395,GO:0046907,GO:0048037,GO:0050660,GO:0050662,GO:0051179,GO:0051234,GO:0051641,GO:0051649,GO:0055114,GO:0070013,GO:0070727,GO:0071702,GO:0071704,GO:0071705,GO:0071840,GO:0072329,GO:0072594,GO:0072662,GO:0072663,GO:0097159,GO:1901265,GO:1901363,GO:1901575 1.3.3.6 ko:K00232 ko00071,ko00592,ko01040,ko01100,ko01110,ko01212,ko03320,ko04024,ko04146,map00071,map00592,map01040,map01100,map01110,map01212,map03320,map04024,map04146 M00087,M00113 R01175,R01279,R03777,R03857,R03990,R04751,R04754,R07888,R07892,R07896,R07934,R07950 RC00052,RC00076 ko00000,ko00001,ko00002,ko01000 - - - ACOX,Acyl-CoA_dh_1,Acyl-CoA_dh_M,Acyl-CoA_ox_N FUN_000020-T1 144197.XP_008286603.1 1.57e-161 474.0 
COG1960@1|root,KOG0135@2759|Eukaryota,39M88@33154|Opisthokonta,3B9Y8@33208|Metazoa,3CV4R@33213|Bilateria,483B5@7711|Chordata,4915E@7742|Vertebrata,49RG3@7898|Actinopterygii 33208|Metazoa IQ acyl-coenzyme A oxidase 3 ACOX3 GO:0000166,GO:0003674,GO:0003824,GO:0003995,GO:0003997,GO:0005102,GO:0005488,GO:0005504,GO:0005515,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005777,GO:0005782,GO:0005829,GO:0006082,GO:0006605,GO:0006625,GO:0006629,GO:0006631,GO:0006635,GO:0006810,GO:0006886,GO:0006996,GO:0007031,GO:0008104,GO:0008150,GO:0008152,GO:0008289,GO:0009056,GO:0009062,GO:0009987,GO:0015031,GO:0015833,GO:0016042,GO:0016043,GO:0016054,GO:0016402,GO:0016491,GO:0016627,GO:0016634,GO:0019395,GO:0019752,GO:0030258,GO:0031406,GO:0031907,GO:0031974,GO:0032787,GO:0033036,GO:0033293,GO:0033365,GO:0033539,GO:0033540,GO:0034440,GO:0034613,GO:0036094,GO:0042579,GO:0042886,GO:0043167,GO:0043168,GO:0043177,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0043233,GO:0043436,GO:0043574,GO:0044237,GO:0044238,GO:0044242,GO:0044248,GO:0044255,GO:0044281,GO:0044282,GO:0044422,GO:0044424,GO:0044438,GO:0044439,GO:0044444,GO:0044446,GO:0044464,GO:0045184,GO:0046395,GO:0046907,GO:0048037,GO:0050660,GO:0050662,GO:0051179,GO:0051234,GO:0051641,GO:0051649,GO:0055114,GO:0070013,GO:0070727,GO:0071702,GO:0071704,GO:0071705,GO:0071840,GO:0072329,GO:0072594,GO:0072662,GO:0072663,GO:0097159,GO:1901265,GO:1901363,GO:1901575 1.3.3.ko:K00232 ko00071,ko00592,ko01040,ko01100,ko01110,ko01212,ko03320,ko04024,ko04146,map00071,map00592,map01040,map01100,map01110,map01212,map03320,map04024,map04146 M00087,M00113 R01175,R01279,R03777,R03857,R03990,R04751,R04754,R07888,R07892,R07896,R07934,R07950 RC00052,RC00076 ko00000,ko00001,ko00002,ko01000 - - - ACOX,Acyl-CoA_dh_1,Acyl-CoA_dh_M,Acyl-CoA_ox_N FUN_000021-T1 69293.ENSGACP00000000988 7.73e-174 504.0 
COG1960@1|root,KOG0135@2759|Eukaryota,39M88@33154|Opisthokonta,3B9Y8@33208|Metazoa,3CV4R@33213|Bilateria,483B5@7711|Chordata,4915E@7742|Vertebrata,49RG3@7898|Actinopterygii 33208|Metazoa IQ acyl-coenzyme A oxidase 3 ACOX3 GO:0000166,GO:0003674,GO:0003824,GO:0003995,GO:0003997,GO:0005102,GO:0005488,GO:0005504,GO:0005515,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005777,GO:0005782,GO:0005829,GO:0006082,GO:0006605,GO:0006625,GO:0006629,GO:0006631,GO:0006635,GO:0006810,GO:0006886,GO:0006996,GO:0007031,GO:0008104,GO:0008150,GO:0008152,GO:0008289,GO:0009056,GO:0009062,GO:0009987,GO:0015031,GO:0015833,GO:0016042,GO:0016043,GO:0016054,GO:0016402,GO:0016491,GO:0016627,GO:0016634,GO:0019395,GO:0019752,GO:0030258,GO:0031406,GO:0031907,GO:0031974,GO:0032787,GO:0033036,GO:0033293,GO:0033365,GO:0033539,GO:0033540,GO:0034440,GO:0034613,GO:0036094,GO:0042579,GO:0042886,GO:0043167,GO:0043168,GO:0043177,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0043233,GO:0043436,GO:0043574,GO:0044237,GO:0044238,GO:0044242,GO:0044248,GO:0044255,GO:0044281,GO:0044282,GO:0044422,GO:0044424,GO:0044438,GO:0044439,GO:0044444,GO:0044446,GO:0044464,GO:0045184,GO:0046395,GO:0046907,GO:0048037,GO:0050660,GO:0050662,GO:0051179,GO:0051234,GO:0051641,GO:0051649,GO:0055114,GO:0070013,GO:0070727,GO:0071702,GO:0071704,GO:0071705,GO:0071840,GO:0072329,GO:0072594,GO:0072662,GO:0072663,GO:0097159,GO:1901265,GO:1901363,GO:1901575 1.3.3.6 ko:K00232 ko00071,ko00592,ko01040,ko01100,ko01110,ko01212,ko03320,ko04024,ko04146,map00071,map00592,map01040,map01100,map01110,map01212,map03320,map04024,map04146 M00087,M00113 R01175,R01279,R03777,R03857,R03990,R04751,R04754,R07888,R07892,R07896,R07934,R07950 RC00052,RC00076 ko00000,ko00001,ko00002,ko01000 - - - ACOX,Acyl-CoA_dh_1,Acyl-CoA_dh_M,Acyl-CoA_ox_N FUN_000022-T1 69293.ENSGACP00000000988 3.76e-148 439.0 
COG1960@1|root,KOG0135@2759|Eukaryota,39M88@33154|Opisthokonta,3B9Y8@33208|Metazoa,3CV4R@33213|Bilateria,483B5@7711|Chordata,4915E@7742|Vertebrata,49RG3@7898|Actinopterygii 33208|Metazoa IQ acyl-coenzyme A oxidase 3 ACOX3 GO:0000166,GO:0003674,GO:0003824,GO:0003995,GO:0003997,GO:0005102,GO:0005488,GO:0005504,GO:0005515,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005777,GO:0005782,GO:0005829,GO:0006082,GO:0006605,GO:0006625,GO:0006629,GO:0006631,GO:0006635,GO:0006810,GO:0006886,GO:0006996,GO:0007031,GO:0008104,GO:0008150,GO:0008152,GO:0008289,GO:0009056,GO:0009062,GO:0009987,GO:0015031,GO:0015833,GO:0016042,GO:0016043,GO:0016054,GO:0016402,GO:0016491,GO:0016627,GO:0016634,GO:0019395,GO:0019752,GO:0030258,GO:0031406,GO:0031907,GO:0031974,GO:0032787,GO:0033036,GO:0033293,GO:0033365,GO:0033539,GO:0033540,GO:0034440,GO:0034613,GO:0036094,GO:0042579,GO:0042886,GO:0043167,GO:0043168,GO:0043177,GO:0043226,GO:0043227,GO:0043229,GO:0043231,GO:0043233,GO:0043436,GO:0043574,GO:0044237,GO:0044238,GO:0044242,GO:0044248,GO:0044255,GO:0044281,GO:0044282,GO:0044422,GO:0044424,GO:0044438,GO:0044439,GO:0044444,GO:0044446,GO:0044464,GO:0045184,GO:0046395,GO:0046907,GO:0048037,GO:0050660,GO:0050662,GO:0051179,GO:0051234,GO:0051641,GO:0051649,GO:0055114,GO:0070013,GO:0070727,GO:0071702,GO:0071704,GO:0071705,GO:0071840,GO:0072329,GO:0072594,GO:0072662,GO:0072663,GO:0097159,GO:1901265,GO:1901363,GO:1901575 1.3.3.6 ko:K00232 ko00071,ko00592,ko01040,ko01100,ko01110,ko01212,ko03320,ko04024,ko04146,map00071,map00592,map01040,map01100,map01110,map01212,map03320,map04024,map04146 M00087,M00113 R01175,R01279,R03777,R03857,R03990,R04751,R04754,R07888,R07892,R07896,R07934,R07950 RC00052,RC00076 ko00000,ko00001,ko00002,ko01000 - - - ACOX,Acyl-CoA_dh_1,Acyl-CoA_dh_M,Acyl-CoA_ox_N FUN_000023-T1 244447.XP_008330185.1 0.000599 45.4 KOG4282@1|root,KOG4282@2759|Eukaryota 2759|Eukaryota S transcription regulator activity - - - ko:K22255 - ko00000,ko04131 - - - Myb_DNA-bind_4 FUN_000024-T1 
215358.XP_010753035.1 6.33e-229 632.0 28PIY@1|root,2QW73@2759|Eukaryota,39TRK@33154|Opisthokonta,3BDAX@33208|Metazoa,3D1FS@33213|Bilateria,48A3I@7711|Chordata,490RX@7742|Vertebrata,49RNJ@7898|Actinopterygii 33208|Metazoa S N-acetyltransferase 16 NAT16 GO:0003674,GO:0003824,GO:0006464,GO:0006473,GO:0006807,GO:0008080,GO:0008150,GO:0008152,GO:0009987,GO:0016407,GO:0016410,GO:0016740,GO:0016746,GO:0016747,GO:0019538,GO:0036211,GO:0043170,GO:0043412,GO:0043543,GO:0044237,GO:0044238,GO:0044260,GO:0044267,GO:0047981,GO:0071704,GO:1901564 - - - - - - - - Acetyltransf_1
FUN_000029-T1 144197.XP_008286600.1 0.0 1017.0 KOG4471@1|root,KOG4471@2759|Eukaryota,3AG74@33154|Opisthokonta,3B9A1@33208|Metazoa,3CSTK@33213|Bilateria,486R0@7711|Chordata,48VXK@7742|Vertebrata,49WXX@7898|Actinopterygii 33208|Metazoa IU Belongs to the protein-tyrosine phosphatase family. Non-receptor class myotubularin subfamily MTM1 GO:0000278,GO:0001726,GO:0003674,GO:0003824,GO:0004438,GO:0004721,GO:0005488,GO:0005515,GO:0005543,GO:0005546,GO:0005575,GO:0005622,GO:0005623,GO:0005737,GO:0005768,GO:0005770,GO:0005829,GO:0005886,GO:0005938,GO:0006464,GO:0006470,GO:0006629,GO:0006644,GO:0006650,GO:0006661,GO:0006793,GO:0006796,GO:0006807,GO:0006810,GO:0006950,GO:0006996,GO:0007005,GO:0007010,GO:0007034,GO:0007041,GO:0007049,GO:0007059,GO:0008138,GO:0008150,GO:0008152,GO:0008289,GO:0008333,GO:0008610,GO:0008654,GO:0009058,GO:0009611,GO:0009653,GO:0009892,GO:0009894,GO:0009895,GO:0009966,GO:0009968,GO:0009987,GO:0010506,GO:0010507,GO:0010605,GO:0010639,GO:0010646,GO:0010648,GO:0012505,GO:0016020,GO:0016043,GO:0016192,GO:0016197,GO:0016202,GO:0016241,GO:0016242,GO:0016311,GO:0016787,GO:0016788,GO:0016791,GO:0019215,GO:0019222,GO:0019538,GO:0019637,GO:0019725,GO:0022607,GO:0023051,GO:0023057,GO:0030016,GO:0030017,GO:0030029,GO:0030030,GO:0030031,GO:0030036,GO:0030100,GO:0030162,GO:0030175,GO:0030258,GO:0030865,GO:0030866,GO:0031252,GO:0031323,GO:0031324,GO:0031329,GO:0031330,GO:0031410,GO:0031674,GO:0031982,GO:0032006,GO:0032007,GO:0032268,GO:0032269,GO:0032434,GO:0032435,GO:0032456,GO:0032502,GO:0032879,GO:0032989,GO:0032990,GO:0033043,GO:0034593,GO:0035091,GO:0036211,GO:0040008,GO:0042176,GO:0042177,GO:0042578,GO:0042592,GO:0042995,GO:0043167,GO:0043168,GO:0043170,GO:0043226,GO:0043227,GO:0043228,GO:0043229,GO:0043232,GO:0043292,GO:0043412,GO:0044085,GO:0044087,GO:0044088,GO:0044237,GO:0044238,GO:0044249,GO:0044255,GO:0044260,GO:0044267,GO:0044422,GO:0044424,GO:0044444,GO:0044448,GO:0044449,GO:0044464,GO:0044877,GO:0045017,GO:0045103,GO:0045104,GO:0045109,GO:0045177,GO:0045179,GO:0045806,GO:0045844,GO:0045861,GO:0045927,GO:0046474,GO:0046486,GO:0046488,GO:0046716,GO:0046839,GO:0046856,GO:0046907,GO:0048311,GO:0048518,GO:0048519,GO:0048523,GO:0048583,GO:0048585,GO:0048631,GO:0048633,GO:0048634,GO:0048636,GO:0048638,GO:0048639,GO:0048641,GO:0048643,GO:0048856,GO:0048869,GO:0050764,GO:0050765,GO:0050789,GO:0050793,GO:0050794,GO:0050896,GO:0051049,GO:0051051,GO:0051094,GO:0051128,GO:0051129,GO:0051171,GO:0051172,GO:0051179,GO:0051234,GO:0051239,GO:0051240,GO:0051246,GO:0051248,GO:0051640,GO:0051641,GO:0051646,GO:0051649,GO:0051896,GO:0051898,GO:0052629,GO:0052744,GO:0052866,GO:0060099,GO:0060101,GO:0060249,GO:0060255,GO:0060627,GO:0061136,GO:0065007,GO:0065008,GO:0070584,GO:0071704,GO:0071840,GO:0071944,GO:0080090,GO:0090407,GO:0097435,GO:0097708,GO:0098858,GO:0099080,GO:0099081,GO:0099512,GO:0099568,GO:0099738,GO:0106018,GO:0120025,GO:0140096,GO:1901074,GO:1901075,GO:1901564,GO:1901576,GO:1901799,GO:1901861,GO:1901863,GO:1901981,GO:1902115,GO:1902116,GO:1902531,GO:1902532,GO:1902902,GO:1902936,GO:1903050,GO:1903051,GO:1903362,GO:1903363,GO:1905153,GO:1905154,GO:2000026,GO:2000058,GO:2000059,GO:2000425,GO:2000426,GO:2000785 3.1.3.64,3.1.3.95 ko:K01108,ko:K18081 ko00562,ko01100,ko04070,map00562,map01100,map04070 - R03330,R03363,R06875 RC00078 ko00000,ko00001,ko01000,ko01009 - - - GRAM,Myotub-related

nextgenusfs commented 1 year ago

So I think this might have just been a bug in the parsing of the versions -- can you install from master and see if that fixes it?
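
To illustrate how version parsing can go wrong with a release like 2.1.10 (a hypothetical sketch, not the code in funannotate): comparing version strings as text orders 2.1.10 before 2.1.9. Parsing the `emapper-X.Y.Z` header line into a tuple of integers avoids that. `parse_emapper_version` is an illustrative name.

```python
import re


def parse_emapper_version(line):
    """Return (major, minor, patch) from a line such as '## emapper-2.1.10'.

    Returns None when no version of that shape is present.
    """
    match = re.search(r"emapper-(\d+)\.(\d+)\.(\d+)", line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


v_new = parse_emapper_version("## emapper-2.1.10")
v_old = parse_emapper_version("## emapper-2.1.9")
print(v_new, v_old)
print(v_new > v_old)        # integer tuples compare numerically
print("2.1.10" > "2.1.9")   # plain string comparison gives the opposite answer
```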

CesarBertinetti commented 1 year ago

Thanks. I went ahead and updated the code from master:

python -m pip install git+https://github.com/nextgenusfs/funannotate.git

and then ran the same command as described above (1st comment).

This time the annotate command did not complete successfully, and eggnog.annotations.txt is still empty.

The error I got is:

Start_time = Sun Apr 2 10:51:58 EDT 2023
[Apr 02 10:52 AM]: OS: Red Hat Enterprise Linux 8.7, 24 cores, ~ 263 GB RAM. Python: 3.8.15
[Apr 02 10:52 AM]: Running 1.8.14
[Apr 02 10:52 AM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Apr 02 10:52 AM]: Parsing input files
[Apr 02 10:52 AM]: Existing tbl found: /tmp/225474.1.long/predict_results/bluegill1.tbl
[Apr 02 10:56 AM]: Adding Functional Annotation to bluegill, NCBI accession: None
[Apr 02 10:56 AM]: Annotation consists of: 65,119 gene models
[Apr 02 10:56 AM]: 62,703 protein records loaded
[Apr 02 10:56 AM]: Running HMMer search of PFAM version 35.0
[Apr 02 11:10 AM]: 52,048 annotations added
[Apr 02 11:10 AM]: Running Diamond blastp search of UniProt DB version 2022_05
[Apr 02 11:11 AM]: 10,647 valid gene/product annotations from 13,829 total
[Apr 02 11:11 AM]: Running Eggnog-mapper
[Apr 02 12:04 PM]: Parsing EggNog Annotations
[Apr 02 12:04 PM]: EggNog version parsed as 2.1.1

Traceback (most recent call last):
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/bin/funannotate", line 8, in <module>
    sys.exit(main())
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/lib/python3.8/site-packages/funannotate/annotate.py", line 837, in main
    EggNog = parseEggNoggMapper(eggnog_result, eggnog_out, GeneProducts)
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/lib/python3.8/site-packages/funannotate/annotate.py", line 275, in parseEggNoggMapper
    OGs = cols[DBi].split(',')
TypeError: list indices must be integers or slices, not NoneType
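
The `TypeError` at the end of the traceback is what Python raises when a list is indexed with `None`. One way this can happen (an assumption about the mechanism, not a reading of funannotate's source) is when a column index is looked up by version and no entry matches, so the index variable is never set. In the sketch below, `OG_COLUMN_BY_VERSION` and `og_column` are made-up names; the index 4 is the position of `eggNOG_OGs` in the header posted above.

```python
# Hypothetical mapping from emapper version to the 0-based eggNOG_OGs column.
OG_COLUMN_BY_VERSION = {(2, 1, 9): 4, (2, 1, 10): 4}


def og_column(version):
    """Return the column index for a known version, or raise a clear error."""
    index = OG_COLUMN_BY_VERSION.get(version)  # None when the version is unknown
    if index is None:
        raise ValueError(f"unsupported emapper version: {version}")
    return index


# First five fields of one record from this thread (OG list shortened).
cols = [
    "FUN_000024-T1",
    "215358.XP_010753035.1",
    "6.33e-229",
    "632.0",
    "28PIY@1|root,2QW73@2759|Eukaryota",
]

# Without a guard, an unmatched version reproduces the error from the log.
try:
    cols[OG_COLUMN_BY_VERSION.get((2, 1, 1))]
except TypeError as err:
    print(err)

# With the guard, a known version works and an unknown one fails by name.
print(cols[og_column((2, 1, 10))].split(","))
```

Failing early with the version in the message would point at the version lookup rather than at the line that happens to use the index.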

nextgenusfs commented 1 year ago

Can you add --upgrade --force --no-deps to the pip command to make sure it updates? On some systems it won't update because the version isn't bumped.

CesarBertinetti commented 1 year ago

Sorry, same error...

[Apr 03 11:19 AM]: Running Eggnog-mapper
[Apr 03 12:16 PM]: Parsing EggNog Annotations
[Apr 03 12:16 PM]: EggNog version parsed as 2.1.1

Traceback (most recent call last):
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/bin/funannotate", line 8, in <module>
    sys.exit(main())
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/lib/python3.8/site-packages/funannotate/annotate.py", line 1055, in main
    EggNog = parseEggNoggMapper(eggnog_result, eggnog_out, GeneProducts)
  File "/afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/lib/python3.8/site-packages/funannotate/annotate.py", line 367, in parseEggNoggMapper
    OGs = cols[DBi].split(",")
TypeError: list indices must be integers or slices, not NoneType

nextgenusfs commented 1 year ago

Okay thanks. Will try to find some time to figure out what the next problem is...

shenyu10 commented 1 year ago

Thanks for your great work. Which version of EggNOG-mapper should I use so that the code runs correctly without further changes? I looked at the documentation (Installation - Dependencies), but found only "Checking dependencies for funannotate v1.4.0". I would appreciate your help.

shenyu10 commented 1 year ago

OK. It seems EggNOG-mapper 2.1.9 works with funannotate 1.8.14. Thanks again for your great work!

CesarBertinetti commented 1 year ago

I tried with that version and it worked too. Many thanks!

nextgenusfs commented 1 year ago

@CesarBertinetti are you able to post the same records from v2.1.9 versus 2.1.10 so we can identify the change in format?

CesarBertinetti commented 1 year ago

Sure! eggnog.emapper.annotations-10records-2.1.9.txt

nextgenusfs commented 1 year ago

Thanks @CesarBertinetti, but I guess whatever is/was causing the error must be a specific record, as these data are identical for the first 10 records:

$ diff eggnog.emapper.annotations-10records.txt eggnog.emapper.annotations-10records-2.1.9.txt 
1,3c1,3
< ## Tue Mar 28 11:30:07 2023
< ## emapper-2.1.10
< ## /afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/bin/emapper.py -m diamond -i /tmp/219324.1.long/annotate_misc/genome.proteins.fasta -o eggnog --cpu 10 --scratch_dir /tmp/emapper-43affc2b --temp_dir /tmp --dbmem
---
> ## Tue Apr  4 12:17:10 2023
> ## emapper-2.1.9
> ## /afs/crc.nd.edu/user/c/cbertine/.conda/envs/funannotate/bin/emapper.py -m diamond -i /tmp/230065.1.long/annotate_misc/genome.proteins.fasta -o eggnog --cpu 8 --scratch_dir /tmp/emapper-51bc5c17 --temp_dir /tmp --dbmem
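
Since the first 10 records are identical between the two versions, one possible next step is to scan the full annotations file for records whose field count deviates from the header. The sketch below is one hypothetical way to do that; `find_malformed_records` is an illustrative name, the truncated second row in the sample is invented for the example, and the code assumes a tab-separated file with `##` metadata lines followed by a single header line.

```python
import io


def find_malformed_records(handle):
    """Return (line_number, query_id, n_columns) for rows that differ from the header width."""
    expected = None
    problems = []
    for number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and '##' metadata lines
        cols = line.split("\t")
        if expected is None:
            expected = len(cols)  # first remaining line is taken as the header
            continue
        if len(cols) != expected:
            problems.append((number, cols[0], len(cols)))
    return problems


sample = io.StringIO(
    "## emapper-2.1.10\n"
    "#query\tseed_ortholog\tevalue\n"
    "FUN_000008-T1\t215358.XP_010754240.1\t1.73e-117\n"
    "FUN_000010-T1\t69293.ENSGACP00000000988\n"
)
malformed = find_malformed_records(sample)
print(malformed)
```

On a real run, the same function could be given an open handle to the emapper annotations file in the annotate_misc directory; the exact path depends on the run and is not assumed here.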