Closed eyalbenda closed 7 years ago
Hi @eyalbenda. The documents are out of date, sorry. When I get the time I need to update the docs. What version of funannotate are you using?
The EggNog databases have been dropped in the most current version of funannotate in favor of eggnog-mapper, which is here. When I first started writing this there wasn't a method to map proteins to the EggNog database other than just doing an HMM search, so I had incorporated it into funannotate. However, the EggNog developers now have a nice tool to query their database. funannotate annotate can incorporate that data using the --eggnog flag.
Here is the help menu for funannotate annotate:
Usage:       funannotate annotate <arguments>
version:     0.7.2

Description: Script functionally annotates the results from funannotate predict. It pulls
             annotation from PFAM, InterPro, EggNog, UniProtKB, MEROPS, CAZyme, and GO ontology.

Required:    -i, --input        Folder from funannotate predict
          or
             --genbank          Genome in GenBank format
             -o, --out          Output folder for results
          or
             --gff              Genome GFF3 annotation file
             --fasta            Genome in multi-fasta format
             --proteins         Genome proteins in multi-fasta format
             -s, --species      Species name, use quotes for binomial, e.g. "Aspergillus fumigatus"
             -o, --out          Output folder for results

Optional:    --sbt              NCBI submission template file. (Recommended)
             --eggnog           Eggnog-mapper annotations file.
             --antismash        antiSMASH secondary metabolism results, GBK file.
             --iprscan          InterProScan XML file
             --phobius          Phobius pre-computed results.
             --isolate          Isolate name, e.g. Af293
             --strain           Strain name
             --busco_db         BUSCO models. Default: dikarya
             -t, --tbl2asn      Additional parameters for tbl2asn. Example: "-l paired-ends"
             --force            Force over-write of output folder
             --cpus             Number of CPUs to use. Default: 2

ENV Vars:    By default loaded from your $PATH, however you can specify at run-time if not in PATH
             --AUGUSTUS_CONFIG_PATH

Written by Jon Palmer (2016-2017) nextgenusfs@gmail.com
Thanks! I'm using the latest one, and indeed I see that annotate does have the eggnog option. So if I understand correctly, I would need to manually download the relevant eggnog database using the mapper you link to? Then, which file specifically do I point to using the --eggnog option?
You can run the eggnog mapper tool on your protein fasta file (use the HMM search). There is a script in the eggnog-mapper distribution for downloading/maintaining the databases, more information
Then you would run something like the following:
emapper.py -i proteins.fa --output result -d fuNOG --cpu 12
This will produce output files described here. You would then pass the file result.emapper.annotations to the --eggnog option.
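Putting the two steps together, a minimal sketch might look like the following. It only uses flags that appear in the help menu and the emapper.py command above; the folder name predict_out is a placeholder for wherever funannotate predict wrote its results, and the small run helper with its DRY_RUN switch is just an illustration device, not part of either tool. By default the script only prints the commands so they can be checked before anything is executed.

```shell
#!/bin/sh
# Sketch only: names below are placeholders, adjust them to your own run.
PROTEINS="proteins.fa"       # protein FASTA, as in the emapper.py example
PREFIX="result"              # emapper.py output prefix
PREDICT_DIR="predict_out"    # hypothetical folder produced by funannotate predict
CPUS=12

# Hypothetical helper: with DRY_RUN=1 (the default here) commands are
# printed instead of executed; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: map the proteins against the fungal EggNog database (HMM search).
run emapper.py -i "$PROTEINS" --output "$PREFIX" -d fuNOG --cpu "$CPUS"

# Step 2: hand the resulting annotations file to funannotate annotate.
run funannotate annotate -i "$PREDICT_DIR" \
    --eggnog "$PREFIX.emapper.annotations" --cpus "$CPUS"
```

Running it with DRY_RUN=0 assumes that emapper.py and funannotate are both on your PATH and that the fuNOG database has already been downloaded with the eggnog-mapper download script.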
Reading the FAQ about running funannotate on other organisms, I noticed that I should specify "--eggnog_db". I downloaded the relevant database using "funannotate eggnog", but how do I specify which db to use, and at which step (predict or annotate)?