Assign proteins to orthologous groups (eggNOG 5) on CPUs or GPUs with deep networks. DeepNOG is much faster than alignment-based methods while providing accuracy similar to HMMER.
The easiest way to install DeepNOG is to obtain it from PyPI:
pip install deepnog
Alternatively, you can clone or download the bleeding-edge version from GitHub and run
pip install /path/to/DeepNOG
If you plan to extend DeepNOG as a developer, run
pip install -e /path/to/DeepNOG
instead.
deepnog can also be installed from bioconda like this:
conda install -c bioconda deepnog
Call the deepnog command line tool with a protein sequence file in FASTA format.
Example usages:
deepnog infer proteins.faa
deepnog infer proteins.faa --out prediction.csv
deepnog infer proteins.faa -db eggNOG5 -t 1236 -V 3 -c 0.99
deepnog train train.fa val.fa train.csv val.csv -a deepnog -e 15 --shuffle -r 123 -db eggNOG5 -t 3 -o /path/to/outdir
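When inference is launched from a script rather than typed by hand, the invocations above can be assembled programmatically. The following is a minimal sketch, not part of DeepNOG itself: build_infer_command and run_infer are hypothetical helper names, and only flags that appear in the examples above are used. Reading -t as the taxonomic level and -c as a confidence threshold is an assumption; deepnog infer --help is the authoritative description.

```python
import shutil
import subprocess


def build_infer_command(fasta, out=None, database=None, tax=None, confidence=None):
    """Assemble a `deepnog infer` argument list (hypothetical helper).

    Only flags shown in the example usages are emitted; their meaning
    (-t taxonomic level, -c confidence threshold) is an assumption.
    """
    cmd = ["deepnog", "infer", str(fasta)]
    if out is not None:
        cmd += ["--out", str(out)]
    if database is not None:
        cmd += ["-db", str(database)]
    if tax is not None:
        cmd += ["-t", str(tax)]
    if confidence is not None:
        cmd += ["-c", str(confidence)]
    return cmd


def run_infer(fasta, **options):
    """Run the assembled command, failing early if deepnog is not installed."""
    if shutil.which("deepnog") is None:
        raise RuntimeError("deepnog executable not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(build_infer_command(fasta, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names. For example, run_infer("proteins.faa", out="prediction.csv") corresponds to the second example usage.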
The individual models for OG predictions are not stored on GitHub or PyPI, because they exceed file size limitations (up to 200M). deepnog automatically downloads the models and puts them into a cache directory (default ~/deepnog_data/). You can change this directory by setting the DEEPNOG_DATA environment variable.
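To check which cache directory applies in a given environment, the rule described above can be mirrored in a few lines. This is a sketch of the documented behavior, not DeepNOG's own code: deepnog_data_dir is a hypothetical helper, and treating an empty DEEPNOG_DATA value as unset is an assumption.

```python
import os
from pathlib import Path


def deepnog_data_dir(environ=None):
    """Return the model cache directory implied by DEEPNOG_DATA (hypothetical helper)."""
    env = os.environ if environ is None else environ
    custom = env.get("DEEPNOG_DATA")
    if custom:
        # A user-supplied location takes precedence over the default.
        return Path(custom).expanduser()
    # Documented default: ~/deepnog_data/
    return Path.home() / "deepnog_data"
```

Calling deepnog_data_dir() with no argument inspects the current process environment; passing a dictionary makes the lookup easy to test.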
For help and advanced options, call deepnog --help. For options specific to inference or training, call deepnog infer --help or deepnog train --help, respectively.
See also the user & developer guide.
Preferred: FASTA (raw, .gz, or .xz)
DeepNOG supports protein sequences stored in any of the file formats listed at https://biopython.org/wiki/SeqIO, but has been tested with the FASTA format only.
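To inspect an input file before running inference, raw and compressed FASTA files can be read with the standard library alone. The sketch below is for illustration only and is not how DeepNOG parses its input (the text above points to Biopython's SeqIO for that). open_fasta and read_fasta are hypothetical helpers, and choosing the decompressor by file extension is an assumption.

```python
import gzip
import lzma
from pathlib import Path


def open_fasta(path):
    """Open a FASTA file in text mode, decompressing .gz or .xz by extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".gz":
        return gzip.open(path, "rt")
    if suffix == ".xz":
        return lzma.open(path, "rt")
    return open(path, "r")


def read_fasta(path):
    """Yield (identifier, sequence) pairs from a FASTA file."""
    identifier, chunks = None, []
    with open_fasta(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if identifier is not None:
                    yield identifier, "".join(chunks)
                header = line[1:].split(maxsplit=1)
                identifier = header[0] if header else ""
                chunks = []
            elif identifier is not None:
                # Sequence lines may be wrapped; collect and join them.
                chunks.append(line)
        if identifier is not None:
            yield identifier, "".join(chunks)
```

For example, sum(1 for _ in read_fasta("proteins.faa.gz")) counts the sequences in a gzip-compressed file without unpacking it to disk.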
deepnog builds upon the following packages:
See also requirements/*.txt for platform-specific recommendations (sometimes, specific versions might be required due to platform-specific bugs in the deepnog requirements).
This research is supported by the Austrian Science Fund (FWF): P27703, P31988; and by the GPU grant program of Nvidia corporation.
If you use DeepNOG, please consider citing our research article:
Roman Feldbauer, Lukas Gosch, Lukas Lüftinger, Patrick Hyden, Arthur Flexer, Thomas Rattei, DeepNOG: Fast and accurate protein orthologous group assignment, Bioinformatics, 2020, btaa1051, https://doi.org/10.1093/bioinformatics/btaa1051