eggnogdb / eggnog-mapper

Fast genome-wide functional annotation through orthology assignment
http://eggnog-mapper.embl.de
GNU Affero General Public License v3.0

the right way to use hmmer search #447

Closed Saber39 closed 1 year ago

Saber39 commented 1 year ago

Hi, I used download_eggnog_data.py to download the insect HMMER search files:

python download_eggnog_data.py -H -d 2759 -y --data_dir /home/data/eggnog/annotation/hmmer/50557

This gave me a folder named 50557. I then ran the following command:

emapper.py -i ../myprotein.fa -m hmmer --output 230306anno_test1 -d /home/data/eggnog/annotation/hmmer/50557 --usemem --cpu 60

and got the error "Database /home/data/eggnog/annotation/hmmer/50557 not found". Does this mean I entered the wrong command? How can I specify a particular database for annotation? Thanks.

Cantalapiedra commented 1 year ago

Hi @Saber39 ,

I don't know exactly where the hmmer database is located on your system, but if I recall correctly the --data_dir option should point to the base data directory for eggnog-mapper, not to the final hmmer path. So you should check whether the database actually ended up under /home/data/eggnog/annotation/hmmer/50557/hmmer/50557 or somewhere similar.

You can also check how --data_dir and the other options are used here: https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.10#user-content-Setup, or in the help output of the tool you are using (e.g. download_eggnog_data.py --help).
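The check suggested above can be sketched in Python. This is only a minimal sketch: it assumes the layout <data_dir>/hmmer/<db>/<db>.hmm.* that shows up in the directory listings further down this thread, and the helper names and the /home/data/eggnog base directory are hypothetical, not part of eggnog-mapper.

```python
from pathlib import Path

# Pressed-index suffixes seen in the directory listing later in this thread.
PRESSED_SUFFIXES = (".h3f", ".h3i", ".h3m", ".h3p")


def expected_hmmer_dir(data_dir, db_name):
    """Return <data_dir>/hmmer/<db_name>, the layout observed in this thread."""
    return Path(data_dir) / "hmmer" / db_name


def missing_hmmer_files(data_dir, db_name):
    """List the pressed HMM files that are absent under the expected directory."""
    db_dir = expected_hmmer_dir(data_dir, db_name)
    expected = [db_dir / f"{db_name}.hmm{suffix}" for suffix in PRESSED_SUFFIXES]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Hypothetical base directory; replace it with your own --data_dir value.
    for path in missing_hmmer_files("/home/data/eggnog", "50557"):
        print("missing:", path)
```

If the list comes back non-empty for the base directory but empty for a deeper one, the download most likely went into a nested hmmer/ folder.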

I hope this is of help.

Best, Carlos

Saber39 commented 1 year ago

Hi @Cantalapiedra , I appreciate your quick response.

I rechecked my download location and found that I had made a mistake; after fixing it, the search worked. However, I then ran into the same problem as in issue 390: --usemem does not seem to work.

emapper-2.1.3

emapper.py -i ../Locust.fa -m hmmer --output 230306anno_test11111 -d /home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmm --usemem --cpu 60

hmmer.py:search DB: /home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmm
ESC[32mPreparing to query custom database /home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmmESC[0m
hmmer.py:search DB: /home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmm, name /home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmm, path /home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmm, host localhost, port 51700, endport 53200, idmap /home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmm.idmap
create_servers: hmmdb:/home/data/t040403/eggnog/annotation/hmmer/50557/50557.hmm:localhost:51700-53200
Creating server number 1/1
ESC[1;34mLoading server at localhost, port 51700-51701ESC[0m
Creating hmmpgmd server at port 51700 ...
Creating hmmpgmd workers (1) at port 51701 ...

This process ran for over 12 hours without finishing. When I removed --usemem, emapper.py worked fine:

nohup emapper.py -i ../Locust.fa -m hmmer --output 230306anno_test1 -d ~/eggnog/annotation/hmmer/50557/50557.hmm --cpu 60 &

So, apart from --usemem, the program is working well.

Thanks again for your reply to me, it's really a great software!

Cantalapiedra commented 1 year ago

Hi @Saber39 ,

Which version of hmmer are you using? We usually have problems with versions which are not the ones installed by default with eggnog-mapper.

Thank you for your kind words.

Best, Carlos

Saber39 commented 1 year ago

Hi, @Cantalapiedra

I just installed the right version of eggnog-mapper (2.1.10) with pip install and ran download_eggnog_data.py to get the hmmer database. Everything works with the diamond search. However, the hmmer search still does not work with the following command:

nohup emapper.py -i ./hmmer/50557/my.pep.fa -m hmmer --output 230311anno_hmm1 -d ./hmmer/50557/50557.hmm --num_servers 60 --cpu 60 --override > 20230311test_hmm1.out 2>&1 &

Here is the .out file:

#  emapper-2.1.10
# emapper.py  -i ./hmmer/50557/my.pep.fa -m hmmer --output 230311anno_hmm1 -d ./hmmer/50557/50557.hmm --num_servers 60 --cpu 60 --override
hmmer.py:search DB: ./hmmer/50557/50557.hmm
ESC[32mPreparing to query custom database ./hmmer/50557/50557.hmmESC[0m
hmmer.py:search DB: ./hmmer/50557/50557.hmm, name ./hmmer/50557/50557.hmm, path ./hmmer/50557/50557.hmm, host None, port 51700, endport 53200, idmap None
ESC[32mSequence mapping starts now!ESC[0m
# /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/eggnogmapper/bin/hmmscan  --cpu 60 -o /dev/null --domtblout '/home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data/emappertmp_hmmcmd_ro7ilws4/tmplk965rjs' './hmmer/50557/50557.hmm' '/home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data/emappertmp_hmmcmd_ro7ilws4/tmpum6jdp2p'
ESC[1;34m Processed queries:17580 total_time:16590.666355371475 rate:1.06 q/sESC[0m
Could not find ./hmmer/50557/50557.hmm among eggnog databases. Skipping seed ortholog detection.
Could not find hits to annotate.
Metadata-Version: 2.1
Name: eggnog-mapper
Version: 2.1.10
Summary: Fast functional annotation of novel sequences using eggNOG orthology assignments.
Home-page: http://eggnog-mapper.embl.de
Author: Jaime Huerta-Cepas
Author-email: jhcepas@gmail.com
Maintainer: Jaime Huerta-Cepas
Maintainer-email: huerta@embl.de
License: GPLv3
Keywords: functional annotation,orthology,eggNOG
Platform: OS Independent
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License (GPL)
Classifier: Natural Language :: English
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: biopython (==1.76)
Requires-Dist: psutil (==5.7.0)
Requires-Dist: xlsxwriter (==1.4.3)

As for the hmmer version, I tried these commands but got errors:

(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data$ hmmsearch -h

Command 'hmmsearch' not found, but can be installed with:

apt install hmmer
Please ask your administrator.

(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data$ hmmpress -h

Command 'hmmpress' not found, but can be installed with:

apt install hmmer
Please ask your administrator.

(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data$ hmmer -h

Command 'hmmer' not found, did you mean:

  command 'phmmer' from deb hmmer (3.3+dfsg2-1)
  command 'nhmmer' from deb hmmer (3.3+dfsg2-1)

Try: apt install <deb name>

(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data$ hmmbuild

Command 'hmmbuild' not found, but can be installed with:

apt install hmmer
Please ask your administrator.

Looking forward to your reply

Cantalapiedra commented 1 year ago

Hi @Saber39 ,

The error says "Could not find ./hmmer/50557/50557.hmm among eggnog databases". Is the file really there? What are the contents of these directories?

Also, I am not really sure that you want to specify the path to the database, or just the tax ID. Check this please, https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.10#user-content-HMMER_search_options

The hmmer command it is trying to run is /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/eggnogmapper/bin/hmmscan. It seems to be the hmmer installed in your conda environment. You could check the version with something like conda list | grep hmmer.

Best, Carlos

Saber39 commented 1 year ago

> The error says Could not find ./hmmer/50557/50557.hmm among eggnog databases. Is it really there? Which are the contents of these directories?

Hi @Cantalapiedra ,

Thanks for your reply.

The contents of /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data are:

(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data$ ll
total 49748476
drwxrwxr-x  5 t040403 t040403        4096 3月  11 15:59 ./
drwxrwxr-x 26 t040403 t040403        4096 3月  10 15:37 ../
-rw-rw-r--  1 t040403 t040403       24897 3月  11 20:36 20230311test_hmm1.out
-rw-rw-r--  1 t040403 t040403     1313169 3月  11 20:36 230311anno_hmm1.emapper.hits
-rw-rw-r--  1 t040403 t040403 41370988544 3月   2  2021 eggnog.db
-rw-rw-r--  1 t040403 t040403  9285439161 3月   2  2021 eggnog_proteins.dmnd
-rw-r--r--  1 t040403 t040403   278003712 11月 11  2020 eggnog.taxa.db
-rw-r--r--  1 t040403 t040403     6628719 11月 11  2020 eggnog.taxa.db.traverse.pkl
drwx------  2 t040403 t040403        4096 3月  11 20:36 emappertmp_hmmcmd_ro7ilws4/
drwx------  2 t040403 t040403        4096 3月  11 15:59 emappertmp_phmmer_cuu9fdhe/
drwxrwxr-x  3 t040403 t040403        4096 3月  12 19:20 hmmer/
(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data$ tree -d hmmer/
hmmer/
└── 50557

1 directory

The two emappertmp folders, 20230311test_hmm1.out and 230311anno_hmm1.emapper.hits were created by the command nohup emapper.py -i ./hmmer/50557/my.pep.fa -m hmmer --output 230311anno_hmm1 -d ./hmmer/50557/50557.hmm --num_servers 60 --cpu 60 --override > 20230311test_hmm1.out 2>&1 &. I don't know if I understand correctly, but I think this folder should be the default eggnog database directory. The contents of ~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data/hmmer/50557 are fasta files and five 50557.hmm.* files:

...
-rw-rw-r-- 1 t040403 t040403        351 3月  10 15:53 3T1B4.fa
-rw-rw-r-- 1 t040403 t040403        366 3月  10 15:53 3T1B5.fa
-rw-rw-r-- 1 t040403 t040403       1859 3月  10 15:53 3T1B6.fa
-rw-rw-r-- 1 t040403 t040403        411 3月  10 15:53 3T1B7.fa
-rw-rw-r-- 1 t040403 t040403       1088 3月  10 15:53 3T1B8.fa
-rw-rw-r-- 1 t040403 t040403        224 3月  10 15:53 3T1B9.fa
-rw-rw-r-- 1 t040403 t040403        596 3月  10 15:53 3T1BA.fa
-rw-rw-r-- 1 t040403 t040403  895440096 3月  10 15:49 50557.hmm.h3f
-rw-rw-r-- 1 t040403 t040403     730216 3月  10 15:49 50557.hmm.h3i
-rw-rw-r-- 1 t040403 t040403 2342215380 3月  10 15:49 50557.hmm.h3m
-rw-rw-r-- 1 t040403 t040403 2748383232 3月  10 15:49 50557.hmm.h3p
-rw-rw-r-- 1 t040403 t040403     262686 3月  10 15:49 50557.hmm.idmap

I also tried putting all the eggnog files (eggnog.db, eggnog_proteins.dmnd, eggnog.taxa.db, eggnog.taxa.db.traverse.pkl) into the folder ./hmmer/50557/ and running the same command from that folder, but I still got the same result: "Could not find ./hmmer/50557/50557.hmm among eggnog databases. Skipping seed ortholog detection. Could not find hits to annotate."

> Also, I am not really sure that you want to specify the path to the database, or just the tax ID. Check this please, https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.10#user-content-HMMER_search_options

I didn't specify the path to the database; I downloaded the Insecta database with download_eggnog_data.py -H -d 2759. Here is the end of the download.out file:

ESC[32mDownloading "eggnog.db" at /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data...ESC[0m
ESC[36mcd /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data && wget -nH --user-agent=Mozilla/5.0 --relative --no-parent --reject "index.html*" --cut-dirs=4 -e robots=off -O eggnog.db.gz http://eggnogdb.embl.de/download/emapperdb-5.0.2/eggnog.db.gz && echo Decompressing... && gunzip eggnog.db.gz ESC[0m
ESC[32mDownloading "eggnog.taxa.db" at /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data...ESC[0m
ESC[36mcd /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data && wget -nH --user-agent=Mozilla/5.0 --relative --no-parent --reject "index.html*" --cut-dirs=4 -e robots=off -O eggnog.taxa.tar.gz http://eggnogdb.embl.de/download/emapperdb-5.0.2/eggnog.taxa.tar.gz && echo Decompressing... && tar -zxf eggnog.taxa.tar.gz && rm eggnog.taxa.tar.gzESC[0m
ESC[32mDownloading fasta files " at /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data...ESC[0m
ESC[36mcd /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data && wget -nH --user-agent=Mozilla/5.0 --relative --no-parent --reject "index.html*" --cut-dirs=4 -e robots=off -O eggnog_proteins.dmnd.gz http://eggnogdb.embl.de/download/emapperdb-5.0.2/eggnog_proteins.dmnd.gz && echo Decompressing... && gunzip eggnog_proteins.dmnd.gz ESC[0m
ESC[1;34mSkipping novel families diamond database (or already present). Use -F and -f to force downloadESC[0m
ESC[1;34mSkipping Pfam database (or already present). Use -P and -f to force downloadESC[0m
ESC[1;34mSkipping MMseqs2 database (or already present). Use -M and -f to force downloadESC[0m
ESC[1;34mHMMER database 50557 already present at /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data/hmmer/50557. Use "-f" to force downloadESC[0m

> The hmmer command is trying to run is /home/data/t040403/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/eggnogmapper/bin/hmmscan. It seems it is the hmmer installed in your conda environment. You could check the version with something like conda list | grep hmmer.

I checked the version with conda list | grep hmmer and got nothing. Then I ran plain conda list and got:

(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~$ conda list| grep hmmer
(testemapperinstall) t040403@vm-9c70-cfc9f5802384:~$ conda list
# packages in environment at /home/data/t040403/miniconda3/envs/testemapperinstall:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
biopython                 1.76                     pypi_0    pypi
ca-certificates           2023.01.10           h06a4308_0  
certifi                   2022.12.7        py37h06a4308_0  
eggnog-mapper             2.1.10                   pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.4.2                h6a678d5_6  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libstdcxx-ng              11.2.0               h1234567_1  
ncurses                   6.4                  h6a678d5_0  
numpy                     1.21.6                   pypi_0    pypi
openssl                   1.1.1t               h7f8727e_0  
pip                       22.3.1           py37h06a4308_0  
psutil                    5.7.0                    pypi_0    pypi
python                    3.7.16               h7a1cb2a_0  
readline                  8.2                  h5eee18b_0  
setuptools                65.6.3           py37h06a4308_0  
sqlite                    3.40.1               h5082296_0  
tk                        8.6.12               h1ccaba5_0  
wheel                     0.38.4           py37h06a4308_0  
xlsxwriter                1.4.3                    pypi_0    pypi
xz                        5.2.10               h5eee18b_1  
zlib                      1.2.13               h5eee18b_0

It looks like hmmer is not installed in my conda environment. But this is strange, because the hits file 230311anno_hmm1.emapper.hits was generated. Should I install hmmer manually?

Thanks.

Cantalapiedra commented 1 year ago

Hi,

Oh, you are right about the hmmer command. It is using the one bundled along with eggnog-mapper, so the version should be ok.
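One way to see which HMMER version the bundled binary reports is to run it with -h and read its banner. The sketch below assumes the binary lives at <package_dir>/bin/hmmscan, as in the log path quoted in this thread, and that the help text contains a line starting with "# HMMER <version>"; the function names are hypothetical helpers, not eggnog-mapper APIs.

```python
import re
import subprocess
from pathlib import Path


def bundled_binary(package_dir, name="hmmscan"):
    """Return <package_dir>/bin/<name>, the location shown in the log path."""
    return Path(package_dir) / "bin" / name


def parse_hmmer_version(help_text):
    """Extract the version from a banner line such as '# HMMER <version> (...)'."""
    match = re.search(r"^# HMMER (\S+)", help_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def bundled_hmmer_version(package_dir):
    """Run the bundled hmmscan with -h and return its reported version, if any."""
    binary = bundled_binary(package_dir)
    if not binary.is_file():
        return None  # nothing bundled at the assumed location
    result = subprocess.run([str(binary), "-h"], capture_output=True, text=True)
    return parse_hmmer_version(result.stdout)
```

Here package_dir would be the eggnogmapper directory inside site-packages of the active environment.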

Regarding the path to the DB: yes, you are specifying the path to the database, since you are writing -d ./hmmer/50557/50557.hmm. If I am not mistaken, -d 50557 should be enough.

Please, give that a try and tell me how it goes.
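The suggested change can be sketched as a small command builder. This is an illustrative sketch only: it uses flags that already appear in this thread (-i, -m, -d, --output, --cpu), the function names and the example file names are hypothetical, and it assumes the database is installed under the default data directory so that a bare name is enough.

```python
from pathlib import PurePosixPath


def db_name_from_path(db_path):
    """Turn './hmmer/50557/50557.hmm' into '50557' (file name without .hmm)."""
    name = PurePosixPath(db_path).name
    return name[:-len(".hmm")] if name.endswith(".hmm") else name


def build_emapper_cmd(fasta, db, output, cpu):
    """Assemble an emapper.py HMMER-mode command using flags from this thread."""
    return [
        "emapper.py",
        "-i", str(fasta),
        "-m", "hmmer",
        "-d", db_name_from_path(db),  # pass the name / tax ID, not the file path
        "--output", str(output),
        "--cpu", str(cpu),
    ]


if __name__ == "__main__":
    # Hypothetical input and output names, for illustration only.
    cmd = build_emapper_cmd("my.pep.fa", "./hmmer/50557/50557.hmm", "anno_hmm", 60)
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run or joined into a shell command line.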

Thank you for your patience.

Best, Carlos

Saber39 commented 1 year ago

> Regarding the path to the DB: yes, you are specifying the path to the database, since you are writing -d ./hmmer/50557/50557.hmm. If I am not mistaken, -d 50557 should be enough.

Hi,

I was wrong about the usage of -d. After switching to -d 50557, it works. I had always thought that -d needed the location of the .hmm file. Many thanks for your kind and warm help. ^_^

(base) t040403@vm-9c70-cfc9f5802384:~/miniconda3/envs/testemapperinstall/lib/python3.7/site-packages/data$ ll
total 49770524
drwxrwxr-x  5 t040403 t040403        4096 3月  13 01:14 ./
drwxrwxr-x 26 t040403 t040403        4096 3月  10 15:37 ../
-rw-rw-r--  1 t040403 t040403       24897 3月  11 20:36 20230311test_hmm1.out
-rw-rw-r--  1 t040403 t040403       55401 3月  13 01:14 20230312test_hmm1.out
-rw-rw-r--  1 t040403 t040403     1313169 3月  11 20:36 230311anno_hmm1.emapper.hits
-rw-rw-r--  1 t040403 t040403    20604233 3月  13 01:14 230312anno_hmm1.emapper.annotations
-rw-rw-r--  1 t040403 t040403     1313151 3月  13 01:10 230312anno_hmm1.emapper.hits
-rw-rw-r--  1 t040403 t040403      582847 3月  13 01:14 230312anno_hmm1.emapper.seed_orthologs
-rw-rw-r--  1 t040403 t040403 41370988544 3月   2  2021 eggnog.db
-rw-rw-r--  1 t040403 t040403  9285439161 3月   2  2021 eggnog_proteins.dmnd
-rw-r--r--  1 t040403 t040403   278003712 11月 11  2020 eggnog.taxa.db
-rw-r--r--  1 t040403 t040403     6628719 11月 11  2020 eggnog.taxa.db.traverse.pkl
drwx------  2 t040403 t040403        4096 3月  11 20:36 emappertmp_hmmcmd_ro7ilws4/
drwx------  2 t040403 t040403        4096 3月  11 15:59 emappertmp_phmmer_cuu9fdhe/
drwxrwxr-x  3 t040403 t040403        4096 3月  12 19:20 hmmer/

Cantalapiedra commented 1 year ago

Hi,

Yes, the usage of -d is a bit complex. If I am not mistaken, it can take a database name or tax ID installed under the EGGNOG_DATA_DIR; a database running in a hmmpgmd server, specified as db:host:port; or, in hmm_mapper, the path to a custom .hmm database.
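To make the three forms concrete, here is a rough classifier for a -d value. It is a heuristic sketch written for this thread, not the logic eggnog-mapper itself uses; the function name, the labels, and the example server name mydb are all made up for illustration.

```python
import re

# name:host:port, i.e. the db:host:port form for a database served by hmmpgmd.
SERVER_PATTERN = re.compile(r"^[^:/\s]+:[^:/\s]+:\d+$")


def classify_db_arg(value):
    """Guess which of the three -d forms a value looks like (heuristic only)."""
    if SERVER_PATTERN.match(value):
        return "server"
    if "/" in value or value.endswith(".hmm"):
        return "path"
    return "name_or_taxid"


if __name__ == "__main__":
    for example in ("50557", "./hmmer/50557/50557.hmm", "mydb:localhost:51700"):
        print(example, "->", classify_db_arg(example))
```

In this thread, the value ./hmmer/50557/50557.hmm falls in the "path" group, while 50557 falls in the "name_or_taxid" group, which is the form that worked for the downloaded database.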

Anyway, I am glad that it finally worked for you!

Best, Carlos