eggnogdb / eggnog-mapper

Fast genome-wide functional annotation through orthology assignment
http://eggnog-mapper.embl.de
GNU Affero General Public License v3.0

struct.error: unpack requires a buffer of 16 bytes #394

Open taniagmangolini opened 2 years ago

taniagmangolini commented 2 years ago

I am facing the error 'struct.error: unpack requires a buffer of 16 bytes' when running Eggnog-mapper (v2.1.7) with Hmmer 3.1b2.

Here is the command being used:

emapper.py -m hmmer --genepred prodigal --itype metagenome -d eggnog_db_v0.2 --dbtype hmmdb --usemem --num_servers 2 --evalue 1.0E-10 --data_dir /tmp/scratch/eggnog_db_v0.2 --dbmem --pfam_realign denovo -i /tmp/scratch/cromwell-prod/cromwell-execution/Virome/a78f78ec-8273-4089-9e62-606c6ad26112/call-FilterFasta/contigs_filtered.fasta -o 21027-28D_S13 --cpu 28 --excel

Here is part of the stack trace:

2022-06-29T10:43:14.615-03:00 Creating server number 1/2

2022-06-29T10:43:14.615-03:00 Loading server at localhost, port 51700-51701

2022-06-29T10:43:14.615-03:00 Creating hmmpgmd server at port 51700 ...

2022-06-29T10:43:14.620-03:00 Creating hmmpgmd workers (1) at port 51701 ...

2022-06-29T10:43:15.623-03:00 Waiting for server to become ready at localhost:51700 ...

2022-06-29T10:43:59.747-03:00 ........Server ready at localhost:51700

2022-06-29T10:43:59.747-03:00 Creating server number 2/2

2022-06-29T10:43:59.747-03:00 Loading server at localhost, port 51702-51703

2022-06-29T10:43:59.747-03:00 Creating hmmpgmd server at port 51702 ...

2022-06-29T10:43:59.749-03:00 Creating hmmpgmd workers (1) at port 51703 ...

2022-06-29T10:44:00.751-03:00 Waiting for server to become ready at localhost:51702 ...

2022-06-29T10:44:55.318-03:00 .........Server ready at localhost:51702

2022-06-29T10:44:55.318-03:00 Created 2 out of 2

2022-06-29T10:44:55.318-03:00 Sequence mapping starts now!

2022-06-29T10:45:08.394-03:00 Searching queries with a pool of 28 CPUs

2022-06-29T10:45:08.394-03:00 Parsing fasta file /tmp/scratch/emappertmp_prod_gxf3c75u/output.faa...

2022-06-29T10:45:08.394-03:00 Fasta file /tmp/scratch/emappertmp_prod_gxf3c75u/output.faa parsing complete.

2022-06-29T10:45:08.394-03:00 multiprocessing.pool.RemoteTraceback:

2022-06-29T10:45:08.394-03:00 """

2022-06-29T10:45:08.394-03:00 Traceback (most recent call last):

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/multiprocessing/pool.py", line 121, in worker

2022-06-29T10:45:08.394-03:00 result = (True, func(*args, **kwds))

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/search/hmmer/hmmer_search_hmmpgmd.py", line 64, in iter_seq

2022-06-29T10:45:08.394-03:00 fixed_Z=fixed_Z)

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/search/hmmer/hmmer_search_hmmpgmd.py", line 173, in scan_hits

2022-06-29T10:45:08.394-03:00 st, msg_len = struct.unpack("I 4x Q", status)

2022-06-29T10:45:08.394-03:00 struct.error: unpack requires a buffer of 16 bytes

2022-06-29T10:45:08.394-03:00 """

2022-06-29T10:45:08.394-03:00 The above exception was the direct cause of the following exception:

2022-06-29T10:45:08.394-03:00 Traceback (most recent call last):

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/bin/emapper.py", line 701, in <module>

2022-06-29T10:45:08.394-03:00 n, elapsed_time = emapper.run(args, args.input, args.annotate_hits_table, args.cache_file)

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/emapper.py", line 335, in run

2022-06-29T10:45:08.394-03:00 searcher, searcher_name, hits, queries_file = self.search(args, infile, predictor)

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/emapper.py", line 167, in search

2022-06-29T10:45:08.394-03:00 raise(e)

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/emapper.py", line 164, in search

2022-06-29T10:45:08.394-03:00 pjoin(self._current_dir, self.search_out_file))

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/search/hmmer/hmmer.py", line 209, in search

2022-06-29T10:45:08.394-03:00 self.dump_hmm_matches(in_file, hmm_hits_file, dbpath, port, hosts, idmap_file)

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/search/hmmer/hmmer.py", line 286, in dump_hmm_matches

2022-06-29T10:45:08.394-03:00 trans_table=self.trans_table):

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/site-packages/eggnogmapper/search/hmmer/hmmer_search_hmmpgmd.py", line 32, in iter_seq_hits

2022-06-29T10:45:08.394-03:00 silent=silent, trans_table=trans_table)))):

2022-06-29T10:45:08.394-03:00 File "/opt/conda/envs/emapper/lib/python3.7/multiprocessing/pool.py", line 748, in next

2022-06-29T10:45:08.394-03:00 raise value

2022-06-29T10:45:08.394-03:00 struct.error: unpack requires a buffer of 16 bytes
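
For readers following the traceback: the format string `"I 4x Q"` in the failing call describes a 16-byte header under Python's native alignment (a 4-byte unsigned int, 4 pad bytes, then an 8-byte unsigned integer), so the error means that `status` did not hold exactly 16 bytes. The sketch below reuses only the format string shown in the traceback; `parse_status` is a hypothetical helper for illustration, not eggnog-mapper code.

```python
import struct

# Format string copied from the traceback (hmmer_search_hmmpgmd.py, line 173).
STATUS_FORMAT = "I 4x Q"

# Number of bytes struct.unpack() expects for this format.
STATUS_SIZE = struct.calcsize(STATUS_FORMAT)


def parse_status(buffer: bytes):
    """Unpack a status header, or return None if the buffer has the wrong size.

    Hypothetical helper, not part of eggnog-mapper.
    """
    if len(buffer) != STATUS_SIZE:
        return None
    return struct.unpack(STATUS_FORMAT, buffer)


# A well-formed header round-trips; a truncated one is rejected.
full = struct.pack(STATUS_FORMAT, 0, 1234)
print(STATUS_SIZE, parse_status(full), parse_status(full[:5]))

# Calling struct.unpack() directly on a short buffer raises struct.error,
# which is the exception type seen in the traceback.
try:
    struct.unpack(STATUS_FORMAT, b"")
except struct.error as err:
    print("struct.error:", err)
```

An empty or truncated buffer of this kind is what a socket read would return if the other end stopped sending, although the thread does not establish that this is what happened here.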

Cantalapiedra commented 2 years ago

Hi @taniagmangolini ,

Which versions of eggNOG-mapper and HMMER are you using? What is your OS?

Thank you.

Carlos

taniagmangolini commented 2 years ago

Eggnog-mapper: v2.1.7
Hmmer: 3.1b2
OS: Linux Ubuntu

Cantalapiedra commented 2 years ago

Hi @taniagmangolini ,

Sorry for the delay in answering. What is the "eggnog_db_v0.2" database, and how (and with which hmmer version) did you create it?

taniagmangolini commented 2 years ago

Sorry about the delay on my side as well. eggnog_db_v0.2 is a customized database created with all microorganisms available in eggnog v5.0 (Virus, Bacteria and Fungi). Each HMM database was downloaded separately, and they were then combined and indexed with hmmpress using HMMER version 3.1b2. The .fa files were obtained with download_eggnog_data.py. The .idmap file was created as discussed in issue #390 (each .fa file name was changed).
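
As a side note on merged databases like the one described above, a quick sanity check is to count the models in the combined file and look for repeated model names. The sketch below is only an illustration: it assumes the HMMER3 ASCII profile format (each model has a `NAME` line and ends with `//`), the toy records are heavily truncated and are not valid HMMs, and nothing in this thread shows that duplicate names are related to the error.

```python
from collections import Counter


def hmm_names(lines):
    """Yield the NAME field of every model in HMMER3 ASCII profile text."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "NAME":
            yield fields[1]


def count_models(lines):
    """Count record terminators ('//'), one per model."""
    return sum(1 for line in lines if line.strip() == "//")


def duplicated_names(lines):
    """Return the sorted model names that occur more than once."""
    counts = Counter(hmm_names(lines))
    return sorted(name for name, n in counts.items() if n > 1)


# Toy input: truncated, made-up records used only to exercise the functions.
TOY_DB = """HMMER3/f [toy header]
NAME  toyA
LENG  10
//
HMMER3/f [toy header]
NAME  toyB
LENG  12
//
HMMER3/f [toy header]
NAME  toyA
LENG  10
//
"""

toy_lines = TOY_DB.splitlines()
print(count_models(toy_lines), duplicated_names(toy_lines))
```

For a real database, the same functions could be applied to an open file handle, for example `with open(path) as fh: duplicated_names(fh)`, where `path` points at the concatenated .hmm file.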

Cantalapiedra commented 2 years ago

Thank you for the info. So far I don't know what could be happening. It seems that you are running emapper.py from a conda environment. Could you show the output of conda list for the environment that you are using to run it? Maybe also the content of the PATH variable (the output of echo $PATH).

erick-dorlass commented 2 years ago

Hello @Cantalapiedra, I'm with the same team as @taniagmangolini and also looking for reasons for this error.

As requested, here is the output of conda list:

packages in environment at /opt/conda/envs/emapper:

_libgcc_mutex 0.1 main
biopython 1.76
ca-certificates 2022.4.26 h06a4308_0
certifi 2021.10.8 py37h06a4308_2
eggnog-mapper 2.1.7
hmmer 3.1b2 3 bioconda
ld_impl_linux-64 2.35.1 h7274673_9
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
ncurses 6.3 h7f8727e_2
numpy 1.21.6
openssl 1.1.1o h7f8727e_0
pip 21.2.2 py37h06a4308_0
psutil 5.7.0
python 3.7.13 h12debd9_0
readline 8.1.2 h7f8727e_1
setuptools 61.2.0 py37h06a4308_0
sqlite 3.38.3 hc218d9a_0
tk 8.6.11 h1ccaba5_1
wheel 0.37.1 pyhd3eb1b0_0
XlsxWriter 1.4.3
xz 5.2.5 h7f8727e_1
zlib 1.2.12 h7f8727e_2

Here is the PATH variable: /opt/conda/envs/emapper/bin:/opt/conda/envs/emapper/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/conda/bin:/home/biodocker/bin

The struct error is intermittent and appears only in some eggnog runs. The eggnog-mapper installation we are running has the same code and .fa file name adjustments as described in issue #390.

Cantalapiedra commented 2 years ago

Hi @erick-dorlass ,

Thank you for the info. Everything looks fine to me.

In the eggnog runs with the struct error, does the error always arise at the same point?

Did you check whether you are running out of memory? I guess this is not the problem, but just in case. Even when the database is already loaded into memory, as more queries are processed the hmmpgmd server seems to hold more memory and could finally crash (at least this is something that happened to me with some large data sets).
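
To illustrate the crash scenario described above: if a server stops mid-reply, the client reads fewer bytes than the 16-byte header that `struct.unpack("I 4x Q", ...)` expects. The sketch below is a minimal simulation under that assumption, using a local socket pair in place of a real hmmpgmd server; `recv_exact` and `demo` are hypothetical names, not eggnog-mapper functions.

```python
import socket
import struct

STATUS_FORMAT = "I 4x Q"  # format string taken from the traceback
STATUS_SIZE = struct.calcsize(STATUS_FORMAT)


def recv_exact(sock: socket.socket, n_bytes: int) -> bytes:
    """Read exactly n_bytes from sock, or raise if the peer closes early.

    Hypothetical helper for illustration; not part of eggnog-mapper.
    """
    chunks = []
    remaining = n_bytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # recv() returns b"" once the peer has closed the connection
            raise ConnectionError(
                f"connection closed after {n_bytes - remaining} of {n_bytes} bytes"
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo() -> tuple:
    """Simulate a healthy reply and a peer that goes away mid-reply."""
    # Healthy case: the full header arrives and unpacks cleanly.
    client, server = socket.socketpair()
    server.sendall(struct.pack(STATUS_FORMAT, 0, 42))
    header = struct.unpack(STATUS_FORMAT, recv_exact(client, STATUS_SIZE))
    client.close()
    server.close()

    # Failure case: the simulated server sends 3 bytes and then closes.
    client, server = socket.socketpair()
    server.sendall(b"\x00\x00\x00")
    server.close()
    try:
        recv_exact(client, STATUS_SIZE)
        message = "no error"
    except ConnectionError as err:
        message = str(err)
    finally:
        client.close()
    return header, message


if __name__ == "__main__":
    print(demo())
```

Reading in a loop and failing with an explicit message makes a dead server easier to distinguish from a malformed reply than a bare `struct.error` does.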

Also, does this happen when using a smaller hmmer database, for example only the virus one?

Another thing you could try is installing eggnog-mapper without conda (from a Release, for example), adding to the PATH both the eggnog-mapper bin directory and eggnog-mapper/eggnogmapper/bin (where the bundled hmmpgmd command is), and running it from there.
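
When switching between a conda install and a Release install, it can help to confirm which executables the PATH actually resolves to. The sketch below is one way to do that with the standard library; the tool names are taken from commands mentioned in this thread, and `find_all` and `resolve_tools` are hypothetical helper names.

```python
import os
import shutil

# Executable names mentioned in this thread; adjust as needed.
TOOLS = ("emapper.py", "hmmpgmd", "hmmpress")


def find_all(name, path=None):
    """Return every executable called `name` on PATH, in lookup order."""
    path = os.environ.get("PATH", "") if path is None else path
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        is_executable = os.path.isfile(candidate) and os.access(candidate, os.X_OK)
        if is_executable and candidate not in matches:
            matches.append(candidate)
    return matches


def resolve_tools(names=TOOLS, path=None):
    """Map each name to the executable that would run first (None if absent)."""
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for tool, first_hit in resolve_tools().items():
        print(tool, "->", first_hit, "| all matches:", find_all(tool))
```

If `find_all` reports more than one match for hmmpgmd, the first entry is the one that runs, which is useful to know when comparing the conda build with the bundled binary.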

I am just trying to work out whether this is an error related to a different hmmer compilation or to a specific hmm database. I am currently running something very similar with PFAM 35 as the hmm database, and it is running fine thus far.

Sorry for not being very helpful so far.

Best, Carlos