soedinglab / hh-suite

Remote protein homology detection suite.
GNU General Public License v3.0

alphafold2: HHblits failed #304

Open nick-youngblut opened 2 years ago

nick-youngblut commented 2 years ago

I've tried using the standard AlphaFold2 setup via Docker (converted to a Singularity container) and via the setup described at, and both result in the following error:

E1210 12:01:01.009660 22603932526400] - 11:49:18.512 INFO: Iteration 1
E1210 12:01:01.009703 22603932526400] - 11:49:19.070 INFO: Prefiltering database
E1210 12:01:01.009746 22603932526400] - 11:58:30.028 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 1100976
E1210 12:01:01.009794 22603932526400] - 12:00:53.386 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 377309
E1210 12:01:01.009837 22603932526400] - 12:00:58.424 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 2313
E1210 12:01:01.009882 22603932526400] - 12:00:58.424 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2313
E1210 12:01:01.009924 22603932526400] - 12:00:58.424 INFO: Scoring 2313 HMMs using HMM-HMM Viterbi alignment
E1210 12:01:01.009967 22603932526400] - 12:00:59.528 INFO: Alternative alignment: 0
E1210 12:01:01.010009 22603932526400] HHblits stderr end
Traceback (most recent call last):
  File "/app/alphafold/", line 310, in <module>
  File "/opt/conda/lib/python3.7/site-packages/absl/", line 312, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.7/site-packages/absl/", line 258, in _run_main
  File "/app/alphafold/", line 292, in main
  File "/app/alphafold/", line 129, in predict_structure
  File "/app/alphafold/alphafold/data/", line 174, in process
  File "/app/alphafold/alphafold/data/tools/", line 144, in query
    stdout.decode('utf-8'), stderr[:500_000].decode('utf-8')))
RuntimeError: HHblits failed

- 11:49:17.334 INFO: Searching 65983866 column state sequences.
- 11:49:18.440 INFO: Searching 15161831 column state sequences.
- 11:49:18.512 INFO: ./tests/ALP1.faa is in A2M, A3M or FASTA format
- 11:49:18.512 INFO: Iteration 1
- 11:49:19.070 INFO: Prefiltering database
- 11:58:30.028 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 1100976
- 12:00:53.386 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 377309
- 12:00:58.424 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 2313
- 12:00:58.424 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2313
- 12:00:58.424 INFO: Scoring 2313 HMMs using HMM-HMM Viterbi alignment
- 12:00:59.528 INFO: Alternative alignment: 0

The error occurs both via and when just running the hhblits command directly. Based on, it seems to be a memory issue, but I've used up to 504 GB of memory and I still get the error.
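Because the stderr shown above stops without any explicit error message, the exit status of the hhblits process may say more than the log does. The following is a minimal sketch, assuming a POSIX system, of running a command directly and translating its return code. `describe_returncode` and `run_and_report` are hypothetical helper names, not part of AlphaFold or HH-suite.

```python
import signal
import subprocess
import sys


def describe_returncode(returncode: int) -> str:
    """Translate a subprocess return code into a short summary."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_report(cmd):
    """Run cmd and return (returncode, summary, last 2000 chars of stderr)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    tail = proc.stderr.decode("utf-8", errors="replace")[-2000:]
    return proc.returncode, describe_returncode(proc.returncode), tail


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the full hhblits command line as arguments to this script.
    code, summary, tail = run_and_report(sys.argv[1:])
    print(summary)
    print(tail)
```

A summary such as "killed by SIGKILL" would be consistent with the scheduler or kernel terminating the job, while SIGSEGV or SIGILL would point at the binary itself rather than at the memory request.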

Typical stats for the cluster jobs:

exit_status  1
cpu          3310.335s
mem          18.326TBs
io           253.773GB
iow          0.000s
maxvmem      37.131GB
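To put these numbers in relation to the 504 GB allocation, here is a small worked calculation. It is only a sketch: it assumes that `mem` is memory usage integrated over CPU time (TB times seconds), that `maxvmem` is the peak virtual memory of the job, and that 1 TB = 1000 GB. The function names are hypothetical.

```python
# Values copied from the job stats above; the units are assumptions.
stats = {
    "cpu_s": 3310.335,     # CPU time in seconds
    "mem_tb_s": 18.326,    # assumed: memory integrated over CPU time (TB * s)
    "maxvmem_gb": 37.131,  # assumed: peak virtual memory in GB
}
requested_gb = 504.0       # largest allocation mentioned in the report


def average_memory_gb(mem_tb_s, cpu_s, gb_per_tb=1000.0):
    """Average memory in GB, given integrated usage and CPU seconds."""
    return mem_tb_s * gb_per_tb / cpu_s


def peak_fraction(maxvmem_gb, requested_gb):
    """Fraction of the requested memory that the peak usage reached."""
    return maxvmem_gb / requested_gb


avg_gb = average_memory_gb(stats["mem_tb_s"], stats["cpu_s"])
frac = peak_fraction(stats["maxvmem_gb"], requested_gb)
print(f"average memory: {avg_gb:.2f} GB")
print(f"peak usage: {frac:.1%} of the requested {requested_gb:.0f} GB")
```

If those unit assumptions hold, the recorded peak is only a small fraction of the requested memory, which would fit the observation that raising the allocation did not help.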

I've also tried compiling hh-suite for the cluster nodes, and that didn't help.
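When comparing a self-compiled build with a prebuilt one, it can help to record which SIMD instruction sets each cluster node actually advertises. The sketch below assumes Linux on x86, where `/proc/cpuinfo` contains a `flags` line. The helper names and the default list of flags to check are illustrative choices, not requirements taken from this thread.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU feature flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supported(cpuinfo_text: str, wanted=("sse2", "sse4_1", "avx2")) -> dict:
    """Map each wanted flag name to whether the CPU advertises it."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for name, ok in supported(path.read_text()).items():
            print(f"{name}: {'yes' if ok else 'no'}")
```

Running this on both the build node and the compute node shows whether the two differ, which is one possible reason a binary can fail on one node and not another.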

My conda env based on (only one setup that I've tried):

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
absl-py                   0.13.0                   pypi_0    pypi
astunparse                1.6.3                    pypi_0    pypi
biopython                 1.79                     pypi_0    pypi
ca-certificates           2021.10.26           h06a4308_2
cachetools                4.2.4                    pypi_0    pypi
certifi                   2021.10.8                pypi_0    pypi
charset-normalizer        2.0.9                    pypi_0    pypi
chex                      0.0.7                    pypi_0    pypi
contextlib2               21.6.0                   pypi_0    pypi
cudatoolkit               11.0.3               h15472ef_9    conda-forge
cudnn                        h86fa8c9_0    conda-forge
dm-haiku                  0.0.4                    pypi_0    pypi
dm-tree                   0.1.6                    pypi_0    pypi
fftw                      3.3.10          nompi_h77c792f_102    conda-forge
flatbuffers               1.12                     pypi_0    pypi
gast                      0.4.0                    pypi_0    pypi
google-auth               2.3.3                    pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
grpcio                    1.34.1                   pypi_0    pypi
h5py                      3.1.0                    pypi_0    pypi
hhsuite                   3.3.0           py38pl5262hc37a69a_2    bioconda
hmmer                     3.3.2                h1b792b2_1    bioconda
idna                      3.3                      pypi_0    pypi
immutabledict             2.0.0                    pypi_0    pypi
importlib-metadata        4.8.2                    pypi_0    pypi
jax                       0.2.25                   pypi_0    pypi
jaxlib                    0.1.69+cuda111           pypi_0    pypi
kalign2                   2.04                 h779adbc_2    bioconda
keras-nightly             2.5.0.dev2021032900          pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
libblas                   3.9.0           12_linux64_openblas    conda-forge
libcblas                  3.9.0           12_linux64_openblas    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 11.2.0              h1d223b6_11    conda-forge
libgfortran-ng            11.2.0              h69a702a_11    conda-forge
libgfortran5              11.2.0              h5c6108e_11    conda-forge
libgomp                   11.2.0              h1d223b6_11    conda-forge
liblapack                 3.9.0           12_linux64_openblas    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_11    conda-forge
libzlib                   1.2.11            h36c2ea0_1013    conda-forge
markdown                  3.3.6                    pypi_0    pypi
ml-collections            0.1.0                    pypi_0    pypi
ncurses                   6.2                  h58526e2_4    conda-forge
numpy                     1.19.5                   pypi_0    pypi
oauthlib                  3.1.1                    pypi_0    pypi
ocl-icd                   2.3.1                h7f98852_0    conda-forge
ocl-icd-system            1.0.0                         1    conda-forge
openmm                    7.5.1            py38h7850c2e_1    conda-forge
openssl                   3.0.0                h7f98852_2    conda-forge
opt-einsum                3.3.0                    pypi_0    pypi
pandas                    1.3.4                    pypi_0    pypi
pdbfixer                  1.7                pyhd3deb0d_0    conda-forge
perl                      5.26.2            h36c2ea0_1008    conda-forge
pip                       21.3.1             pyhd8ed1ab_0    conda-forge
protobuf                  3.19.1                   pypi_0    pypi
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
python                    3.8.12          hf930737_2_cpython    conda-forge
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.8                      2_cp38    conda-forge
pytz                      2021.3                   pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
readline                  8.1                  h46c0cb4_0    conda-forge
requests                  2.26.0                   pypi_0    pypi
requests-oauthlib         1.3.0                    pypi_0    pypi
rsa                       4.8                      pypi_0    pypi
scipy                     1.7.0                    pypi_0    pypi
setuptools                59.4.0           py38h578d9bd_0    conda-forge
six                       1.15.0                   pypi_0    pypi
sqlite                    3.37.0               h9cd32fc_0    conda-forge
tabulate                  0.8.9                    pypi_0    pypi
tensorboard               2.7.0                    pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.0                    pypi_0    pypi
tensorflow                2.5.0                    pypi_0    pypi
tensorflow-cpu            2.5.0                    pypi_0    pypi
tensorflow-estimator      2.5.0                    pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
tk                        8.6.11               h27826a3_1    conda-forge
toolz                     0.11.2                   pypi_0    pypi
typing-extensions                  pypi_0    pypi
urllib3                   1.26.7                   pypi_0    pypi
werkzeug                  2.0.2                    pypi_0    pypi
wheel                     0.37.0             pyhd8ed1ab_1    conda-forge
wrapt                     1.12.1                   pypi_0    pypi
xz                        5.2.5                h516909a_1    conda-forge
zipp                      3.6.0                    pypi_0    pypi
zlib                      1.2.11            h36c2ea0_1013    conda-forge
donglianglxz commented 4 months ago

@nick-youngblut Excuse me, have you solved this problem? I encountered the same problem, but I don't know how to deal with it.

I0525 17:37:18.623581 47755692236224] Started HHsearch query
I0525 17:37:51.544831 47755692236224] Finished HHsearch query in 32.921 seconds
I0525 17:37:51.552284 47755692236224] Launching subprocess "/usr/bin/hhblits -i test.fasta -cpu 4 -oa3m /tmp/tmp8y6hhyho/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /home/lijundi/protein/alphafold/database/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /home/lijundi/protein/alphafold/database/uniref30/UniRef30_2021_03"
I0525 17:37:51.584613 47755692236224] Started HHblits query
I0525 17:42:09.718916 47755692236224] Finished HHblits query in 258.134 seconds
E0525 17:42:09.719525 47755692236224] HHblits failed. HHblits stderr begin:
E0525 17:42:09.719723 47755692236224] - 17:38:15.391 INFO: Searching 65983866 column state sequences.
E0525 17:42:09.719834 47755692236224] - 17:38:16.778 INFO: Searching 29291635 column state sequences.
E0525 17:42:09.719931 47755692236224] - 17:38:16.886 INFO: test.fasta is in A2M, A3M or FASTA format
E0525 17:42:09.720036 47755692236224] - 17:38:16.886 INFO: Iteration 1
E0525 17:42:09.720133 47755692236224] - 17:38:16.952 INFO: Prefiltering database
E0525 17:42:09.720228 47755692236224] - 17:41:26.216 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 968028
E0525 17:42:09.720326 47755692236224] - 17:42:06.654 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 323631
E0525 17:42:09.720420 47755692236224] - 17:42:08.259 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment) : 2630
E0525 17:42:09.720514 47755692236224] - 17:42:08.259 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2630
E0525 17:42:09.720604 47755692236224] - 17:42:08.259 INFO: Scoring 2630 HMMs using HMM-HMM Viterbi alignment
E0525 17:42:09.720708 47755692236224] - 17:42:08.330 INFO: Alternative alignment: 0
E0525 17:42:09.720803 47755692236224] HHblits stderr end
Traceback (most recent call last):
  File "/home/lijundi/protein/alphafold/", line 570, in <module>
  File "/home/lijundi/miniconda3/envs/af2/lib/python3.8/site-packages/absl/", line 312, in run
    _run_main(main, args)
  File "/home/lijundi/miniconda3/envs/af2/lib/python3.8/site-packages/absl/", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/lijundi/protein/alphafold/", line 543, in main
    predict_structure(
  File "/home/lijundi/protein/alphafold/", line 256, in predict_structure
    feature_dict = data_pipeline.process(
  File "/home/lijundi/protein/alphafold/alphafold/data/", line 215, in process
    hhblits_bfd_uniref_result = run_msa_tool(
  File "/home/lijundi/protein/alphafold/alphafold/data/", line 96, in run_msa_tool
    result = msa_runner.query(input_fasta_path)[0]
  File "/home/lijundi/protein/alphafold/alphafold/data/tools/", line 143, in query
    raise RuntimeError('HHblits failed\nstdout:\n%s\n\nstderr:\n%s\n' % (
RuntimeError: HHblits failed
stdout:

