dptech-corp / Uni-Fold

An open-source platform for developing protein models beyond AlphaFold.
https://doi.org/10.1101/2022.08.04.502811
Apache License 2.0

HMMSearch Failure reading pdb_seqres.txt for Inference #57

Closed: bernym12 closed this issue 2 years ago

bernym12 commented 2 years ago

The error appears to point to a specific line in the file, but I downloaded it using the scripts provided by AlphaFold2.

Traceback (most recent call last):
  File "unifold/homo_search.py", line 313, in <module>
    app.run(main)
  File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "unifold/homo_search.py", line 291, in main
    generate_pkl_features(
  File "unifold/homo_search.py", line 177, in generate_pkl_features
    feature_dict = data_pipeline.process(
  File "/Uni-Fold/unifold/msa/pipeline.py", line 193, in process
    pdb_templates_result = self.template_searcher.query(msa_for_templates)
  File "/Uni-Fold/unifold/msa/tools/hmmsearch.py", line 89, in query
    return self.query_with_hmm(hmm)
  File "/Uni-Fold/unifold/msa/tools/hmmsearch.py", line 128, in query_with_hmm
    raise RuntimeError(
RuntimeError: hmmsearch failed:
stdout:
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3 (Nov 2019); http://hmmer.org/
# Copyright (C) 2019 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  /tmp/tmprvyby0zh/query.hmm
# target sequence database:        /database/pdb_seqres/pdb_seqres.txt
# MSA of all hits saved to file:   /tmp/tmprvyby0zh/output.sto
# show alignments in output:       no
# sequence reporting threshold:    E-value <= 100
# domain reporting threshold:      E-value <= 100
# sequence inclusion threshold:    E-value <= 100
# domain inclusion threshold:      E-value <= 100
# MSV filter P threshold:       <= 0.1
# Vit filter P threshold:       <= 0.1
# Fwd filter P threshold:       <= 0.1
# number of worker threads:        8
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       query  [M=117]

stderr:
Parse failed (sequence file /database/pdb_seqres/pdb_seqres.txt):
Line 1360658: illegal character 0

Starting prediction...
start to load params /database/monomer.unifold.pt
start to predict T1104
Traceback (most recent call last):
  File "unifold/inference.py", line 266, in <module>
    main(args)
  File "unifold/inference.py", line 118, in main
    batch = load_feature_for_one_target(
  File "unifold/inference.py", line 61, in load_feature_for_one_target
    batch, _ = load_and_process(
  File "/Uni-Fold/unifold/dataset.py", line 233, in load_and_process
    features, labels = load(**load_kwargs, is_monomer=is_monomer)
  File "/Uni-Fold/unifold/dataset.py", line 129, in load
    all_chain_features = [
  File "/Uni-Fold/unifold/dataset.py", line 130, in <listcomp>
    load_single_feature(s, monomer_feature_dir, uniprot_msa_dir, is_monomer)
  File "/Uni-Fold/unifold/data/utils.py", line 33, in wrapper
    return copy_lib.copy(cached_func(*args, **kwargs))
  File "/Uni-Fold/unifold/dataset.py", line 72, in load_single_feature
    monomer_feature = utils.load_pickle(
  File "/Uni-Fold/unifold/data/utils.py", line 33, in wrapper
    return copy_lib.copy(cached_func(*args, **kwargs))
  File "/Uni-Fold/unifold/data/utils.py", line 67, in load_pickle
    ret = load(path)
  File "/Uni-Fold/unifold/data/utils.py", line 64, in load
    with open_fn(path, "rb") as f:
  File "/opt/conda/lib/python3.8/gzip.py", line 58, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/opt/conda/lib/python3.8/gzip.py", line 173, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/data/output/T1104/A.feature.pkl.gz'

ZiyaoLi commented 2 years ago

Hi @bernym12, would you please refer to issue #15 and see whether PR #42 fixes this? Thanks.

ZiyaoLi commented 2 years ago

I'm merging this with #15.

ZiyaoLi commented 2 years ago

> Hi @bernym12, would you please refer to issue #15 and see whether PR #42 fixes this? Thanks.

Sorry for the misleading reply. Please try the fix described in this comment and see whether it works: https://github.com/dptech-corp/Uni-Fold/issues/15#issuecomment-1232550495

bernym12 commented 2 years ago

Running the solution in #15 solved my issue! Thank you!
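
For anyone hitting the same `Parse failed ... illegal character` message, one way to see what hmmsearch is objecting to is to scan the sequence file directly. The sketch below is not the fix from #15 and is not part of Uni-Fold. The helper names `find_bad_lines` and `keep_protein_records` are hypothetical. It assumes pdb_seqres.txt uses FASTA-style records whose header lines start with `>` and carry a tag such as `mol:protein` or `mol:na`. The whitelist of ASCII letters plus `*` and `-` is an illustrative assumption, not HMMER's exact alphabet.

```python
import string
from typing import Iterable, Iterator, List, Tuple

# Assumption: sequence lines should contain only ASCII letters, '*' or '-'.
# This is an illustrative whitelist, not HMMER's exact parsing rule.
ALLOWED = set(string.ascii_letters + "*-")


def find_bad_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, offending characters) for every
    non-header line that contains a character outside ALLOWED."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if text.startswith(">"):
            continue  # header lines may contain digits and punctuation
        odd = sorted({ch for ch in text if ch not in ALLOWED})
        if odd:
            bad.append((lineno, "".join(odd)))
    return bad


def keep_protein_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the records whose header contains 'mol:protein'.
    Every line up to the next header is treated as part of the record."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            keep = "mol:protein" in line
        if keep:
            yield line
```

Running `find_bad_lines` over the file named in the stderr output (here `/database/pdb_seqres/pdb_seqres.txt`) should show whether line 1360658 is a one-off or one of many similar lines. Writing the output of `keep_protein_records` to a new file, rather than overwriting the download, keeps the original available for comparison. The later `FileNotFoundError` for `A.feature.pkl.gz` is most likely just a consequence of the failed search step, since the traceback shows the exception being raised inside `generate_pkl_features`.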