Open 285880830 opened 2 years ago
I got the same error.
11:42:53.447 WARNING: Ignoring invalid symbol '*' at pos. 1623 in line 2 of /tmp/tmpd42j2lo8/query.a3m
Don't you have '*' at the end of the query sequence?
Yes, I have this at the end of the protein sequence, and I am facing the identical error. I will run it again without the asterisk.
OK! Unbelievably, the so-called "WARNING" is what was causing the error! Just removing the asterisk from the end of my sequence resolved the problem.
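For anyone who wants to clean the input before running, here is a minimal sketch of the fix described above: it strips a trailing '*' (commonly used as a stop-codon marker in translated sequences) from every sequence line of a FASTA file. The function name `strip_trailing_asterisk` and the file names in the usage comment are hypothetical and not part of AlphaFold.

```python
from pathlib import Path


def strip_trailing_asterisk(fasta_text: str) -> str:
    """Return FASTA text with trailing '*' removed from sequence lines.

    Header lines (starting with '>') are left untouched. An asterisk in
    the middle of a sequence line is NOT removed; that case would need a
    manual look at the sequence.
    """
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            cleaned.append(line)
        else:
            # Drop trailing whitespace first, then any trailing asterisks.
            cleaned.append(line.rstrip().rstrip("*"))
    return "\n".join(cleaned) + "\n"


# Hypothetical usage: write a cleaned copy next to the original file.
# src = Path("query.fasta")
# Path("query.clean.fasta").write_text(strip_trailing_asterisk(src.read_text()))
```

The cleaned file can then be passed to the pipeline in place of the original.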
CE8eSpRY.fasta fails with the following error:
I0910 11:42:53.106589 139985805367104 hhsearch.py:76] Launching subprocess "hhsearch -i /tmp/tmpd42j2lo8/query.a3m -o /tmp/tmpd42j2lo8/output.hhr -maxseq 1000000 -d /mnt/pdb70/pdb70"
I0910 11:42:53.259523 139985805367104 utils.py:36] Started HHsearch query
I0910 11:42:53.617978 139985805367104 utils.py:40] Finished HHsearch query in 0.358 seconds
Traceback (most recent call last):
File "run_alphafold.py", line 338, in
app.run(main)
File "/home/alphafold/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/alphafold/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "run_alphafold.py", line 310, in main
predict_structure(
File "run_alphafold.py", line 170, in predict_structure
feature_dict = data_pipeline.process(
File "/home/alphafold/alphafold/alphafold/data/pipeline.py", line 166, in process
hhsearch_result = self.hhsearch_pdb70_runner.query(uniref90_msa_as_a3m)
File "/home/alphafold/alphafold/alphafold/data/tools/hhsearch.py", line 85, in query
raise RuntimeError(
RuntimeError: HHSearch failed:
stdout:
stderr:
11:42:53.447 INFO: /tmp/tmpd42j2lo8/query.a3m is in A2M, A3M or FASTA format
11:42:53.447 WARNING: Ignoring invalid symbol '*' at pos. 1623 in line 2 of /tmp/tmpd42j2lo8/query.a3m
11:42:53.610 ERROR: [subseq from] CRISPR-associated endonuclease Cas9/Csn1 n=212 Tax=root TaxID=1 RepID=CAS9_STRP1
11:42:53.610 ERROR: Error in /opt/conda/conda-bld/hhsuite_1616660820288/work/src/hhalignment.cpp:1244: Compress:
11:42:53.610 ERROR: sequences in /tmp/tmpd42j2lo8/query.a3m do not all have the same number of columns,
11:42:53.610 ERROR:
e.g. first sequence and sequence UniRef90_Q99ZW2/2-1048.
11:42:53.610 ERROR: Check input format for '-M a2m' option and consider using '-M first' or '-M 50'
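The "do not all have the same number of columns" message can be checked by hand on a copy of the A3M file. The sketch below is only a diagnostic under stated assumptions, not how HH-suite itself parses the file: it relies on the A3M convention that uppercase letters and '-' are match columns while lowercase letters are insertions, and it treats '*' as not counted, mirroring the "Ignoring invalid symbol" warning in the log. The helper names are hypothetical.

```python
def match_columns(seq: str) -> int:
    """Count A3M match columns: uppercase residues and '-' gaps.

    Lowercase letters (insertions) and any other symbol such as '*'
    are not counted. This is an assumption modeled on the log above.
    """
    return sum(1 for c in seq if c.isupper() or c == "-")


def read_a3m(text: str):
    """Parse A3M/FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line.strip())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def find_mismatched(text: str):
    """Return (name, columns) for records that differ from the first one."""
    records = read_a3m(text)
    if not records:
        return []
    expected = match_columns(records[0][1])
    mismatched = []
    for name, seq in records[1:]:
        n = match_columns(seq)
        if n != expected:
            mismatched.append((name, n))
    return mismatched
```

With a query that ends in '*', the first record comes out one column shorter than hits that were aligned against the full-length query, which matches the failure pattern reported here.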