biopython / biopython

Official git repository for Biopython (originally converted from CVS)
http://biopython.org/

Blast.parse failing for psi-blast xml inputs #4886

Status: Closed (josephfwc closed this 5 days ago)

josephfwc commented 5 days ago

Setup

I am reporting a problem with Biopython version 1.84, Python version 3.12.3, and operating system as follows:

3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:50:49) [Clang 16.0.6 ]
CPython
macOS-14.6.1-x86_64-i386-64bit
1.84

Expected behaviour

I am trying to parse the XML output of a psi-blast search that was run locally (output file attached: psiblast_results.xml.zip). I expected Blast.parse to create a Bio.Blast.Records object from the parsed XML data. I suspect the failure is related to the unusual structure of psi-blast output, although the documentation suggests that psi-blast output should be handled correctly.

Actual behaviour

  Cell In[7], line 7
    for blast_record in blast_records:

  File /opt/anaconda3/envs/psite-conservation/lib/python3.12/site-packages/Bio/Blast/__init__.py:808 in __next__
    parser.Parse(data, False)

  File /Users/runner/miniforge3/conda-bld/python-split_1713205507339/work/Modules/pyexpat.c:477 in EndElement

  File /opt/anaconda3/envs/psite-conservation/lib/python3.12/site-packages/Bio/Blast/_parser.py:1172 in _endElementHandler
    method(self, name)

  File /opt/anaconda3/envs/psite-conservation/lib/python3.12/site-packages/Bio/Blast/_parser.py:867 in _end_query_frame
    raise ValueError(

ValueError: unexpected value 0 in tag <Hsp_query-frame> for program psiblast
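
To check which frame values the file actually contains, independently of Biopython, the XML can be scanned with the standard library. The following is only a minimal sketch: the helper name `count_query_frames` is hypothetical, and the inline sample is a hand-written fragment standing in for real psi-blast output, not the attached file.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_query_frames(source):
    """Count the text values of every <Hsp_query-frame> element.

    `source` may be a file name or a binary file object.
    (Hypothetical helper, not part of Biopython.)
    """
    counts = Counter()
    # Stream the document so large BLAST outputs are not held fully in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Hsp_query-frame":
            counts[(elem.text or "").strip()] += 1
        # Free the element once it has been inspected.
        elem.clear()
    return counts


# Hand-written fragment for illustration only; real output has many more tags.
sample = b"""<BlastOutput>
  <BlastOutput_program>psiblast</BlastOutput_program>
  <Hsp><Hsp_query-frame>0</Hsp_query-frame></Hsp>
  <Hsp><Hsp_query-frame>0</Hsp_query-frame></Hsp>
</BlastOutput>"""

frame_counts = count_query_frames(io.BytesIO(sample))
print(frame_counts)
```

Passing the path of the unzipped attachment instead of the `io.BytesIO` object should show whether every `<Hsp_query-frame>` in the psi-blast output is `0`, which is the value the parser rejects in the traceback above.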

Steps to reproduce

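from Bio import Blast

# Path to the unzipped attachment (file name assumed from psiblast_results.xml.zip)
xml_file = "psiblast_results.xml"

with open(xml_file, "rb") as result_handle:
    blast_records = Blast.parse(result_handle)

    # Iterating drives the parser; the ValueError is raised in this loop
    for blast_record in blast_records:
        print(blast_record)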

mdehoon commented 5 days ago

Fixed in #4887

josephfwc commented 5 days ago

Thank you!