biopython / biopython

Official git repository for Biopython (originally converted from CVS)
http://biopython.org/

Using Bio.SearchIO with Interproscan #3537

Closed Xiaofei-git closed 1 year ago

Xiaofei-git commented 3 years ago

I am trying to start using Bio.SearchIO.InterproscanIO, but I got stuck at the very beginning. How do I read and parse the output of InterProScan in Python?

https://biopython.readthedocs.io/en/latest/api/Bio.SearchIO.InterproscanIO.html#module-Bio.SearchIO.InterproscanIO

I also read the link for BLAST output, but I still don't understand how to do this for InterProScan. I used the code below, but I don't think it is right, because I didn't get anything from qresults and it has no 'target' attribute.

https://github.com/biopython/biopython/blob/master/Bio/SearchIO/__init__.py

Thanks so much!

qresults = SearchIO.parse('interproscan_test.xsd', 'interproscan-xml')

print(qresults.target)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'generator' object has no attribute 'target'

peterjc commented 3 years ago

By convention, Biopython parsers provide a read(...) function that returns a single record and a matching parse(...) function that is a generator yielding multiple records. You would typically use parse(...) in a for loop.

Try:

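from Bio import SearchIO

# parse(...) yields one QueryResult per query; handle each inside the loop
for qresult in SearchIO.parse('interproscan_test.xsd', 'interproscan-xml'):
    print(qresult.target)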

Or:

qresults = SearchIO.parse('interproscan_test.xsd', 'interproscan-xml')
for qresult in qresults:
    print(qresult.target)

Note that you can only loop over the generator once!
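To illustrate the single-pass behaviour, here is a minimal sketch in plain Python. The function fake_parse is a hypothetical stand-in for SearchIO.parse, not part of Biopython, and it yields plain strings rather than QueryResult objects. It only shows how a generator behaves once it has been consumed.

```python
def fake_parse(names):
    """Hypothetical stand-in for a parse(...) generator: yields one item at a time."""
    for name in names:
        yield name


ids = ["SoyZH13_15G247000.m6", "SoyZH13_13G315500.m1"]

records = fake_parse(ids)
first_pass = [r for r in records]   # consumes the generator
second_pass = [r for r in records]  # nothing left to yield

# To loop more than once, materialise the records into a list first.
cached = list(fake_parse(ids))
again = [r for r in cached]

print(first_pass)
print(second_pass)
print(again)
```

With the real parser, the same idea would presumably apply: either call SearchIO.parse(...) again for each pass, or wrap the call in list(...) if the file is small enough to hold in memory.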

Xiaofei-git commented 3 years ago

Thanks a lot!

Here is the code I used, but nothing was printed. Also, how can I get the 'Hit' object? I have pasted the contents of "test_interpro.xsd" after the code.

qresults = SearchIO.parse('test_interpro.xsd', 'interproscan-xml')
for qresult in qresults:
    print(qresult.target)
    print(qresult.version)

# want to get the 'Hit' object:

for qresult in qresults:
    print(qresult.target)
    for hit in qresult:
        print(hit.id)
<?xml version="1.0" encoding="UTF-8"?><protein-matches xmlns="http://www.ebi.ac.uk/interpro/resources/schemas/interproscan5" interproscan-version="5.46-81.0">
  <protein>
    <sequence md5="e75bb62e350d0b9ada7afcec295e60a8">MRRYLSKLIHGPVGFNPSNSFPQLNCEMGSYCAARRTNRQASFSGNNEYATRAFATTSCASLSEDQPKDNPVSDMLVDSFGRLHTYLRISLTERCNLRCQYCMPADGVELTPSPQLLTKTEILRCANLFVSSGVNKIRLTGGEPTIRKDIEDICLELSNLKGLKTLSMTTNGIALARKLPKLKECGLNSVNISLDTLVPAKFEFMTRRKGHEKVMDAINASIDLGFNPVNCVVMRGFNDDEICDFVELTREKPIDIRFIEFMPFDGNVWNVKKLVPYSEMLDKVMKRFTSLKRVQDHPTDTAKNFTIDGHEGRVSFITSMTEHFCAGCNRLRLLADGNFKVCLFGPSEISLRDPLRRGAEDDELKEIIGAAVKRKKASHAGMFDIAKTANRPMIHIGG</sequence>
    <xref id="SoyZH13_15G247000.m6" name="SoyZH13_15G247000.m6"/>
    <matches>
      <hmmer2-match evalue="3.5E-10" score="49.8">
        <signature ac="SM00729" name="MiaB">
          <entry ac="IPR006638" desc="Elp3/MiaB/NifB" name="Elp3/MiaB/NifB" type="DOMAIN">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0003824" name="catalytic activity"/>
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0051536" name="iron-sulfur cluster binding"/>
          </entry>
          <signature-library-release library="SMART" version="7.1"/>
        </signature>
        <model-ac>SM00729</model-ac>
        <locations>
          <hmmer2-location score="49.8" evalue="3.5E-10" hmm-start="1" hmm-end="246" hmm-length="246" hmm-bounds="COMPLETE" start="85" end="288">
            <location-fragments>
              <hmmer2-location-fragment start="85" end="288" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer2-location>
        </locations>
      </hmmer2-match>
      <hmmer3-match evalue="1.0E-104" score="352.2">
        <signature ac="G3DSA:3.20.20.70" name="Aldolase class I">
          <entry ac="IPR013785" desc="Aldolase-type TIM barrel" name="Aldolase_TIM" type="HOMOLOGOUS_SUPERFAMILY">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0003824" name="catalytic activity"/>
          </entry>
          <signature-library-release library="GENE3D" version="4.2.0"/>
        </signature>
        <model-ac>1tv8B00</model-ac>
        <locations>
          <hmmer3-location env-end="398" env-start="72" post-processed="true" score="352.0" evalue="1.2E-104" hmm-start="4" hmm-end="340" hmm-length="340" hmm-bounds="COMPLETE" start="72" end="398">
            <location-fragments>
              <hmmer3-location-fragment start="72" end="398" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <hmmer3-match evalue="2.8E-106" score="353.2">
        <signature ac="TIGR02666" desc="moaA: molybdenum cofactor biosynthesis protein A" name="TIGR02666">
          <entry ac="IPR013483" desc="Molybdenum cofactor biosynthesis protein A" name="MoaA" type="FAMILY">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0046872" name="metal ion binding"/>
            <go-xref category="BIOLOGICAL_PROCESS" db="GO" id="GO:0006777" name="Mo-molybdopterin cofactor biosynthetic process"/>
            <pathway-xref db="KEGG" id="00790+4.1.99.22" name="Folate biosynthesis"/>
            <pathway-xref db="Reactome" id="R-HSA-947581" name="Molybdenum cofactor biosynthesis"/>
            <pathway-xref db="MetaCyc" id="PWY-6823" name="Molybdenum cofactor biosynthesis"/>
          </entry>
          <signature-library-release library="TIGRFAM" version="15.0"/>
        </signature>
        <model-ac>TIGR02666</model-ac>
        <locations>
          <hmmer3-location env-end="398" env-start="76" post-processed="false" score="353.0" evalue="3.3E-106" hmm-start="1" hmm-end="336" hmm-length="336" hmm-bounds="COMPLETE" start="76" end="398">
            <location-fragments>
              <hmmer3-location-fragment start="76" end="398" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <hmmer3-match evalue="1.2E-6" score="29.0">
        <signature ac="PF13353" desc="4Fe-4S single cluster domain" name="Fer4_12">
          <signature-library-release library="PFAM" version="33.1"/>
        </signature>
        <model-ac>PF13353</model-ac>
        <locations>
          <hmmer3-location env-end="216" env-start="81" post-processed="true" score="27.1" evalue="4.4E-6" hmm-start="11" hmm-end="114" hmm-length="137" hmm-bounds="INCOMPLETE" start="91" end="195">
            <location-fragments>
              <hmmer3-location-fragment start="91" end="195" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <hmmer3-match evalue="4.8E-29" score="101.9">
        <signature ac="PF04055" desc="Radical SAM superfamily" name="Radical_SAM">
          <entry ac="IPR007197" desc="Radical SAM" name="rSAM" type="DOMAIN">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0003824" name="catalytic activity"/>
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0051536" name="iron-sulfur cluster binding"/>
          </entry>
          <signature-library-release library="PFAM" version="33.1"/>
        </signature>
        <model-ac>PF04055</model-ac>
        <locations>
          <hmmer3-location env-end="249" env-start="89" post-processed="true" score="100.9" evalue="9.9E-29" hmm-start="1" hmm-end="166" hmm-length="167" hmm-bounds="N_TERMINAL_COMPLETE" start="89" end="248">
            <location-fragments>
              <hmmer3-location-fragment start="89" end="248" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <hmmer3-match evalue="3.6E-38" score="130.4">
        <signature ac="PF06463" desc="Molybdenum Cofactor Synthesis C" name="Mob_synth_C">
          <entry ac="IPR010505" desc="Molybdenum cofactor synthesis C-terminal" name="Mob_synth_C" type="DOMAIN">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0051539" name="4 iron, 4 sulfur cluster binding"/>
            <go-xref category="BIOLOGICAL_PROCESS" db="GO" id="GO:0006777" name="Mo-molybdopterin cofactor biosynthetic process"/>
            <go-xref category="CELLULAR_COMPONENT" db="GO" id="GO:0019008" name="molybdopterin synthase complex"/>
            <pathway-xref db="KEGG" id="00790+4.1.99.22" name="Folate biosynthesis"/>
            <pathway-xref db="Reactome" id="R-HSA-947581" name="Molybdenum cofactor biosynthesis"/>
            <pathway-xref db="MetaCyc" id="PWY-6823" name="Molybdenum cofactor biosynthesis"/>
          </entry>
          <signature-library-release library="PFAM" version="33.1"/>
        </signature>
        <model-ac>PF06463</model-ac>
        <locations>
          <hmmer3-location env-end="381" env-start="254" post-processed="true" score="129.7" evalue="5.7E-38" hmm-start="1" hmm-end="127" hmm-length="128" hmm-bounds="N_TERMINAL_COMPLETE" start="254" end="380">
            <location-fragments>
              <hmmer3-location-fragment start="254" end="380" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <hmmer3-match-with-sites evalue="0.0" score="394.7">
        <signature ac="SFLDG01383" desc="cyclic pyranopterin phosphate synthase (MoaA-like)" name="cyclic_pyranopterin_phosphate">
          <signature-library-release library="SFLD" version="4"/>
        </signature>
        <model-ac>SFLDG01383</model-ac>
        <locations>
          <hmmer3-location-with-sites env-end="398" env-start="72" score="394.5" evalue="0.0" hmm-start="2" hmm-end="324" hmm-length="324" hmm-bounds="INCOMPLETE" start="73" end="398">
            <location-fragments>
              <hmmer3-location-fragment-with-sites start="73" end="398" dc-status="CONTINUOUS"/>
            </location-fragments>
            <sites>
              <hmmer3-site description=" Binds [4Fe-4S]-AdoMet cluster" numLocations="3">
                <site-locations>
                  <site-location residue="C" start="102" end="102"/>
                  <site-location residue="C" start="99" end="99"/>
                  <site-location residue="C" start="95" end="95"/>
                </site-locations>
              </hmmer3-site>
              <hmmer3-site description=" Binds [4Fe-4S] cluster" numLocations="3">
                <site-locations>
                  <site-location residue="C" start="342" end="342"/>
                  <site-location residue="C" start="328" end="328"/>
                  <site-location residue="C" start="325" end="325"/>
                </site-locations>
              </hmmer3-site>
              <hmmer3-site description=" Binds S-adensosylmethionine" numLocations="1">
                <site-locations>
                  <site-location residue="G" start="142" end="142"/>
                </site-locations>
              </hmmer3-site>
            </sites>
          </hmmer3-location-with-sites>
        </locations>
      </hmmer3-match-with-sites>
      <hmmer3-match-with-sites evalue="0.0" score="394.7">
        <signature ac="SFLDG01072" desc="dehydrogenase like" name="dehydrogenase_like">
          <signature-library-release library="SFLD" version="4"/>
        </signature>
        <model-ac>SFLDG01072</model-ac>
        <locations>
          <hmmer3-location-with-sites env-end="398" env-start="72" score="394.5" evalue="0.0" hmm-start="2" hmm-end="324" hmm-length="371" hmm-bounds="INCOMPLETE" start="73" end="398">
            <location-fragments>
              <hmmer3-location-fragment-with-sites start="73" end="398" dc-status="CONTINUOUS"/>
            </location-fragments>
            <sites/>
          </hmmer3-location-with-sites>
        </locations>
      </hmmer3-match-with-sites>
      <panther-match evalue="8.1E-151" familyName="MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN 1" score="506.5">
        <signature ac="PTHR22960" name="MOLYBDOPTERIN COFACTOR SYNTHESIS PROTEIN A">
          <signature-library-release library="PANTHER" version="14.1"/>
        </signature>
        <model-ac>PTHR22960</model-ac>
        <locations>
          <panther-location env-start="60" env-end="398" hmm-start="55" hmm-end="384" hmm-length="636" hmm-bounds="INCOMPLETE" start="70" end="397">
            <location-fragments>
              <panther-location-fragment start="70" end="397" dc-status="CONTINUOUS"/>
            </location-fragments>
          </panther-location>
        </locations>
      </panther-match>
      <panther-match evalue="8.1E-151" familyName="MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN 1" score="506.5">
        <signature ac="PTHR22960:SF26" name="MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN 1">
          <signature-library-release library="PANTHER" version="14.1"/>
        </signature>
        <model-ac>PTHR22960:SF26</model-ac>
        <locations>
          <panther-location env-start="60" env-end="398" hmm-start="55" hmm-end="384" hmm-length="636" hmm-bounds="INCOMPLETE" start="70" end="397">
            <location-fragments>
              <panther-location-fragment start="70" end="397" dc-status="CONTINUOUS"/>
            </location-fragments>
          </panther-location>
        </locations>
      </panther-match>
      <patternscan-match>
        <signature ac="PS01305" desc="moaA / nifB / pqqE family signature." name="MOAA_NIFB_PQQE">
          <entry ac="IPR000385" desc="MoaA/NifB/PqqE, iron-sulphur binding, conserved site" name="MoaA_NifB_PqqE_Fe-S-bd_CS" type="CONSERVED_SITE">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0051539" name="4 iron, 4 sulfur cluster binding"/>
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0003824" name="catalytic activity"/>
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0046872" name="metal ion binding"/>
            <pathway-xref db="Reactome" id="R-HSA-947581" name="Molybdenum cofactor biosynthesis"/>
            <pathway-xref db="KEGG" id="00790+4.1.99.22" name="Folate biosynthesis"/>
            <pathway-xref db="MetaCyc" id="PWY-6823" name="Molybdenum cofactor biosynthesis"/>
          </entry>
          <signature-library-release library="PROSITE_PATTERNS" version="2019_11"/>
        </signature>
        <model-ac>PS01305</model-ac>
        <locations>
          <patternscan-location level="STRONG" start="91" end="102">
            <location-fragments>
              <patternscan-location-fragment start="91" end="102" dc-status="CONTINUOUS"/>
            </location-fragments>
            <alignment>LterCNLRCqYC</alignment>
          </patternscan-location>
        </locations>
      </patternscan-match>
      <profilescan-match>
        <signature ac="MF_01225_B" desc="GTP 3',8-cyclase [moaA]." name="MoaA_B">
          <entry ac="IPR013483" desc="Molybdenum cofactor biosynthesis protein A" name="MoaA" type="FAMILY">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0046872" name="metal ion binding"/>
            <go-xref category="BIOLOGICAL_PROCESS" db="GO" id="GO:0006777" name="Mo-molybdopterin cofactor biosynthetic process"/>
            <pathway-xref db="KEGG" id="00790+4.1.99.22" name="Folate biosynthesis"/>
            <pathway-xref db="Reactome" id="R-HSA-947581" name="Molybdenum cofactor biosynthesis"/>
            <pathway-xref db="MetaCyc" id="PWY-6823" name="Molybdenum cofactor biosynthesis"/>
          </entry>
          <signature-library-release library="HAMAP" version="2020_01"/>
        </signature>
        <model-ac>MF_01225_B</model-ac>
        <locations>
          <profilescan-location score="35.752" start="76" end="398">
            <location-fragments>
              <profilescan-location-fragment start="76" end="398" dc-status="CONTINUOUS"/>
            </location-fragments>
            <alignment>LVDSFGRLHTYLRISLTERCNLRCQYCMPADGvELT--PSPQLLTKTEILRCANLFVSSGVNKIRLTGGEPTIRKDIEDICLELSNLKGLKTLSMTTNGIALARKLPKLKECGLNSVNISLDTLVPAKFEFMTRRKGHEKVMDAINASIDLGFN--PVNCVVMRGFNDDEICDFVELTREKPIDIRFIEFMPFDGNVWnVKK--LVPYSEMLDKVMKRFTSLKRVQDHPTD-TAKNFTIDGHEGRVSFITSMTEHFCAGCNRLRLLADGNFKVCLFGPSEISLRDPLRRGAEDDELKEIIGAAVKRKKASHAGMFDIAKTaNRPMIHIGG</alignment>
          </profilescan-location>
        </locations>
      </profilescan-match>
      <rpsblast-match>
        <signature ac="cd05644" desc="M28_like" name="M28_like">
          <signature-library-release library="CDD" version="3.17"/>
        </signature>
        <model-ac>cd05644</model-ac>
        <locations>
          <rpsblast-location evalue="1.1685E-4" score="41.9198" start="274" end="363">
            <location-fragments>
              <rpsblast-location-fragment start="274" end="363" dc-status="CONTINUOUS"/>
            </location-fragments>
            <sites/>
          </rpsblast-location>
        </locations>
      </rpsblast-match>
      <rpsblast-match>
        <signature ac="cd01335" desc="Radical_SAM" name="Radical_SAM">
          <signature-library-release library="CDD" version="3.17"/>
        </signature>
        <model-ac>cd01335</model-ac>
        <locations>
          <rpsblast-location evalue="5.48435E-19" score="82.3813" start="89" end="264">
            <location-fragments>
              <rpsblast-location-fragment start="89" end="264" dc-status="CONTINUOUS"/>
            </location-fragments>
            <sites>
              <rpsblast-site description="FeS/SAM binding site" numLocations="15">
                <site-locations>
                  <site-location residue="C" start="102" end="102"/>
                  <site-location residue="N" start="171" end="171"/>
                  <site-location residue="Y" start="101" end="101"/>
                  <site-location residue="T" start="140" end="140"/>
                  <site-location residue="C" start="95" end="95"/>
                  <site-location residue="T" start="170" end="170"/>
                  <site-location residue="F" start="264" end="264"/>
                  <site-location residue="C" start="99" end="99"/>
                  <site-location residue="M" start="234" end="234"/>
                  <site-location residue="S" start="193" end="193"/>
                  <site-location residue="E" start="143" end="143"/>
                  <site-location residue="T" start="169" end="169"/>
                  <site-location residue="G" start="142" end="142"/>
                  <site-location residue="P" start="263" end="263"/>
                  <site-location residue="L" start="97" end="97"/>
                </site-locations>
              </rpsblast-site>
            </sites>
          </rpsblast-location>
        </locations>
      </rpsblast-match>
      <superfamilyhmmer3-match evalue="2.09E-68">
        <signature ac="SSF102114" name="Radical SAM enzymes">
          <signature-library-release library="SUPERFAMILY" version="1.75"/>
        </signature>
        <model-ac>0049709</model-ac>
        <locations>
          <superfamilyhmmer3-location hmm-length="327" start="76" end="360">
            <location-fragments>
              <superfamilyhmmer3-location-fragment start="76" end="360" dc-status="CONTINUOUS"/>
            </location-fragments>
          </superfamilyhmmer3-location>
        </locations>
      </superfamilyhmmer3-match>
    </matches>
  </protein>
  <protein>
    <sequence md5="b510a8bcfa2110dd2e4c26e4edd807c7">MDGDGSSESPIETKKAKSKTPRKPKETILKQKSPAEFFAENKNIAGFDNPGKSLYTTVRELVENSLDSAESISELPVVEITIEEIGKSKFNSMIGLVDRERVDAALYDDYETEKAREKRLAKEARAQEMQAKNAALGKKVKDTPASKAIKGRGEASFYRVTCKDNGKGMPHDDIPNMFGRVLSGTKYGLKQTRGKFGLGAKMALIWSKMSTGLPIEITSSMKNQNYVSFCRLDIDIHKNIPHVHLHEKRENKEHWRGAEIQVVIEGNWTTYRSKILHYMRQMAVITPYAQFLFKFVSDAPDKNVSIRFARRTDVMPPIPMETKHHPSSVDLLLIKRLIAETSKQNLLQFLQHEFVNISKSYAERLIALLYVYIFVTEAQKEILLVAVDILVKRNGSRLRLENDCEVSNFTAISTDSSIASSSQV</sequence>
    <xref id="SoyZH13_13G315500.m1" name="SoyZH13_13G315500.m1"/>
    <matches>
      <coils-match>
        <signature ac="Coil" name="Coil">
          <signature-library-release library="COILS" version="2.2.1"/>
        </signature>
        <model-ac>Coil</model-ac>
        <locations>
          <coils-location start="107" end="132">
            <location-fragments>
              <coils-location-fragment start="107" end="132" dc-status="CONTINUOUS"/>
            </location-fragments>
          </coils-location>
        </locations>
      </coils-match>
      <coils-match>
        <signature ac="Coil" name="Coil">
          <signature-library-release library="COILS" version="2.2.1"/>
        </signature>
        <model-ac>Coil</model-ac>
        <locations>
          <coils-location start="424" end="424">
            <location-fragments>
              <coils-location-fragment start="424" end="424" dc-status="CONTINUOUS"/>
            </location-fragments>
          </coils-location>
        </locations>
      </coils-match>
      <hmmer3-match evalue="1.1E-6" score="30.8">
        <signature ac="G3DSA:1.10.8.50" name="">
          <signature-library-release library="GENE3D" version="4.2.0"/>
        </signature>
        <model-ac>1mu5A02</model-ac>
        <locations>
          <hmmer3-location env-end="373" env-start="323" post-processed="true" score="29.6" evalue="2.7E-6" hmm-start="1" hmm-end="47" hmm-length="78" hmm-bounds="COMPLETE" start="323" end="373">
            <location-fragments>
              <hmmer3-location-fragment start="323" end="373" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <hmmer3-match evalue="1.1E-8" score="35.2">
        <signature ac="PF13589" desc="Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase" name="HATPase_c_3">
          <signature-library-release library="PFAM" version="33.1"/>
        </signature>
        <model-ac>PF13589</model-ac>
        <locations>
          <hmmer3-location env-end="256" env-start="144" post-processed="true" score="30.7" evalue="2.5E-7" hmm-start="35" hmm-end="109" hmm-length="137" hmm-bounds="INCOMPLETE" start="159" end="238">
            <location-fragments>
              <hmmer3-location-fragment start="159" end="238" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <mobidblite-match>
        <signature ac="mobidb-lite" desc="consensus disorder prediction" name="disorder_prediction">
          <signature-library-release library="MOBIDB_LITE" version="2.0"/>
        </signature>
        <model-ac>mobidb-lite</model-ac>
        <locations>
          <mobidblite-location sequence-feature="Polyampholyte" start="7" end="30">
            <location-fragments>
              <mobidblite-location-fragment start="7" end="30" dc-status="CONTINUOUS"/>
            </location-fragments>
          </mobidblite-location>
        </locations>
      </mobidblite-match>
      <mobidblite-match>
        <signature ac="mobidb-lite" desc="consensus disorder prediction" name="disorder_prediction">
          <signature-library-release library="MOBIDB_LITE" version="2.0"/>
        </signature>
        <model-ac>mobidb-lite</model-ac>
        <locations>
          <mobidblite-location sequence-feature="" start="1" end="32">
            <location-fragments>
              <mobidblite-location-fragment start="1" end="32" dc-status="CONTINUOUS"/>
            </location-fragments>
          </mobidblite-location>
        </locations>
      </mobidblite-match>
      <panther-match evalue="1.0E-72" familyName="DNA TOPOISOMERASE 6 SUBUNIT B" score="248.7">
        <signature ac="PTHR10871" name="30S RIBOSOMAL PROTEIN S13/40S RIBOSOMAL PROTEIN S18">
          <signature-library-release library="PANTHER" version="14.1"/>
        </signature>
        <model-ac>PTHR10871</model-ac>
        <locations>
          <panther-location env-start="1" env-end="406" hmm-start="7" hmm-end="283" hmm-length="594" hmm-bounds="INCOMPLETE" start="19" end="369">
            <location-fragments>
              <panther-location-fragment start="19" end="369" dc-status="CONTINUOUS"/>
            </location-fragments>
          </panther-location>
        </locations>
      </panther-match>
      <panther-match evalue="1.0E-72" familyName="DNA TOPOISOMERASE 6 SUBUNIT B" score="248.7">
        <signature ac="PTHR10871:SF4" name="DNA TOPOISOMERASE 6 SUBUNIT B">
          <entry ac="IPR005734" desc="DNA topoisomerase VI, subunit B" name="TopoVI_B" type="FAMILY">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0005524" name="ATP binding"/>
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0003918" name="DNA topoisomerase type II (double strand cut, ATP-hydrolyzing) activity"/>
            <go-xref category="BIOLOGICAL_PROCESS" db="GO" id="GO:0006265" name="DNA topological change"/>
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0003677" name="DNA binding"/>
          </entry>
          <signature-library-release library="PANTHER" version="14.1"/>
        </signature>
        <model-ac>PTHR10871:SF4</model-ac>
        <locations>
          <panther-location env-start="1" env-end="406" hmm-start="7" hmm-end="283" hmm-length="594" hmm-bounds="INCOMPLETE" start="19" end="369">
            <location-fragments>
              <panther-location-fragment start="19" end="369" dc-status="CONTINUOUS"/>
            </location-fragments>
          </panther-location>
        </locations>
      </panther-match>
      <rpsblast-match>
        <signature ac="cd16933" desc="HATPase_TopVIB-like" name="HATPase_TopVIB-like">
          <signature-library-release library="CDD" version="3.17"/>
        </signature>
        <model-ac>cd16933</model-ac>
        <locations>
          <rpsblast-location evalue="1.45378E-96" score="285.781" start="35" end="304">
            <location-fragments>
              <rpsblast-location-fragment start="35" end="304" dc-status="CONTINUOUS"/>
            </location-fragments>
            <sites>
              <rpsblast-site description="ATP binding site" numLocations="21">
                <site-locations>
                  <site-location residue="D" start="67" end="67"/>
                  <site-location residue="K" start="167" end="167"/>
                  <site-location residue="G" start="257" end="257"/>
                  <site-location residue="D" start="164" end="164"/>
                  <site-location residue="N" start="64" end="64"/>
                  <site-location residue="S" start="68" end="68"/>
                  <site-location residue="E" start="60" end="60"/>
                  <site-location residue="G" start="199" end="199"/>
                  <site-location residue="G" start="197" end="197"/>
                  <site-location residue="G" start="166" end="166"/>
                  <site-location residue="S" start="219" end="219"/>
                  <site-location residue="A" start="200" end="200"/>
                  <site-location residue="I" start="260" end="260"/>
                  <site-location residue="S" start="65" end="65"/>
                  <site-location residue="L" start="61" end="61"/>
                  <site-location residue="L" start="198" end="198"/>
                  <site-location residue="C" start="162" end="162"/>
                  <site-location residue="G" start="168" end="168"/>
                  <site-location residue="M" start="169" end="169"/>
                  <site-location residue="I" start="217" end="217"/>
                  <site-location residue="A" start="258" end="258"/>
                </site-locations>
              </rpsblast-site>
              <rpsblast-site description="G-X-G motif" numLocations="4">
                <site-locations>
                  <site-location residue="G" start="199" end="199"/>
                  <site-location residue="G" start="168" end="168"/>
                  <site-location residue="G" start="197" end="197"/>
                  <site-location residue="G" start="166" end="166"/>
                </site-locations>
              </rpsblast-site>
              <rpsblast-site description="Mg binding site" numLocations="1">
                <site-locations>
                  <site-location residue="N" start="64" end="64"/>
                </site-locations>
              </rpsblast-site>
              <rpsblast-site description="ATP-lid" numLocations="2">
                <site-locations>
                  <site-location residue="A" start="200" end="200"/>
                  <site-location residue="F" start="178" end="178"/>
                </site-locations>
              </rpsblast-site>
            </sites>
          </rpsblast-location>
        </locations>
      </rpsblast-match>
      <superfamilyhmmer3-match evalue="2.1E-19">
        <signature ac="SSF55874" name="ATPase domain of HSP90 chaperone/DNA topoisomerase II/histidine kinase">
          <entry ac="IPR036890" desc="Histidine kinase/HSP90-like ATPase superfamily" name="HATPase_C_sf" type="HOMOLOGOUS_SUPERFAMILY"/>
          <signature-library-release library="SUPERFAMILY" version="1.75"/>
        </signature>
        <model-ac>0048223</model-ac>
        <locations>
          <superfamilyhmmer3-location hmm-length="219" start="152" end="294">
            <location-fragments>
              <superfamilyhmmer3-location-fragment start="152" end="294" dc-status="CONTINUOUS"/>
            </location-fragments>
          </superfamilyhmmer3-location>
        </locations>
      </superfamilyhmmer3-match>
    </matches>
  </protein>
  <protein>
    <sequence md5="94e02b232876eee4ad5c68785a43c769">MVERCTSPIVLLCLFSFSALTLAFSPCPLTGLPLVRNISEIPQDNYGRAGLSHMTVAGSLLHGMKEVEVWLQTFSPGTHTPIHRHSCEEVFIVLKGSGTLYLASDSHGRYPGKPQEHFIFPNSTFHIPVNDAHQLWNTNEHEDLQVLVIISRPPVKVFVYEDWSVPHTAAKVKFPYYWDEQCYQEPPKDEL</sequence>
    <xref id="SoyZH13_02G140300.m1" name="SoyZH13_02G140300.m1"/>
    <matches>
      <fingerprints-match evalue="2.4E-79" graphscan="IIIIIII">
        <signature ac="PR00655" desc="Auxin binding protein signature" name="AUXINBINDNGP">
          <entry ac="IPR000526" desc="Auxin-binding protein" name="Auxin-bd" type="FAMILY">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0010011" name="auxin binding"/>
          </entry>
          <signature-library-release library="PRINTS" version="42.0"/>
        </signature>
        <model-ac>PR00655</model-ac>
        <locations>
          <fingerprints-location motifNumber="2" pvalue="7.89E-14" score="83.12" start="53" end="73">
            <location-fragments>
              <fingerprints-location-fragment start="53" end="73" dc-status="CONTINUOUS"/>
            </location-fragments>
          </fingerprints-location>
          <fingerprints-location motifNumber="4" pvalue="2.33E-7" score="58.44" start="107" end="120">
            <location-fragments>
              <fingerprints-location-fragment start="107" end="120" dc-status="CONTINUOUS"/>
            </location-fragments>
          </fingerprints-location>
          <fingerprints-location motifNumber="7" pvalue="1.88E-12" score="77.73" start="163" end="182">
            <location-fragments>
              <fingerprints-location-fragment start="163" end="182" dc-status="CONTINUOUS"/>
            </location-fragments>
          </fingerprints-location>
          <fingerprints-location motifNumber="6" pvalue="3.33E-15" score="84.19" start="141" end="163">
            <location-fragments>
              <fingerprints-location-fragment start="141" end="163" dc-status="CONTINUOUS"/>
            </location-fragments>
          </fingerprints-location>
          <fingerprints-location motifNumber="1" pvalue="1.27E-13" score="70.91" start="33" end="52">
            <location-fragments>
              <fingerprints-location-fragment start="33" end="52" dc-status="CONTINUOUS"/>
            </location-fragments>
          </fingerprints-location>
          <fingerprints-location motifNumber="3" pvalue="1.0E-16" score="84.0" start="76" end="100">
            <location-fragments>
              <fingerprints-location-fragment start="76" end="100" dc-status="CONTINUOUS"/>
            </location-fragments>
          </fingerprints-location>
          <fingerprints-location motifNumber="5" pvalue="2.7E-10" score="75.57" start="120" end="135">
            <location-fragments>
              <fingerprints-location-fragment start="120" end="135" dc-status="CONTINUOUS"/>
            </location-fragments>
          </fingerprints-location>
        </locations>
      </fingerprints-match>
      <hmmer3-match evalue="1.6E-98" score="327.2">
        <signature ac="PF02041" desc="Auxin binding protein" name="Auxin_BP">
          <entry ac="IPR000526" desc="Auxin-binding protein" name="Auxin-bd" type="FAMILY">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0010011" name="auxin binding"/>
          </entry>
          <signature-library-release library="PFAM" version="33.1"/>
        </signature>
        <model-ac>PF02041</model-ac>
        <locations>
          <hmmer3-location env-end="191" env-start="24" post-processed="true" score="327.0" evalue="1.8E-98" hmm-start="2" hmm-end="167" hmm-length="167" hmm-bounds="C_TERMINAL_COMPLETE" start="25" end="191">
            <location-fragments>
              <hmmer3-location-fragment start="25" end="191" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <hmmer3-match evalue="6.2E-61" score="206.2">
        <signature ac="G3DSA:2.60.120.10" name="Jelly Rolls">
          <entry ac="IPR014710" desc="RmlC-like jelly roll fold" name="RmlC-like_jellyroll" type="HOMOLOGOUS_SUPERFAMILY"/>
          <signature-library-release library="GENE3D" version="4.2.0"/>
        </signature>
        <model-ac>1lr5B00</model-ac>
        <locations>
          <hmmer3-location env-end="191" env-start="29" post-processed="true" score="205.9" evalue="7.9E-61" hmm-start="6" hmm-end="161" hmm-length="163" hmm-bounds="COMPLETE" start="29" end="191">
            <location-fragments>
              <hmmer3-location-fragment start="29" end="191" dc-status="CONTINUOUS"/>
            </location-fragments>
          </hmmer3-location>
        </locations>
      </hmmer3-match>
      <panther-match evalue="2.1E-104" familyName="FAMILY NOT NAMED" score="349.6">
        <signature ac="PTHR37236" name="FAMILY NOT NAMED">
          <entry ac="IPR000526" desc="Auxin-binding protein" name="Auxin-bd" type="FAMILY">
            <go-xref category="MOLECULAR_FUNCTION" db="GO" id="GO:0010011" name="auxin binding"/>
          </entry>
          <signature-library-release library="PANTHER" version="14.1"/>
        </signature>
        <model-ac>PTHR37236</model-ac>
        <locations>
          <panther-location env-start="1" env-end="191" hmm-start="7" hmm-end="194" hmm-length="194" hmm-bounds="C_TERMINAL_COMPLETE" start="4" end="191">
            <location-fragments>
              <panther-location-fragment start="4" end="191" dc-status="CONTINUOUS"/>
            </location-fragments>
          </panther-location>
        </locations>
      </panther-match>
      <superfamilyhmmer3-match evalue="3.91E-30">
        <signature ac="SSF51182" name="RmlC-like cupins">
          <entry ac="IPR011051" desc="RmlC-like cupin domain superfamily" name="RmlC_Cupin_sf" type="HOMOLOGOUS_SUPERFAMILY"/>
          <signature-library-release library="SUPERFAMILY" version="1.75"/>
        </signature>
        <model-ac>0051916</model-ac>
        <locations>
          <superfamilyhmmer3-location hmm-length="186" start="32" end="179">
            <location-fragments>
              <superfamilyhmmer3-location-fragment start="32" end="179" dc-status="CONTINUOUS"/>
            </location-fragments>
          </superfamilyhmmer3-location>
        </locations>
      </superfamilyhmmer3-match>
    </matches>
  </protein>
</protein-matches>
peterjc commented 3 years ago

Have you read the chapter in our tutorial about SearchIO? "BLAST and other sequence search tools"

http://biopython.org/DIST/docs/tutorial/Tutorial.html

Xiaofei-git commented 3 years ago

Yes, but I didn't find what the problem is with the code I used. I will double-check and let you know.

Caiofcas commented 3 years ago

This might be a little late, but the problem is that by the time the second loop runs, the generator has already been exhausted (the parser has reached the end of the file), so it yields no more results. Moving the hit lookup into the first loop works for me:

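from Bio import SearchIO

# parse() returns a generator, so do all the work in a single pass over it
qresults = SearchIO.parse('test_interpro.xsd', 'interproscan-xml')
for qresult in qresults:
    print("target: ", qresult.target)
    print("version: ", qresult.version)
    # iterating over a QueryResult yields its Hit objects
    for hit in qresult:
        print("hit: ", hit.id)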

Output:

target:  InterPro
version:  5.46-81.0
hit:  SM00729
hit:  G3DSA:3.20.20.70
hit:  TIGR02666
hit:  PF13353
hit:  PF04055
hit:  PF06463
hit:  SFLDG01383
hit:  SFLDG01072
hit:  PTHR22960
hit:  PTHR22960:SF26
hit:  PS01305
hit:  MF_01225_B
hit:  cd05644
hit:  cd01335
hit:  SSF102114
target:  InterPro
version:  5.46-81.0
hit:  Coil
hit:  G3DSA:1.10.8.50
hit:  PF13589
hit:  mobidb-lite
hit:  PTHR10871
hit:  PTHR10871:SF4
hit:  cd16933
hit:  SSF55874
target:  InterPro
version:  5.46-81.0
hit:  PR00655
hit:  PF02041
hit:  G3DSA:2.60.120.10
hit:  PTHR37236
hit:  SSF51182
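
If two separate passes over the results are really needed, one option is to copy the generator into a list first. Below is a minimal sketch of the exhaustion behaviour, using a stand-in generator so it runs without Biopython or the input file. `fake_parse` and the record names are hypothetical and only imitate how `SearchIO.parse` hands back a generator.

```python
def fake_parse(records):
    """Hypothetical stand-in for SearchIO.parse: yields one record at a time."""
    for record in records:
        yield record


records = ["query1", "query2", "query3"]

# A generator can only be consumed once.
qresults = fake_parse(records)
first_pass = [q for q in qresults]
second_pass = [q for q in qresults]  # already exhausted, so nothing is yielded

# Materializing the generator into a list allows repeated iteration.
qresults_list = list(fake_parse(records))
first_list_pass = [q for q in qresults_list]
second_list_pass = [q for q in qresults_list]

print(first_pass)
print(second_pass)
print(first_list_pass == second_list_pass)
```

With the real parser, the equivalent would presumably be `qresults = list(SearchIO.parse('test_interpro.xsd', 'interproscan-xml'))`, at the cost of holding every query result in memory at once.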
peterjc commented 1 year ago

Closing as stale; hopefully that was the problem (trying to reuse a generator).