mloesch / sickle

Sickle: OAI-PMH for Humans

AttributeError when harvesting OAI records without a metadata child #37

Closed mrmiguez closed 4 years ago

mrmiguez commented 4 years ago

Our Islandora repository publishes collection records alongside item records. The collection records have a <header> child but no <metadata> child, which raises an AttributeError when Sickle harvests them.

Example collection record: http://fsu.digital.flvc.org/oai2?verb=GetRecord&identifier=oai:fsu.digital.flvc.org:fsu_avc50&metadataPrefix=mods

<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2020-04-08T13:25:58Z</responseDate>
  <request>http://fsu.digital.flvc.org/oai2</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:fsu.digital.flvc.org:fsu_avc50</identifier>
        <datestamp>2019-02-27T19:54:49Z</datestamp>
        <setSpec>fsu_stucamplifemain</setSpec>
      </header>
    </record>
  </GetRecord>
</OAI-PMH>

Python example:

from sickle import Sickle

h = Sickle("https://fsu.digital.flvc.org/oai2")

# item record works
rec1 = h.GetRecord(identifier="oai:fsu.digital.flvc.org:fsu_666", metadataPrefix='mods')
# collection record fails
rec2 = h.GetRecord(identifier="oai:fsu.digital.flvc.org:fsu_avc50", metadataPrefix='mods')

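Wrapping the metadata lookup in sickle.models.Record.__init__ in a try/except block works around the issue: when there is no <metadata> element, find() returns None, and calling getchildren() on None raises the AttributeError.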

    def __init__(self, record_element, strip_ns=True):
        # ...snipped...
        try:
            self.metadata = xml_to_dict(
                self.xml.find(
                    './/' + self._oai_namespace + 'metadata'
                ).getchildren()[0], strip_ns=self._strip_ns)
        except AttributeError:
            # find() returned None: the record has no <metadata> child
            self.metadata = None
mloesch commented 4 years ago

According to the OAI-PMH specification, the metadata XML element is not optional: http://www.openarchives.org/OAI/openarchivesprotocol.html#Record

I suggest that you register your own record implementation as described here: https://sickle.readthedocs.io/en/latest/customizing.html
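
For anyone landing here with the same problem, below is a minimal sketch of the guard such a custom record class would need, written against the standard library only so it runs without Sickle installed. It checks explicitly for a missing or empty <metadata> element instead of catching the AttributeError. The function name `metadata_payload` and the second sample record are hypothetical, made up for illustration; only the header-only record comes from the report above.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Trimmed copy of the header-only collection record from the report above.
HEADER_ONLY = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:fsu.digital.flvc.org:fsu_avc50</identifier>
    <datestamp>2019-02-27T19:54:49Z</datestamp>
    <setSpec>fsu_stucamplifemain</setSpec>
  </header>
</record>"""

# Hypothetical item record with a placeholder MODS payload (not real data).
WITH_METADATA = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header><identifier>oai:example.org:item1</identifier></header>
  <metadata>
    <mods xmlns="http://www.loc.gov/mods/v3">
      <titleInfo><title>Example</title></titleInfo>
    </mods>
  </metadata>
</record>"""


def metadata_payload(record_xml):
    """Return the first child of <metadata>, or None if it is absent or empty."""
    record = ET.fromstring(record_xml)
    container = record.find(".//" + OAI_NS + "metadata")
    # find() returns None when nothing matches, so test before indexing.
    if container is None or len(container) == 0:
        return None
    return container[0]


print(metadata_payload(HEADER_ONLY))        # header-only record yields None
print(metadata_payload(WITH_METADATA).tag)  # namespaced tag of the payload
```

In a Sickle subclass the same check would replace the unconditional `.getchildren()[0]` call, leaving `self.metadata` as `None` for header-only records. How that subclass is wired in is described on the customizing page linked above; this sketch does not cover the registration step.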