adsabs / ADSIngestParser

Curation parser library
MIT License

Crossref XML parser isn't catching the abstract in (at least some) XML #30

Closed by seasidesparrow 1 year ago

seasidesparrow commented 1 year ago

Describe the bug: If we harvest crossref-xml via the habanero.cn.content_negotiation method, the resulting records contain the abstract in an <abstract> tag. If we pass that XML to the parse() method of adsingestp.parsers.crossref.CrossrefParser, the abstract is missing from the parsed output.

To Reproduce: Fetch the XML for doi="10.3847/1538-4357/ac8c2f" from Crossref via habanero.cn.content_negotiation, and parse the resulting data.

Example code:

from habanero.cn import content_negotiation as cone
from adsingestp.parsers.crossref import CrossrefParser

doi = '10.3847/1538-4357/ac8c2f'

try:
    # Fetch the Crossref XML record for this DOI
    result = cone(ids=doi, format='crossref-xml')
    # Parse the XML; the abstract is missing from the output
    parser = CrossrefParser()
    output = parser.parse(result)
    print(output)
except Exception as err:
    print(err)
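
To confirm that the abstract really is present in the fetched XML before it reaches the parser, a namespace-agnostic check along these lines may help. This is only a sketch using the standard library: find_abstracts is a hypothetical helper, not part of adsingestp, and SAMPLE is a made-up fragment rather than the actual Crossref record for the DOI above. It assumes the abstract element may appear with or without a namespace prefix.

```python
import xml.etree.ElementTree as ET


def find_abstracts(xml_text):
    """Return the text of every element whose local name is 'abstract'.

    Hypothetical helper for debugging; not part of adsingestp.
    """
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        # Namespaced tags look like '{uri}abstract'; keep only the local part.
        local_name = elem.tag.rsplit('}', 1)[-1]
        if local_name == 'abstract':
            # Join nested text nodes and collapse runs of whitespace.
            found.append(' '.join(''.join(elem.itertext()).split()))
    return found


# Made-up sample, NOT the real Crossref record for the DOI above.
SAMPLE = """<doi_record xmlns:jats="http://www.ncbi.nlm.nih.gov/JATS1">
  <journal_article>
    <titles><title>Example title</title></titles>
    <jats:abstract><jats:p>Example abstract text.</jats:p></jats:abstract>
  </journal_article>
</doi_record>"""

print(find_abstracts(SAMPLE))
```

Calling find_abstracts(result) on the XML returned by the snippet above would show whether the abstract is in the input, which separates a harvesting problem from a parsing problem.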


seasidesparrow commented 1 year ago

Fixed in pull request #32