elifesciences / elife-crossref-xml-generation

Crossref deposit of journal articles
MIT License

Abstract XML namespaces change and remove @rid attributes #91

Closed gnott closed 4 years ago

gnott commented 4 years ago

Re issue https://github.com/elifesciences/issues/issues/5743

Viewing some JATS abstracts in the Crossref API shows that the <jats:abstract> tag is not included in its output. Adding XML namespaces to that tag would therefore have no effect for end users, since Crossref would strip those namespaces away along with the tag.

This code change additionally adds the XML namespaces to <jats:p> tags, when applicable. It traverses the minidom tree of elements and collects the names of tags and of their attributes. Those names are later checked for whether they belong to a namespace, and if so, the matching XML namespace attributes are added to that <jats:p> tag.
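The traversal described above could look roughly like the sketch below. It is not the code from this pull request: the `NAMESPACES` map, the function names, and the temporary `<root>` wrapper (used only so minidom can parse a fragment with prefixed tags) are all assumptions made for illustration.

```python
from xml.dom import minidom

# Assumed prefix-to-URI map for illustration; the project may use other values.
NAMESPACES = {
    "jats": "http://www.ncbi.nlm.nih.gov/JATS1",
    "mml": "http://www.w3.org/1998/Math/MathML",
    "xlink": "http://www.w3.org/1999/xlink",
}


def collect_prefixes(node, found=None):
    """Collect the namespace prefixes used by tag and attribute names in a subtree."""
    if found is None:
        found = set()
    if node.nodeType == node.ELEMENT_NODE:
        names = [node.tagName] + list(node.attributes.keys())
        for name in names:
            # Skip namespace declarations themselves, keep prefixed names only.
            if ":" in name and not name.startswith("xmlns:"):
                found.add(name.split(":", 1)[0])
    for child in node.childNodes:
        collect_prefixes(child, found)
    return found


def add_namespaces_to_p(fragment):
    """Add xmlns attributes to each <jats:p> for the prefixes used inside it."""
    # Hypothetical wrapper element declaring every known prefix, so parsing succeeds.
    declarations = " ".join(
        f'xmlns:{prefix}="{uri}"' for prefix, uri in NAMESPACES.items()
    )
    doc = minidom.parseString(f"<root {declarations}>{fragment}</root>")
    root = doc.documentElement
    for p_tag in root.getElementsByTagName("jats:p"):
        for prefix in sorted(collect_prefixes(p_tag)):
            if prefix in NAMESPACES:
                p_tag.setAttribute(f"xmlns:{prefix}", NAMESPACES[prefix])
    # Serialise only the children, dropping the temporary wrapper.
    return "".join(child.toxml() for child in root.childNodes)


if __name__ == "__main__":
    sample = (
        "<jats:p>Value of <mml:math><mml:mi>a</mml:mi></mml:math> "
        "in <jats:italic>E. coli</jats:italic></jats:p>"
    )
    print(add_namespaces_to_p(sample))
```

Only prefixes that actually occur inside a given <jats:p> are declared on it, so a paragraph without MathML would not receive the `xmlns:mml` attribute.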

Another improvement is to strip any @rid attributes from tags, in case they would cause XML parsing issues when using the abstract value from Crossref's API.
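A minimal sketch of removing @rid attributes with minidom follows. The function names and the temporary `<root>` wrapper are hypothetical and are not taken from this pull request; the sample uses unprefixed tags to keep the fragment parseable on its own.

```python
from xml.dom import minidom


def strip_rid(node):
    """Recursively remove the rid attribute from every element in a subtree."""
    if node.nodeType == node.ELEMENT_NODE and node.hasAttribute("rid"):
        node.removeAttribute("rid")
    for child in node.childNodes:
        strip_rid(child)


def strip_rid_from_fragment(fragment):
    """Parse an XML fragment, drop all rid attributes, and return it as a string."""
    # Hypothetical wrapper element so a fragment with several siblings can be parsed.
    doc = minidom.parseString(f"<root>{fragment}</root>")
    root = doc.documentElement
    strip_rid(root)
    # Serialise only the children, dropping the temporary wrapper.
    return "".join(child.toxml() for child in root.childNodes)


if __name__ == "__main__":
    sample = '<p>See <xref ref-type="bibr" rid="bib1">Smith, 2020</xref>.</p>'
    print(strip_rid_from_fragment(sample))
```

Other attributes, such as `ref-type` in the sample, are left untouched; only `rid` is removed.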

coveralls commented 4 years ago

Coverage remained the same at 100.0% when pulling d0800308f1ea3cb0271913a96dbccf4f941efa99 on namespaces into c70c39a9bf07bdda7a97e40d46d8decdac0d29cb on develop.