elifesciences / elife-tools

Python library for parsing eLife article XML data.
MIT License

Convert XML lists to HTML #334

Closed gnott closed 3 years ago

gnott commented 3 years ago

Re issue https://github.com/elifesciences/issues/issues/5828

When converting XML to JSON output, ordinary lists that represent content blocks are converted to list blocks. However, if a list is inside a table, the XML in the table is not extensively altered (only some <td> class names are changed in the conversion process), so the list tags there are left as they are.

This PR converts the JATS <list> and <list-item> tags to <ul> / <ol> and <li> tags when converting XML to HTML.

Since lists would already be converted into a content block by the time the HTML conversion happens, this should result in few alterations outside of the table content this is targeting.
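
As a rough illustration of the tag conversion described above, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. It is not the code in this PR: the function name `convert_lists_to_html` and the `ORDERED_LIST_TYPES` set are hypothetical, and the mapping of `list-type` values to `<ol>` or `<ul>` is an assumption based on common JATS values.

```python
import xml.etree.ElementTree as ET

# Assumption: these JATS list-type values are rendered as ordered lists;
# anything else (e.g. "bullet", "simple", or no attribute) becomes <ul>.
ORDERED_LIST_TYPES = {
    "order", "alpha-lower", "alpha-upper", "roman-lower", "roman-upper",
}


def convert_lists_to_html(xml_fragment):
    """Rename JATS <list> / <list-item> elements to <ul> / <ol> / <li>.

    Hypothetical helper for illustration only; expects a well-formed
    XML fragment with a single root element, such as a <td>.
    """
    root = ET.fromstring(xml_fragment)
    for element in root.iter():
        if element.tag == "list":
            # Drop the JATS attribute and pick the HTML list tag from it.
            list_type = element.attrib.pop("list-type", None)
            element.tag = "ol" if list_type in ORDERED_LIST_TYPES else "ul"
        elif element.tag == "list-item":
            element.tag = "li"
    return ET.tostring(root, encoding="unicode")


cell = '<td><list list-type="order"><list-item><p>One</p></list-item></list></td>'
print(convert_lists_to_html(cell))
# <td><ol><li><p>One</p></li></ol></td>
```

Only tag names and the `list-type` attribute are touched, so content outside lists passes through unchanged, which matches the expectation that little outside table content is altered.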

I started off with a small refactor of the existing list-type value handling, moving that block to a new function in utils.py, but it turned out not to be a natural fit for the HTML rewriting after all. The refactor does not affect backwards compatibility.

This PR will bump the library version to 0.3.0 because it adds new functionality, but the change is considered backwards compatible.

coveralls commented 3 years ago


Coverage increased (+0.002%) to 99.715% when pulling 5af26f359652c2026cf941c251029fb7806588ec on list-html into dd078c9db9fa4f12b77b8e7d5420383249e9fd56 on develop.