elifesciences / decision-letter-parser

Parse docx file containing decision letter and author response content and produce output in other formats
MIT License

Enhancements to asset xref tagging #101

Closed gnott closed 4 years ago

gnott commented 4 years ago

Re issue https://github.com/elifesciences/issues/issues/5795

The existing logic that adds <xref> tags around mentions of figures in paragraph text only wrapped terms that matched the label value exactly (ignoring any full stop at the end of the label, which is stripped before looking for matches). If a mention of specific figure panels appeared in the text, the <xref> tag did not expand to include the alphabetical, optionally hyphenated panel term.

E.g. for "Author response image 1C-F", instead of

<xref ref-type="fig" rid="sa2fig1">Author response image 1</xref>C-F

we will now get

<xref ref-type="fig" rid="sa2fig1">Author response image 1C-F</xref>

The code includes refactoring to make these parts easier to understand and to test, and there are test cases based on real observed content plus a couple of edge cases discovered while developing the enhancements.
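
For illustration only, here is a minimal sketch of how the panel-aware matching described above could work. It is not the code in this PR: the function name `tag_asset_xref`, its parameters, and the exact panel pattern (a single letter, optionally followed by a hyphen and another letter) are assumptions made for the example.

```python
import re


def tag_asset_xref(text, label, rid, ref_type="fig"):
    """Wrap mentions of label, plus any trailing panel term, in an xref tag.

    Hypothetical helper for illustration; not the parser's actual API.
    """
    # Strip a trailing full stop from the label before looking for matches
    label = label.rstrip(".")
    pattern = re.compile(
        re.escape(label)
        # Do not match a longer number, e.g. label "... image 1" inside "... image 10"
        + r"(?!\d)"
        # Optional panel term: one letter, optionally hyphenated to another (C or C-F),
        # which must not run on into further letters or digits
        + r"(?:[A-Za-z](?:-[A-Za-z])?(?![A-Za-z0-9]))?"
    )

    def wrap(match):
        # The whole match (label plus panel term, if any) goes inside the tag
        return '<xref ref-type="%s" rid="%s">%s</xref>' % (
            ref_type,
            rid,
            match.group(0),
        )

    return pattern.sub(wrap, text)


if __name__ == "__main__":
    print(
        tag_asset_xref(
            "See Author response image 1C-F for details.",
            "Author response image 1.",
            "sa2fig1",
        )
    )
```

In this sketch the panel term is optional, so a plain mention such as "Author response image 1." is still tagged around the label alone, and the negative lookahead keeps "Author response image 10" from being tagged as image 1.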