Describe the bug
Publishers may field affiliations with multiple institution tags having different attributes (e.g. content-type="org-division" and content-type="org-name"). With the current JATS parser (v0.9.6), the affiliation data are decomposed, which strips the embedded data of their context, and the result may be output with poor formatting (e.g. missing spaces between elements).
To Reproduce
Use ingest parser to parse abstracts/sources/SPRINGER/files/JOU=41467/VOL=2023.14/ISU=1/ART=4026 1/41467_2023_Article_40261_nlm.xml. Parsing produces the following affiliation string:
Clem Jones Centre for Ageing Dementia Research, Queensland Brain InstituteThe University of Queensland4072BrisbaneQLDAustralia
Additional context
Example from the file noted above:
<aff id="Aff1"><label>1</label><institution-wrap><institution-id institution-id-type="GRID">grid.1003.2</institution-id><institution-id institution-id-type="ISNI">0000 0000 9320 7537</institution-id><institution content-type="org-division">Clem Jones Centre for Ageing Dementia Research, Queensland Brain Institute</institution><institution content-type="org-name">The University of Queensland</institution></institution-wrap><addr-line content-type="postcode">4072</addr-line><addr-line content-type="city">Brisbane</addr-line><addr-line content-type="state">QLD</addr-line><country country="AU">Australia</country></aff>
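For illustration only, here is a minimal sketch of the kind of output that would be preferable. It is not the parser's actual code: it uses the standard-library xml.etree.ElementTree rather than whatever the ingest parser uses internally, and the names TEXT_TAGS, format_affiliation and extract_institution_ids are hypothetical. It joins the text-bearing elements with a separator instead of concatenating them, and keeps the institution identifiers together with their type.

```python
import xml.etree.ElementTree as ET

# The <aff> element from the file noted above.
AFF_XML = (
    '<aff id="Aff1"><label>1</label><institution-wrap>'
    '<institution-id institution-id-type="GRID">grid.1003.2</institution-id>'
    '<institution-id institution-id-type="ISNI">0000 0000 9320 7537</institution-id>'
    '<institution content-type="org-division">Clem Jones Centre for Ageing '
    'Dementia Research, Queensland Brain Institute</institution>'
    '<institution content-type="org-name">The University of Queensland</institution>'
    '</institution-wrap>'
    '<addr-line content-type="postcode">4072</addr-line>'
    '<addr-line content-type="city">Brisbane</addr-line>'
    '<addr-line content-type="state">QLD</addr-line>'
    '<country country="AU">Australia</country></aff>'
)

# Hypothetical choice: tags whose text belongs in the readable affiliation string.
# <label> and <institution-id> are deliberately left out.
TEXT_TAGS = {"institution", "addr-line", "country"}


def format_affiliation(aff_xml: str, sep: str = ", ") -> str:
    """Join the text of each relevant child element, in document order."""
    aff = ET.fromstring(aff_xml)
    parts = []
    for elem in aff.iter():
        if elem.tag in TEXT_TAGS:
            # Collapse internal whitespace so line breaks in the XML do not leak through.
            text = " ".join("".join(elem.itertext()).split())
            if text:
                parts.append(text)
    return sep.join(parts)


def extract_institution_ids(aff_xml: str) -> dict:
    """Map each institution-id-type (e.g. GRID, ISNI) to its value."""
    aff = ET.fromstring(aff_xml)
    return {
        elem.get("institution-id-type"): (elem.text or "").strip()
        for elem in aff.iter("institution-id")
    }


if __name__ == "__main__":
    print(format_affiliation(AFF_XML))
    print(extract_institution_ids(AFF_XML))
```

With this approach the division, organization name, and address parts stay separated, and the GRID and ISNI identifiers are available on their own rather than being dropped or merged into the string.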