fordmadox / EAD-to-LUX

EAD to JSON process. Just intended for a proof-of-concept stage of a project.

Adjust transformation for whitespace issues in mixed content #3

Open fordmadox opened 4 years ago

fordmadox commented 4 years ago

Example:

<p>A copy of this material is available in digital form from Manuscripts and Archives and<a href='https://yalemssa.aviaryplatform.com/r/vd6nz81487'>online<\/a>.<\/p>

Note that there is no space between "and" and the HTML anchor tag.

Source data is encoded as expected, though.

See https://luxpoc.collections.yale.edu/catalog/aspace-archival-object-2076706

fordmadox commented 4 years ago

This is just my overeager use of normalize-space, so I need to make sure whitespace is preserved for p tags and other mixed-content elements.
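To illustrate why normalize-space causes this, here is a minimal Python sketch (not the project's actual transformation, which is not shown in this thread). It assumes the source encodes the link as an `extref` element with a line break before it; `normalize_space`, `collapse_space`, and `to_html` are hypothetical helper names. Trimming every text node drops the space next to the inline element, while collapsing whitespace runs without trimming keeps it.

```python
import re
import xml.etree.ElementTree as ET

# Assumed source encoding: the space before the link is a newline plus indentation.
SOURCE = (
    "<p>A copy of this material is available in digital form from "
    "Manuscripts and Archives and\n"
    "    <extref href='https://yalemssa.aviaryplatform.com/r/vd6nz81487'>online</extref>.</p>"
)


def normalize_space(text):
    """Mimic XPath normalize-space(): trim both ends and collapse inner runs."""
    return " ".join(text.split())


def collapse_space(text):
    """Collapse whitespace runs to one space but keep leading/trailing space."""
    return re.sub(r"\s+", " ", text)


def to_html(elem, fix_text):
    """Serialize elem as HTML, applying fix_text to every text node."""
    tag = "a" if elem.tag == "extref" else elem.tag
    attrs = "".join(f" {name}='{value}'" for name, value in elem.attrib.items())
    parts = [f"<{tag}{attrs}>", fix_text(elem.text or "")]
    for child in elem:
        parts.append(to_html(child, fix_text))
        # The tail is the text that follows the child inside this element.
        parts.append(fix_text(child.tail or ""))
    parts.append(f"</{tag}>")
    return "".join(parts)


root = ET.fromstring(SOURCE)
broken = to_html(root, normalize_space)  # space before <a> is lost
fixed = to_html(root, collapse_space)    # space before <a> survives
print(broken)
print(fixed)
```

In this sketch, `collapse_space` also leaves a single space at the very start or end of a paragraph if the source has one, so a real transformation would probably still trim the outer edges of the block element.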

fordmadox commented 4 years ago

Note: this particular example is fixed, but I'm going to keep the issue open until I finalize a list of elements that should not have any space stripped. So far, I have only added the paragraph tags to that list, but more elements need to be added.
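One way to express such a list is sketched below in Python, purely as an illustration of the idea. Only `p` comes from this thread; the other element names in `MIXED_CONTENT`, and the helper names `clean` and `clean_tree`, are assumptions made for the example. Text inside a listed element (or any of its descendants) has whitespace runs collapsed but edges kept; everything else is fully normalized.

```python
import re
import xml.etree.ElementTree as ET

# "p" is from the thread; the remaining names are illustrative guesses only.
MIXED_CONTENT = {"p", "emph", "title", "extref"}


def clean(text, keep_edges):
    """Collapse whitespace; trim the edges only when keep_edges is False."""
    if not text:
        return ""
    if keep_edges:
        return re.sub(r"\s+", " ", text)
    return " ".join(text.split())


def clean_tree(elem, inside_mixed=False):
    """Normalize text nodes in place, sparing edge spaces in mixed content."""
    mixed = inside_mixed or elem.tag in MIXED_CONTENT
    elem.text = clean(elem.text, mixed)
    for child in elem:
        clean_tree(child, mixed)
        # A child's tail is part of this element's content, so use its setting.
        child.tail = clean(child.tail, mixed)


doc = ET.fromstring(
    "<c><unitid>  MS 1  </unitid><p>see\n  <emph>this</emph>  item</p></c>"
)
clean_tree(doc)
```

After the call, the `unitid` text is trimmed as before, while the spaces around `emph` inside the paragraph are kept as single spaces.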